Java - How to avoid join 'explosions' in JPA with Hibernate when fields from multiple tables are needed?
Let's say I have the following method in a JpaRepository:
@EntityGraph(value = "Subject.allJoins", type = EntityGraphType.FETCH) @Query("select s from Subject s" + FIND_QUERY_WHERE) Page<Subject> findInProject(@Param("projectId") UUID projectId, <additional params>
As you can see I already use an EntityGraph with the joins I need. The resulting SQL query that Hibernate generates is as follows (with the where mostly omitted)
select subject0_.id, <all kinds of fields including some duplicates> from subject subject0_ left outer join project project1_ on subject0_.project_id = project1_.id left outer join subject_property_value properties2_ on subject0_.id = properties2_.subject_id left outer join property_value propertyva3_ on properties2_.property_value_id = propertyva3_.id left outer join ingestion_id_mapping ingestedme4_ on subject0_.id = ingestedme4_.subject_id where subject0_.project_id = '123' order by subject0_.name asc
Because of all the joins here that multiply the result for the number of rows as the result of a join, the result set explodes into hundreds of thousands of rows even if the total amount of subjects is just a few hundred.
Note that I will be making a projection and that will already avoid selecting some of the fields, but the joins are still needed.
What can I do to optimize this?
Note I do actually need all the data to immediately serialize to a client, so just leaving it to Hibernate by getting the model entity and using Getter methods for each association takes a lot longer than even this.
My current idea is that I have to do the query multiple times with the same where for each separate join and then later merge the results into a single object. It's not the end of the world if I read more or fewer rows on a subsequent query because of added or deleted rows in the original table because I can just take the smallest subset of subject ids and make a result from that.
But is there something smarter and/or simpler to do than that?
There are 3 suggested solutions here and each one has been listed below with a detailed description. The following topics have been covered briefly such as Java, Spring, Hibernate, Jpa, Jpql. These have been categorized in sections for a clear and precise explanation.