Intro to JPQL for ActiveRecord Users

Introduction

In the previous blog post, I demonstrated how to define a JPA entity, and how it differs from an ActiveRecord model. In this and subsequent posts I will show some ways to load entities, and when to, and - more importantly - when not to use them.

To recap, querying ActiveRecord models looks like this:

# Example of fetching multiple Contacts by last name,
# and pre-loading each contact's Client to avoid N+1
# queries.
contacts = Contact.where(last_name: "Flintstone")
                  .includes(:client)

contacts.each do |contact|
  puts "#{contact.last_name}: #{contact.client.name}"
end

Basically you use the model class directly, and a simple DSL that you can chain as much as you like. It is “lazy”, so nothing will be loaded until you call “load”, “each”, “map”, etc. This will more or less generate the SQL you expect, as at any time you may call #to_sql to see what will happen.

Here are the JPA choices for achieving the same result that I will discuss:

JPQL - this post
Repositories
QueryDSL

Here are the choices I will not discuss, because they are an over-complicated waste of time:

Criteria Query API (google it yourself and cry)

Already you may be asking: Why do we need three separate APIs, with three separate blog posts? This is a very good question!!! If you have an answer as to why this sucks so much, please let me know!

What is JPQL?

JPQL is an SQL-like language for querying objects/entities (not tables), defined by the JPA specification. It is the lowest-level interface for loading entities, and typically featured in the most basic examples.

Let me repeat: objects, not tables.

Detour: Getting an EntityManager instance

Before we can do anything with JPQL, we need an instance of the EntityManager. In Spring Boot you can have this auto-injected into your class as follows:

@PersistenceContext
private EntityManager entityManager;

This will handle threading properly. Don’t ask me how it works, but apparently it is possible for the framework to modify private variables through Java’s reflection API. Who knew! (It makes me feel slightly queezy.)

JPQL Simple Example

List<Contact> contacts = entityManager.createQuery(
    "SELECT c FROM Contact c WHERE c.lastName = :lastName",
    Contact.class
  )
  .setParameter("lastName", "Flintstone")
  .getResultList();

// Example of looping through them to print their names:
for (Contact contact : contacts) {
  System.out.println( contact.getLastName() );
}

Not too bad, right? You might be thinking it looks type-safe, fairly intuitive, and succinct (as much as Java ever can be).

Lazy Loading

Now that we have our Contact entities, what happens if we try to access relationships?

for (Contact contact : contacts) {
  System.out.println( contact.getLastName() + ": " +
                      contact.getClient().getName() );
}

As you might expect, the Client object will be lazily loaded. How, you ask? The Contacts are actually wrapped by “proxy” objects that do the lazy loading. getClient() returns a proxy, which only goes to the database if you call a method such as client.getId(). (This is yet another piece of Hibernate magic that doesn’t seem possible in the Java type system, but there you are.)

This could be a whole blog post itself, so we will leave it at that for now.

Preventing N+1

Since that way of accessing clients has an N+1 issue, let’s modify this query to pre-load each Contact’s Client relationship, like we did in Ruby:

List<Contact> contacts = entityManager.createQuery(
    "SELECT c FROM Contact c " +
    "JOIN FETCH c.client cl " +
    "WHERE c.lastName = :lastName",
    Contact.class
  )
  .setParameter("lastName", "Flintstone")
  .getResultList();

JPQL will figure out the joins for you, and has a special “fetch” keyword to signify pre-loading. You also have to reference “client” as a property of “contact”: c.client. Again, in this query language we are dealing with entities, not tables.

Aside: The blogosphere on avoiding N+1 issues in JPA seems woefully lacking. Many posts treat the reader like they are five years old, and maybe aren’t ready to tackle such “complex” issues, or spend pages beating about the bush. Documentation rarely offers straightforward solutions that work in every circumstance. Note that ActiveRecord’s .includes(...) method is obvious and easy to use, and may be explained in five seconds. This is typical. Do I sound frustrated?

When to use it

When you have a custom one-off query of some complexity, with an un-changing structure.

When NOT to use it

When the query structure needs to change based on user input. (Most search queries.)
Any time there is an equally clear and succinct alternative.

Comparison to ActiveRecord

ActiveRecord::Relations are chain-able and composable (see example in introduction). JPQL is not (hence why it is bad for searches).
If you want to pre-load any relationships, you will need to specify the join type (left, inner, outer, cross, etc.).
JOIN FETCH actually does a join. ActiveRecord #includes(...) will usually result in a separate query.

Useful Tricks

Don’t ask me how this works, but it can be handy:

class ContactIdAndName {
  public final Long id;
  public final String lastName;

  public ContactIdAndName( Long id, String lastName ) {
    this.id = id;
    this.lastName = lastName;
  }
}

List<ContactIdAndName> contacts = entityManager.createQuery(
    "SELECT new com.mycompany.ContactIdAndName(c.id, c.lastName) FROM Contact c"
    ContactIdAndName.class
  )
  .getResultList();

If you have ever used Java reflection API to instantiate objects, you know how much code this is saving. This is the only feature of JPQL that I have found particularly exciting.

What I think

I am not sure why anyone bothered to create JPQL. It combines all the downsides of writing SQL by hand with all the downsides of writing in a language that looks like SQL, but isn’t actually SQL. I have lost count of the number of things I thought should work, but didn’t, and the number of utterly incomprehensible runtime error messages I have had to Google. One of the main things I want from Java (in exchange for increased verbosity) is compile-time type-safety. JPQL takes it away. :(

I also find the entire idea of thinking in objects instead of tables to be mind-boggling. Like, why is maintaining that layer of abstraction so sacred we need to invent a language on top of SQL? It seems farcical. No wonder everyone has been complaining about the “object-relational impedence mismatch” for years, going so far as to call it “the Vietnam of Computer Science”.

Unfortunately JPQL is unavoidable, and most things you do with JPA will hearken back to it in some way. Sigh.