Optimize the Ruby-on-Rails database – part one

Reduce the number of database queries

A typical web application does many database queries before it delivers the response page to the web browser. The application needs to wait for the response of each of these database queries. It gets an additional slow down because of process switching. The database server must analyze each request, and also the communication between the database and the application takes time. So less database queries reduce the overall load on the system. The system scales better.

Container vessels and flag states

As an example, we implement a marine information system. The data model includes shipping companies (companies), vessels (container_vessels) and flag states (countries). A shipping company has a number of vessels, each vessel is registered in a flag state.

class Company < ActiveRecord::Base
    has_many :container_vessels
end
 
class ContainerVessel < ActiveRecord::Base     
  belongs_to :company     
  belongs_to :legal_country, :class_name => 'Country'
end

We now want to show a table with the vessels for a given shipping company. The table should contain all their vessels together with the flag state of the vessel. A straight-forward implementation just uses the associations of the model.

In the controller, the vessels will be retrieved:

@company = Company.find(params[:company_id])
@container_vessels = @company.container_vessels.order(:name)

And in the view we just access to the country through the association

<% @container_vessels.each do |vessel| %>
  <%= vessel.name %>
  <%= vessel.legal_country.name if vessel.legal_country.present? %>
<% end %>

The solution works and is easy to understand. Let’s see what database queries are triggered:

SELECT "companies".* FROM "companies" 
        WHERE "companies"."id" = ? LIMIT 1 [["id", "3"]]
SELECT "container_vessels".* FROM "container_vessels" 
        WHERE "container_vessels"."company_id" = 3 ORDER BY name
 
SELECT "countries".* FROM "countries" WHERE "countries"."id" = 4 LIMIT 1
SELECT "countries".* FROM "countries" WHERE "countries"."id" = 10 LIMIT 1
SELECT "countries".* FROM "countries" WHERE "countries"."id" = 9 LIMIT 1

All vessels of a shipping company are retrieved in a swing with a single database query. The country data sets, however, are retrieved individually. The more countries, the more database queries. Obviously there is potential for improvement.

Eager loading

Active Record offers a simple improvement for these situations: the includes method. When we request the vessel-objects we are able to tell Rails that we are also interested in the countries-objects:

@container_vessels = @company.container_vessels.order(:name).
        includes(:legal_country)

That was all, no additional changes needed! As a result, these database queries are triggered::

SELECT "companies".* FROM "companies" 
        WHERE "companies"."id" = ? LIMIT 1 [["id", "2"]]
SELECT "container_vessels".* FROM "container_vessels" 
        WHERE "container_vessels"."company_id" = 2 ORDER BY name
SELECT "countries".* FROM "countries" 
        WHERE "countries"."id" IN (8, 7, 4)

So instead of retrieving each country individually, all countries are loaded in a single query (eager loading). The advantage can be huge for table-heavy information pages.

Unfortunately, there still are some problems:

  1. The query of related records is difficult to customize. For example, one can not simply restrict the data fields with select.
  2. The optimization potential is not exhausted. All vessel records are retrieved in one step, in a second step the country data sets are retrieved, and each is identified individually by ID. But SQL should make it possible to get the records together in only query.

So there is still enough material for the continuation of this article.

Leave a comment