Sign in

About some Active Record methods or what you may forgot about Active Record

Like all good stories, this one also has been started with fuck up on production. Once we started receiving randomly failed jobs, their processes were killed by our service. It was because of the bad request below:

def ip_segment_needed?
targeting_strategies.single.that_are_archived.where('ip_addresses = ?, true).present? && ip_addresses.present?
end

A little spoiled: the problem was in ip_addresses.present?

The problem was in IpAddresses table: for that moment it had more than 2 million records and the usual time for this request was more than 30 minutes. So the goal of this story is to tell you about Active Record methods that we’re using every day and may not know how they are working inside and have performance problems because of it.

It’s a pretty easy recipe to improve your performance:

  1. Less number of calls to the database
  2. Less the memory allocations
  3. Less the duration of complete the operation

Count and Size

Based on the source activerecord/lib/active_record/relation/calculations.rb you can see that count always makes a database query:

def count(column_name = nil)  if block_given?    ………    return super()  end
calculate(:count, column_name)
end

In comparison with the size method (activerecord/lib/active_record/relation.rb) you can see that it’s a simple ruby method on an array:

def size  loaded? ? @records.length : count(:all)end

In total:

  • count always makes a database query
  • call size If the data is already loaded
  • or always use size, it makes the database query only if the association is not loaded

Present? and Blank?

These relation’s methods are essentially the same, you can be sure of this by checking activesupport/lib/active_support/core_ext/object/blank.rb:

def blank?    
respond_to?(:empty?) ? !!empty? : !self
end
def present?
!blank?
end

So that means that every time when you call present? you actually call blank? and there’re some things that you need to know before using these methods:

  • It generates SQL, e.g. SELECT "users".* FROM "users"
  • The result is Memorized
  • It allows to not run a query if it already loaded
  • But should not be used if the ActiveRecord::Relation will never be used in its entirety after present? or blank?

    Example of good using (in that case all relations were uploaded only in the first string):
@relation.present?
@relation.first(3).each ...

Exists?

exists? in my opinion is one of the most insidious relation’s method because usually people don’t really know how it’s working and use it because of good sounds. Particular qualities of exists? :

  • It generates SQL: SELECT 1 AS one FROM "users" LIMIT 1
  • Not memorized. At all
  • Always runs queries. Doesn’t loaded a query
  • Would be better off using present? or blank?

The most important difference between exist? and blank? is in exist’s LIMIT. Returning to the initial issues, the fix of our 30+ minutes query we just needed to use exist? instead of present? because in the first case only one record is uploaded and the second — all 2 million.

Example of bad exists? using:

- if @users.exists? # SELECT 1 ...  = @users.size # COUNT ...- if @users.exists? # SELECT 1 ...- @users.each do |user| # loads the entire relation  ...

Based on blank? knowledges you may see that this code could be overwritten with only one query using, which improve performance.

Any? None? Empty?

Based on the source (activerecord/lib/active_record/relation.rb) you can see that all these methods are also essentially the same and in the end are linked on exists? method.

def any?  return super if block_given?
!empty?end
def none? return super if block_given? empty?end
def empty? return @records.empty? if loaded? !exists?end

Rules of use for these methods are the same as for the exists? but additionally, you can use it in cases with taking a section of the ActiveRecord::Relation using first or last e.g:

@users.each if @users.any?# There’re actually two queries:@users.any? # (SELECT 1 AS one FROM ... LIMIT 1)@users.each # (SELECT "users".* FROM "users" WHERE ...)

Ways to improve code behind:

@users.each if @users.any?# Should be replaced with:@users.each if If @users.present?or@users.first(3).each if @users.any?or @users.each if @users.load.any?

Where

For example a method in the User model:

class User < ActiveRecord::Base  def active_comments    comments.where(active: true)  endend

Every time when you call active_comments method on a user you also make a request to a database. In using it in a view your controller’s code won’t help you to not cause requests every time. Alsoinclude or other preload methods won’t help you to avoid an extra request.

One of the solutions:

class User  has_many :comments  has_many :active_comments, -> { active }, class_name: "Comment"endclass Comment  belongs_to :user  scope :active, -> { where(active: true) }endclass UsersController  def index    @users = User.includes(:active_comments)  endend

Quick tips

  • includes to avoid n+1
  • find_each when loading a large amount of records
    increments number of queries => reduce memory on loading a huge number of records
  • Check existence of a record with exists? Instead of present?
  • Prefer ActiveRecord::Relation#size over ActiveRecord::Calculations#count

Summary

  • try to keep the number of SQL queries to a minimum
  • try to use preloaded queries
  • database is always faster than Ruby in retrieving, parsing, or processing data

To read

https://www.netguru.com/codestories/activerecord-tips-2