About some Active Record methods or what you may forgot about Active Record
Like all good stories, this one also has been started with fuck up on production. Once we started receiving randomly failed jobs, their processes were killed by our service. It was because of the bad request below:
def ip_segment_needed?
targeting_strategies.single.that_are_archived.where('ip_addresses = ?, true).present? && ip_addresses.present?
end
A little spoiled: the problem was in ip_addresses.present?
The problem was in IpAddresses table: for that moment it had more than 2 million records and the usual time for this request was more than 30 minutes. So the goal of this story is to tell you about Active Record methods that we’re using every day and may not know how they are working inside and have performance problems because of it.
It’s a pretty easy recipe to improve your performance:
- Less number of calls to the database
- Less the memory allocations
- Less the duration of complete the operation
Let’s overview some Active Records methods to improve your application performance
Count and Size
Based on the source activerecord/lib/active_record/relation/calculations.rb you can see that count always makes a database query:
def count(column_name = nil) if block_given? ……… return super() end
calculate(:count, column_name)end
In comparison with the size method (activerecord/lib/active_record/relation.rb) you can see that it’s a simple ruby method on an array:
def size loaded? ? @records.length : count(:all)end
In total:
- count always makes a database query
- call size If the data is already loaded
- or always use size, it makes the database query only if the association is not loaded
Present? and Blank?
These relation’s methods are essentially the same, you can be sure of this by checking activesupport/lib/active_support/core_ext/object/blank.rb:
def blank?
respond_to?(:empty?) ? !!empty? : !self
end def present?
!blank?
end
So that means that every time when you call present?
you actually call blank?
and there’re some things that you need to know before using these methods:
- It generates SQL, e.g.
SELECT "users".* FROM "users"
- The result is Memorized
- It allows to not run a query if it already loaded
- But should not be used if the ActiveRecord::Relation will never be used in its entirety after
present?
orblank?
Example of good using (in that case all relations were uploaded only in the first string):
@relation.present?
@relation.first(3).each ...
Exists?
exists?
in my opinion is one of the most insidious relation’s method because usually people don’t really know how it’s working and use it because of good sounds. Particular qualities of exists?
:
- It generates SQL:
SELECT 1 AS one FROM "users" LIMIT 1
- Not memorized. At all
- Always runs queries. Doesn’t loaded a query
- Would be better off using
present?
orblank?
The most important difference between exist?
and blank?
is in exist’s LIMIT. Returning to the initial issues, the fix of our 30+ minutes query we just needed to use exist?
instead of present?
because in the first case only one record is uploaded and the second — all 2 million.
Example of bad exists?
using:
- if @users.exists? # SELECT 1 ... = @users.size # COUNT ...- if @users.exists? # SELECT 1 ...- @users.each do |user| # loads the entire relation ...
Based on blank?
knowledges you may see that this code could be overwritten with only one query using, which improve performance.
Any? None? Empty?
Based on the source (activerecord/lib/active_record/relation.rb) you can see that all these methods are also essentially the same and in the end are linked on exists?
method.
def any? return super if block_given?
!empty?end
def none? return super if block_given? empty?end
def empty? return @records.empty? if loaded? !exists?end
Rules of use for these methods are the same as for the exists?
but additionally, you can use it in cases with taking a section of the ActiveRecord::Relation using first
or last
e.g:
@users.each if @users.any?# There’re actually two queries:@users.any? # (SELECT 1 AS one FROM ... LIMIT 1)@users.each # (SELECT "users".* FROM "users" WHERE ...)
Ways to improve code behind:
@users.each if @users.any?# Should be replaced with:@users.each if If @users.present?or@users.first(3).each if @users.any?or @users.each if @users.load.any?
Where
For example a method in the User model:
class User < ActiveRecord::Base def active_comments comments.where(active: true) endend
Every time when you call active_comments
method on a user you also make a request to a database. In using it in a view your controller’s code won’t help you to not cause requests every time. Alsoinclude
or other preload methods won’t help you to avoid an extra request.
One of the solutions:
class User has_many :comments has_many :active_comments, -> { active }, class_name: "Comment"endclass Comment belongs_to :user scope :active, -> { where(active: true) }endclass UsersController def index @users = User.includes(:active_comments) endend
Quick tips
- includes to avoid n+1
- find_each when loading a large amount of records
increments number of queries => reduce memory on loading a huge number of records - Check existence of a record with exists? Instead of present?
- Prefer ActiveRecord::Relation#size over ActiveRecord::Calculations#count
Summary
- try to keep the number of SQL queries to a minimum
- try to use preloaded queries
- database is always faster than Ruby in retrieving, parsing, or processing data