Skip to main content

Complete Guide to Rails ActiveRecord Queries

Master database interactions in Ruby on Rails with comprehensive examples and best practices

What you'll learn:

  • • Basic to advanced ActiveRecord query methods
  • • Performance optimization techniques
  • • Real-world examples with explanations
  • • Common pitfalls and how to avoid them

1. ActiveRecord Query Basics

ActiveRecord is Rails' Object-Relational Mapping (ORM) layer that provides a Ruby interface for database operations. Instead of writing raw SQL, you use Ruby methods to query your database.

Basic Query Structure

All Records

# SQL: SELECT * FROM users
User.all

First Record

# SQL: SELECT * FROM users LIMIT 1
User.first

Last Record

# SQL: SELECT * FROM users ORDER BY id DESC LIMIT 1
User.last

Method Chaining

ActiveRecord queries use method chaining. Most query methods return an ActiveRecord::Relation object, allowing you to chain multiple conditions together.

2. Finding Records

Find by ID

find - Raises exception if not found

# SQL: SELECT * FROM users WHERE id = 1
user = User.find(1)
# Raises ActiveRecord::RecordNotFound if user doesn't exist

find_by - Returns nil if not found

# SQL: SELECT * FROM users WHERE id = 1 LIMIT 1
user = User.find_by(id: 1)
# Returns nil if user doesn't exist

Multiple IDs

# SQL: SELECT * FROM users WHERE id IN (1, 2, 3)
users = User.find([1, 2, 3])

Dynamic Finders

find_by with attributes

# SQL: SELECT * FROM users WHERE email = 'john@example.com' LIMIT 1
user = User.find_by(email: 'john@example.com')

# Multiple conditions
# SQL: SELECT * FROM users WHERE name = 'John' AND active = true LIMIT 1
user = User.find_by(name: 'John', active: true)

find_or_create_by

# Finds existing record or creates a new one
user = User.find_or_create_by(email: 'john@example.com') do |u|
  u.name = 'John Doe'
  u.active = true
end

3. WHERE Conditions

Basic WHERE Clauses

Simple Equality

# SQL: SELECT * FROM users WHERE active = true
User.where(active: true)

# SQL: SELECT * FROM users WHERE name = 'John'
User.where(name: 'John')

Multiple Conditions (AND)

# SQL: SELECT * FROM users WHERE active = true AND age >= 18
User.where(active: true, age: 18..Float::INFINITY)

# Chaining where clauses (also AND)
User.where(active: true).where('age >= ?', 18)

Array/Range Conditions

# SQL: SELECT * FROM users WHERE id IN (1, 2, 3)
User.where(id: [1, 2, 3])

# SQL: SELECT * FROM users WHERE age BETWEEN 18 AND 65
User.where(age: 18..65)

# SQL: SELECT * FROM users WHERE created_at >= '2024-01-01'
User.where(created_at: Date.new(2024, 1, 1)..)

Advanced WHERE Conditions

String Conditions with Placeholders

# Positional placeholders (safer than string interpolation)
User.where('age > ? AND created_at < ?', 25, 1.year.ago)

# Named placeholders
User.where('age > :min_age AND city = :city', min_age: 25, city: 'New York')

LIKE Patterns

# SQL: SELECT * FROM users WHERE name LIKE '%John%'
User.where('name LIKE ?', '%John%')

# Case-insensitive search (PostgreSQL)
User.where('name ILIKE ?', '%john%')

NULL Checks

# SQL: SELECT * FROM users WHERE email IS NOT NULL
User.where.not(email: nil)

# SQL: SELECT * FROM users WHERE email IS NULL
User.where(email: nil)

OR Conditions

# SQL: SELECT * FROM users WHERE active = true OR admin = true
User.where(active: true).or(User.where(admin: true))

# More complex OR with different models
User.where(id: Post.where(published: true).select(:user_id))
  .or(User.where(admin: true))

4. JOINs and Includes

Understanding Associations

Before diving into JOINs, let's assume these model associations:

class User < ApplicationRecord
  has_many :posts
  has_many :comments
  belongs_to :company, optional: true
end

class Post < ApplicationRecord
  belongs_to :user
  has_many :comments
  belongs_to :category
end

class Comment < ApplicationRecord
  belongs_to :user
  belongs_to :post
end

joins vs includes

joins - Inner Join (No N+1 Prevention)

# SQL: SELECT users.* FROM users INNER JOIN posts ON posts.user_id = users.id
User.joins(:posts)

# Access associated data requires additional queries
users = User.joins(:posts)
users.each { |user| puts user.posts.count } # N+1 queries!

includes - Eager Loading (Prevents N+1)

# Preloads posts to prevent N+1 queries
users = User.includes(:posts)
users.each { |user| puts user.posts.count } # No additional queries!

# Can also filter on included associations
User.includes(:posts).where(posts: { published: true })

left_joins - Left Outer Join

# SQL: SELECT users.* FROM users LEFT OUTER JOIN posts ON posts.user_id = users.id
User.left_joins(:posts)

# Useful for finding users without posts
User.left_joins(:posts).where(posts: { id: nil })

Complex Joins

Multiple Associations

# Join multiple tables
User.joins(:posts, :comments)

# Nested associations
User.joins(posts: :comments)

# Mixed associations
User.joins(:company, posts: [:category, :comments])

Custom Join Conditions

# Custom join with string SQL
User.joins('LEFT JOIN posts ON posts.user_id = users.id AND posts.published = true')

# Using Arel for complex conditions
users_table = User.arel_table
posts_table = Post.arel_table
User.joins(users_table.join(posts_table, Arel::Nodes::OuterJoin)
  .on(posts_table[:user_id].eq(users_table[:id])
  .and(posts_table[:published].eq(true))).join_sources)

5. Aggregations and Calculations

Basic Calculations

Count

# SQL: SELECT COUNT(*) FROM users
User.count

# SQL: SELECT COUNT(*) FROM users WHERE active = true
User.where(active: true).count

# Count with specific column
User.count(:email) # Excludes NULL values

Sum, Average, Maximum, Minimum

# SQL: SELECT SUM(age) FROM users
User.sum(:age)

# SQL: SELECT AVG(age) FROM users WHERE active = true
User.where(active: true).average(:age)

# SQL: SELECT MAX(created_at) FROM users
User.maximum(:created_at)

# SQL: SELECT MIN(age) FROM users
User.minimum(:age)

GROUP BY and HAVING

Basic Grouping

# SQL: SELECT status, COUNT(*) FROM posts GROUP BY status
Post.group(:status).count

# SQL: SELECT user_id, AVG(rating) FROM reviews GROUP BY user_id
Review.group(:user_id).average(:rating)

# Multiple columns
Post.group(:status, :category_id).count

HAVING Clauses

# SQL: SELECT user_id, COUNT(*) FROM posts GROUP BY user_id HAVING COUNT(*) > 5
Post.group(:user_id).having('COUNT(*) > 5').count

# Using placeholder values
Post.group(:status).having('COUNT(*) > ?', 10).count

6. Scoping and Ordering

Order and Limit

Ordering Results

# SQL: SELECT * FROM users ORDER BY created_at DESC
User.order(created_at: :desc)

# Multiple columns
User.order(:name, created_at: :desc)

# String-based ordering
User.order('name ASC, created_at DESC')

# Random order (PostgreSQL)
User.order('RANDOM()')

# Random order (MySQL)
User.order('RAND()')

Limiting Results

# SQL: SELECT * FROM users ORDER BY created_at DESC LIMIT 10
User.order(created_at: :desc).limit(10)

# With offset (pagination)
# SQL: SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 20
User.order(created_at: :desc).limit(10).offset(20)

# First and last with ordering
User.order(:name).first(5)  # First 5 alphabetically
User.order(:name).last(5)   # Last 5 alphabetically

Scopes

Defining Scopes

class User < ApplicationRecord
  scope :active, -> { where(active: true) }
  scope :adults, -> { where('age >= ?', 18) }
  scope :recent, -> { where('created_at > ?', 1.week.ago) }
  scope :by_name, ->(name) { where('name LIKE ?', "%#{name}%") }
end

Using Scopes

# Chain scopes together
User.active.adults.recent

# Combine with other query methods
User.active.where(city: 'New York').order(:name)

# Parametrized scopes
User.by_name('John').active

Default Scope (Use Carefully)

class Post < ApplicationRecord
  default_scope { where(published: true) }
end

# All queries will include published: true unless unscoped
Post.all # Only published posts
Post.unscoped.all # All posts including unpublished

7. Advanced Query Techniques

Subqueries

Subqueries with WHERE

# Find users who have published posts
User.where(id: Post.where(published: true).select(:user_id))

# Find users with more than 5 posts
User.where('id IN (?)', Post.group(:user_id).having('COUNT(*) > 5').select(:user_id))

EXISTS Queries

# Users who have at least one post
User.where('EXISTS (?)', Post.where('posts.user_id = users.id').select('1'))

# More readable using joins
User.joins(:posts).distinct

Raw SQL and find_by_sql

When to Use Raw SQL

# Complex queries that are hard to express in ActiveRecord
users = User.find_by_sql([
  "SELECT users.*, COUNT(posts.id) as post_count 
   FROM users 
   LEFT JOIN posts ON posts.user_id = users.id 
   WHERE users.created_at > ? 
   GROUP BY users.id 
   HAVING COUNT(posts.id) > ?",
  1.month.ago, 5
])

# Accessing custom attributes
users.first.post_count # Available as method

Using select for calculated fields

# Add calculated fields to your query
users = User.select('users.*, 
  (SELECT COUNT(*) FROM posts WHERE posts.user_id = users.id) as post_count')

users.first.post_count # Available as method

8. Performance Optimization

N+1 Query Problem

The Problem

# This will execute N+1 queries (1 for users, N for posts)
users = User.limit(10)
users.each do |user|
  puts user.posts.count # Each iteration hits the database
end

Solutions

# Solution 1: includes (preloading)
users = User.includes(:posts).limit(10)
users.each { |user| puts user.posts.count } # No additional queries

# Solution 2: joins with select
users = User.joins(:posts)
           .select('users.*, COUNT(posts.id) as post_count')
           .group('users.id')
           .limit(10)
users.each { |user| puts user.post_count }

# Solution 3: counter_cache
class User < ApplicationRecord
  has_many :posts
end

class Post < ApplicationRecord
  belongs_to :user, counter_cache: true
end

# After migration to add posts_count column
users.each { |user| puts user.posts_count } # No queries!

Efficient Data Retrieval

select - Only fetch needed columns

# Instead of SELECT *
User.all

# Select only needed columns
User.select(:id, :name, :email)

# For large datasets with text columns
Post.select(:id, :title, :created_at) # Skip content column

pluck - Get arrays of values

# Get array of IDs (no model instantiation)
user_ids = User.where(active: true).pluck(:id)

# Multiple columns
User.pluck(:id, :name) # Returns array of arrays

# Memory efficient for large datasets
User.where(active: true).pluck(:email) # Less memory than .map(&:email)

find_each - Batch processing

# Process large datasets in batches (default 1000)
User.find_each do |user|
  # Process user
  user.update_some_attribute
end

# Custom batch size
User.find_each(batch_size: 500) do |user|
  # Process user
end

# find_in_batches for batch operations
User.find_in_batches(batch_size: 1000) do |users|
  # Process batch of 1000 users
  User.where(id: users.map(&:id)).update_all(processed: true)
end

9. Best Practices

Security Best Practices

Always Use Parameterized Queries

# BAD - SQL injection vulnerable
User.where("name = '#{params[:name]}'")

# GOOD - Safe parameterized query
User.where('name = ?', params[:name])

# BETTER - Hash conditions when possible
User.where(name: params[:name])

Validate Input

# Validate and sanitize user input
def search_users(name)
  return User.none if name.blank?
  User.where('name ILIKE ?', "%#{name.strip}%").limit(100)
end

Code Organization

Use Scopes for Reusable Queries

class User < ApplicationRecord
  scope :active, -> { where(active: true) }
  scope :recent, -> { where('created_at > ?', 1.week.ago) }
  scope :with_posts, -> { joins(:posts).distinct }
  
  # Chainable and reusable
end

# Usage
User.active.recent.with_posts

Extract Complex Queries to Methods

class User < ApplicationRecord
  def self.top_contributors(limit = 10)
    joins(:posts)
      .select('users.*, COUNT(posts.id) as post_count')
      .group('users.id')
      .order('post_count DESC')
      .limit(limit)
  end
  
  def self.search_by_name(query)
    return none if query.blank?
    where('name ILIKE ?', "%#{query.strip}%")
  end
end

Quick Tips

  • • Use includes to prevent N+1 queries
  • • Prefer pluck over map for simple data extraction
  • • Use find_each for processing large datasets
  • • Add database indexes for frequently queried columns
  • • Use explain to analyze query performance
  • • Consider counter caches for frequently counted associations