Complete Guide to Rails ActiveRecord Queries
Master database interactions in Ruby on Rails with comprehensive examples and best practices
What you'll learn:
- • Basic to advanced ActiveRecord query methods
- • Performance optimization techniques
- • Real-world examples with explanations
- • Common pitfalls and how to avoid them
Table of Contents
1. ActiveRecord Query Basics
ActiveRecord is Rails' Object-Relational Mapping (ORM) layer that provides a Ruby interface for database operations. Instead of writing raw SQL, you use Ruby methods to query your database.
Basic Query Structure
All Records
# SQL: SELECT * FROM users
User.all
First Record
# SQL: SELECT * FROM users LIMIT 1
User.first
Last Record
# SQL: SELECT * FROM users ORDER BY id DESC LIMIT 1
User.last
Method Chaining
ActiveRecord queries use method chaining. Most query methods return an ActiveRecord::Relation object, allowing you to chain multiple conditions together.
2. Finding Records
Find by ID
find - Raises exception if not found
# SQL: SELECT * FROM users WHERE id = 1
user = User.find(1)
# Raises ActiveRecord::RecordNotFound if user doesn't exist
find_by - Returns nil if not found
# SQL: SELECT * FROM users WHERE id = 1 LIMIT 1
user = User.find_by(id: 1)
# Returns nil if user doesn't exist
Multiple IDs
# SQL: SELECT * FROM users WHERE id IN (1, 2, 3)
users = User.find([1, 2, 3])
Dynamic Finders
find_by with attributes
# SQL: SELECT * FROM users WHERE email = 'john@example.com' LIMIT 1
user = User.find_by(email: 'john@example.com')
# Multiple conditions
# SQL: SELECT * FROM users WHERE name = 'John' AND active = true LIMIT 1
user = User.find_by(name: 'John', active: true)
find_or_create_by
# Finds existing record or creates a new one
user = User.find_or_create_by(email: 'john@example.com') do |u|
u.name = 'John Doe'
u.active = true
end
3. WHERE Conditions
Basic WHERE Clauses
Simple Equality
# SQL: SELECT * FROM users WHERE active = true
User.where(active: true)
# SQL: SELECT * FROM users WHERE name = 'John'
User.where(name: 'John')
Multiple Conditions (AND)
# SQL: SELECT * FROM users WHERE active = true AND age >= 18
User.where(active: true, age: 18..Float::INFINITY)
# Chaining where clauses (also AND)
User.where(active: true).where('age >= ?', 18)
Array/Range Conditions
# SQL: SELECT * FROM users WHERE id IN (1, 2, 3)
User.where(id: [1, 2, 3])
# SQL: SELECT * FROM users WHERE age BETWEEN 18 AND 65
User.where(age: 18..65)
# SQL: SELECT * FROM users WHERE created_at >= '2024-01-01'
User.where(created_at: Date.new(2024, 1, 1)..)
Advanced WHERE Conditions
String Conditions with Placeholders
# Positional placeholders (safer than string interpolation)
User.where('age > ? AND created_at < ?', 25, 1.year.ago)
# Named placeholders
User.where('age > :min_age AND city = :city', min_age: 25, city: 'New York')
LIKE Patterns
# SQL: SELECT * FROM users WHERE name LIKE '%John%'
User.where('name LIKE ?', '%John%')
# Case-insensitive search (PostgreSQL)
User.where('name ILIKE ?', '%john%')
NULL Checks
# SQL: SELECT * FROM users WHERE email IS NOT NULL
User.where.not(email: nil)
# SQL: SELECT * FROM users WHERE email IS NULL
User.where(email: nil)
OR Conditions
# SQL: SELECT * FROM users WHERE active = true OR admin = true
User.where(active: true).or(User.where(admin: true))
# More complex OR with different models
User.where(id: Post.where(published: true).select(:user_id))
.or(User.where(admin: true))
4. JOINs and Includes
Understanding Associations
Before diving into JOINs, let's assume these model associations:
class User < ApplicationRecord
has_many :posts
has_many :comments
belongs_to :company, optional: true
end
class Post < ApplicationRecord
belongs_to :user
has_many :comments
belongs_to :category
end
class Comment < ApplicationRecord
belongs_to :user
belongs_to :post
end
joins vs includes
joins - Inner Join (No N+1 Prevention)
# SQL: SELECT users.* FROM users INNER JOIN posts ON posts.user_id = users.id
User.joins(:posts)
# Access associated data requires additional queries
users = User.joins(:posts)
users.each { |user| puts user.posts.count } # N+1 queries!
includes - Eager Loading (Prevents N+1)
# Preloads posts to prevent N+1 queries
users = User.includes(:posts)
users.each { |user| puts user.posts.count } # No additional queries!
# Can also filter on included associations
User.includes(:posts).where(posts: { published: true })
left_joins - Left Outer Join
# SQL: SELECT users.* FROM users LEFT OUTER JOIN posts ON posts.user_id = users.id
User.left_joins(:posts)
# Useful for finding users without posts
User.left_joins(:posts).where(posts: { id: nil })
Complex Joins
Multiple Associations
# Join multiple tables
User.joins(:posts, :comments)
# Nested associations
User.joins(posts: :comments)
# Mixed associations
User.joins(:company, posts: [:category, :comments])
Custom Join Conditions
# Custom join with string SQL
User.joins('LEFT JOIN posts ON posts.user_id = users.id AND posts.published = true')
# Using Arel for complex conditions
users_table = User.arel_table
posts_table = Post.arel_table
User.joins(users_table.join(posts_table, Arel::Nodes::OuterJoin)
.on(posts_table[:user_id].eq(users_table[:id])
.and(posts_table[:published].eq(true))).join_sources)
5. Aggregations and Calculations
Basic Calculations
Count
# SQL: SELECT COUNT(*) FROM users
User.count
# SQL: SELECT COUNT(*) FROM users WHERE active = true
User.where(active: true).count
# Count with specific column
User.count(:email) # Excludes NULL values
Sum, Average, Maximum, Minimum
# SQL: SELECT SUM(age) FROM users
User.sum(:age)
# SQL: SELECT AVG(age) FROM users WHERE active = true
User.where(active: true).average(:age)
# SQL: SELECT MAX(created_at) FROM users
User.maximum(:created_at)
# SQL: SELECT MIN(age) FROM users
User.minimum(:age)
GROUP BY and HAVING
Basic Grouping
# SQL: SELECT status, COUNT(*) FROM posts GROUP BY status
Post.group(:status).count
# SQL: SELECT user_id, AVG(rating) FROM reviews GROUP BY user_id
Review.group(:user_id).average(:rating)
# Multiple columns
Post.group(:status, :category_id).count
HAVING Clauses
# SQL: SELECT user_id, COUNT(*) FROM posts GROUP BY user_id HAVING COUNT(*) > 5
Post.group(:user_id).having('COUNT(*) > 5').count
# Using placeholder values
Post.group(:status).having('COUNT(*) > ?', 10).count
6. Scoping and Ordering
Order and Limit
Ordering Results
# SQL: SELECT * FROM users ORDER BY created_at DESC
User.order(created_at: :desc)
# Multiple columns
User.order(:name, created_at: :desc)
# String-based ordering
User.order('name ASC, created_at DESC')
# Random order (PostgreSQL)
User.order('RANDOM()')
# Random order (MySQL)
User.order('RAND()')
Limiting Results
# SQL: SELECT * FROM users ORDER BY created_at DESC LIMIT 10
User.order(created_at: :desc).limit(10)
# With offset (pagination)
# SQL: SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 20
User.order(created_at: :desc).limit(10).offset(20)
# First and last with ordering
User.order(:name).first(5) # First 5 alphabetically
User.order(:name).last(5) # Last 5 alphabetically
Scopes
Defining Scopes
class User < ApplicationRecord
scope :active, -> { where(active: true) }
scope :adults, -> { where('age >= ?', 18) }
scope :recent, -> { where('created_at > ?', 1.week.ago) }
scope :by_name, ->(name) { where('name LIKE ?', "%#{name}%") }
end
Using Scopes
# Chain scopes together
User.active.adults.recent
# Combine with other query methods
User.active.where(city: 'New York').order(:name)
# Parametrized scopes
User.by_name('John').active
Default Scope (Use Carefully)
class Post < ApplicationRecord
default_scope { where(published: true) }
end
# All queries will include published: true unless unscoped
Post.all # Only published posts
Post.unscoped.all # All posts including unpublished
7. Advanced Query Techniques
Subqueries
Subqueries with WHERE
# Find users who have published posts
User.where(id: Post.where(published: true).select(:user_id))
# Find users with more than 5 posts
User.where('id IN (?)', Post.group(:user_id).having('COUNT(*) > 5').select(:user_id))
EXISTS Queries
# Users who have at least one post
User.where('EXISTS (?)', Post.where('posts.user_id = users.id').select('1'))
# More readable using joins
User.joins(:posts).distinct
Raw SQL and find_by_sql
When to Use Raw SQL
# Complex queries that are hard to express in ActiveRecord
users = User.find_by_sql([
"SELECT users.*, COUNT(posts.id) as post_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
WHERE users.created_at > ?
GROUP BY users.id
HAVING COUNT(posts.id) > ?",
1.month.ago, 5
])
# Accessing custom attributes
users.first.post_count # Available as method
Using select for calculated fields
# Add calculated fields to your query
users = User.select('users.*,
(SELECT COUNT(*) FROM posts WHERE posts.user_id = users.id) as post_count')
users.first.post_count # Available as method
8. Performance Optimization
N+1 Query Problem
The Problem
# This will execute N+1 queries (1 for users, N for posts)
users = User.limit(10)
users.each do |user|
puts user.posts.count # Each iteration hits the database
end
Solutions
# Solution 1: includes (preloading)
users = User.includes(:posts).limit(10)
users.each { |user| puts user.posts.count } # No additional queries
# Solution 2: joins with select
users = User.joins(:posts)
.select('users.*, COUNT(posts.id) as post_count')
.group('users.id')
.limit(10)
users.each { |user| puts user.post_count }
# Solution 3: counter_cache
class User < ApplicationRecord
has_many :posts
end
class Post < ApplicationRecord
belongs_to :user, counter_cache: true
end
# After migration to add posts_count column
users.each { |user| puts user.posts_count } # No queries!
Efficient Data Retrieval
select - Only fetch needed columns
# Instead of SELECT *
User.all
# Select only needed columns
User.select(:id, :name, :email)
# For large datasets with text columns
Post.select(:id, :title, :created_at) # Skip content column
pluck - Get arrays of values
# Get array of IDs (no model instantiation)
user_ids = User.where(active: true).pluck(:id)
# Multiple columns
User.pluck(:id, :name) # Returns array of arrays
# Memory efficient for large datasets
User.where(active: true).pluck(:email) # Less memory than .map(&:email)
find_each - Batch processing
# Process large datasets in batches (default 1000)
User.find_each do |user|
# Process user
user.update_some_attribute
end
# Custom batch size
User.find_each(batch_size: 500) do |user|
# Process user
end
# find_in_batches for batch operations
User.find_in_batches(batch_size: 1000) do |users|
# Process batch of 1000 users
User.where(id: users.map(&:id)).update_all(processed: true)
end
9. Best Practices
Security Best Practices
Always Use Parameterized Queries
# BAD - SQL injection vulnerable
User.where("name = '#{params[:name]}'")
# GOOD - Safe parameterized query
User.where('name = ?', params[:name])
# BETTER - Hash conditions when possible
User.where(name: params[:name])
Validate Input
# Validate and sanitize user input
def search_users(name)
return User.none if name.blank?
User.where('name ILIKE ?', "%#{name.strip}%").limit(100)
end
Code Organization
Use Scopes for Reusable Queries
class User < ApplicationRecord
scope :active, -> { where(active: true) }
scope :recent, -> { where('created_at > ?', 1.week.ago) }
scope :with_posts, -> { joins(:posts).distinct }
# Chainable and reusable
end
# Usage
User.active.recent.with_posts
Extract Complex Queries to Methods
class User < ApplicationRecord
def self.top_contributors(limit = 10)
joins(:posts)
.select('users.*, COUNT(posts.id) as post_count')
.group('users.id')
.order('post_count DESC')
.limit(limit)
end
def self.search_by_name(query)
return none if query.blank?
where('name ILIKE ?', "%#{query.strip}%")
end
end
Quick Tips
- • Use
includes
to prevent N+1 queries - • Prefer
pluck
overmap
for simple data extraction - • Use
find_each
for processing large datasets - • Add database indexes for frequently queried columns
- • Use
explain
to analyze query performance - • Consider counter caches for frequently counted associations