Skip to main content

SQL Patterns in Rails: What They Mean and How to Express Them Safely

Not every SQL pattern maps 1:1 to ActiveRecord. This page gives the translation and the gotchas.

Pattern-first approach: Start with the SQL pattern you need, understand its purpose, then choose the safest Rails implementation β€” whether that's ActiveRecord, Arel, or raw SQL.

πŸ“– Pattern Deep Dives: Real-World Applications

Learn how these patterns solve actual production problems with detailed explanations and edge cases:

JOIN Patterns

Query Fundamentals

How Do You Write a Semi-Join (EXISTS) in ActiveRecord?

Purpose: Find records in table A where a related record exists in table B, without returning duplicate rows from A.

SQL Pattern

SELECT * FROM users
WHERE EXISTS (
  SELECT 1 FROM posts
  WHERE posts.user_id = users.id
  AND posts.published = true
)

βœ“ Safe ActiveRecord Translation

User.where(id: Post.where(published: true).select(:user_id))

This generates an IN subquery which PostgreSQL optimizes similarly to EXISTS. No duplicates possible.

⚠️ Problematic Alternative

User.joins(:posts).where(posts: { published: true })

Problem: Returns duplicate User rows if a user has multiple published posts. You'd need .distinct, which can hide performance issues.

When to Verify

Check .to_sql to confirm it's using IN (subquery) or EXISTS, not a JOIN. Run both queries and compare row counts β€” they should match exactly.

How Do You Write an Anti-Join (NOT EXISTS) in ActiveRecord?

Purpose: Find records in table A where NO related record exists in table B.

SQL Pattern

SELECT * FROM users
WHERE NOT EXISTS (
  SELECT 1 FROM posts
  WHERE posts.user_id = users.id
)

Or equivalently with LEFT JOIN:

SELECT users.* FROM users
LEFT JOIN posts ON users.id = posts.user_id
WHERE posts.id IS NULL

βœ“ Safe ActiveRecord Translations

Option 1: Subquery (clearer intent)

User.where.not(id: Post.select(:user_id))

Option 2: LEFT JOIN + NULL check

User.left_joins(:posts).where(posts: { id: nil })

Both work. The subquery version is usually clearer. Verify .to_sql and check that users with no posts appear in the result.

βœ— Common Mistake

User.joins(:posts).where.not(posts: { id: nil })

Problem: This returns users who DO have posts (because INNER JOIN filters out users without posts). Exactly the opposite of what you want!

How Do You Deduplicate Join Results Without DISTINCT?

Problem: JOIN produces duplicate rows, so you slap .distinct on it. This works but can hide a bigger issue.

The Symptom

# Returns 1000 duplicate user rows from database, dedups in memory
User.joins(:posts).joins(:comments).distinct  # Cartesian explosion!

Run this without .distinct and check .count vs .distinct.count. If they differ by 10Γ—+, you have a problem.

Better Approaches

1. Use Subqueries (Semi-Join)

User.where(id: Post.select(:user_id))
    .where(id: Comment.select(:user_id))

2. Use EXISTS for Complex Conditions

User.where('EXISTS (SELECT 1 FROM posts WHERE posts.user_id = users.id)')
    .where('EXISTS (SELECT 1 FROM comments WHERE comments.user_id = users.id)')

3. Only Use DISTINCT When Legitimately Needed

If you're counting or selecting specific columns and duplicates are semantically correct, .distinct is fine. Just verify the performance.

How Does Keyset Pagination Work in ActiveRecord?

Purpose: Paginate large result sets efficiently without OFFSET (which gets slower as you go deeper).

SQL Pattern

-- First page
SELECT * FROM posts ORDER BY created_at DESC, id DESC LIMIT 10

-- Next page (using last seen created_at and id)
SELECT * FROM posts
WHERE (created_at, id) < ('2024-01-01 10:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 10

ActiveRecord Translation

# First page
posts = Post.order(created_at: :desc, id: :desc).limit(10)

# Next page
last_post = posts.last
Post.where('(created_at, id) < (?, ?)', last_post.created_at, last_post.id)
    .order(created_at: :desc, id: :desc)
    .limit(10)

This stays fast even at page 1000 because it uses an index seek, not a scan + offset.

Index Requirement

You MUST have a composite index on (created_at DESC, id DESC) for this to work efficiently.

add_index :posts, [:created_at, :id], order: { created_at: :desc, id: :desc }

How Do You Get the Top-N Records Per Group in Rails?

Purpose: Get the top N records for each group (e.g., 5 most recent posts per user).

SQL Pattern (PostgreSQL/Modern SQL)

SELECT *
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) as rn
  FROM posts
) ranked
WHERE rn <= 5

ActiveRecord Limitation

ActiveRecord has limited window function support (Rails 7+), but this specific pattern is awkward to express. You have two options:

Option 1: Raw SQL

sql = <<~SQL
  SELECT *
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) as rn
    FROM posts
  ) ranked
  WHERE rn <= 5
SQL

Post.find_by_sql(sql)

Option 2: N+1 in Ruby (for small N)

# Only acceptable if you have < 100 users
User.includes(:posts).each do |user|
  user.posts.order(created_at: :desc).limit(5)
end

Recommendation: Use raw SQL or create a database view for top-N-per-group queries. Don't fight ActiveRecord on this one.

Does ActiveRecord Support CTEs (WITH Clauses)?

Purpose: Break complex queries into readable, reusable parts. Also enables recursive queries.

SQL Pattern

WITH active_users AS (
  SELECT * FROM users WHERE active = true
),
recent_posts AS (
  SELECT * FROM posts WHERE created_at > NOW() - INTERVAL '7 days'
)
SELECT u.*, COUNT(p.id) as post_count
FROM active_users u
LEFT JOIN recent_posts p ON u.id = p.user_id
GROUP BY u.id

ActiveRecord Support

Rails 7.1+ added .with for CTEs, but it's limited and verbose for complex cases.

# Rails 7.1+
Post.with(
  active_users: User.where(active: true),
  recent_posts: Post.where('created_at > ?', 7.days.ago)
).from('active_users u')
  .joins('LEFT JOIN recent_posts p ON u.id = p.user_id')
  .group('u.id')
  .select('u.*, COUNT(p.id) as post_count')

Recommendation: For anything beyond trivial CTEs, use raw SQL or create a database view. The ActiveRecord syntax is harder to read than the SQL.

When to Use CTEs

  • β€’ Recursive queries (org charts, threaded comments)
  • β€’ Complex analytics with multiple aggregation steps
  • β€’ Queries you want to be readable by future maintainers

Pattern Cheat Sheet Summary

βœ“ ActiveRecord Handles Well

  • β€’ Semi-joins (EXISTS β†’ WHERE id: subquery)
  • β€’ Anti-joins (NOT EXISTS β†’ WHERE.not)
  • β€’ Keyset pagination
  • β€’ Simple aggregations

βœ— Use Raw SQL or Views For

  • β€’ Window functions (ROW_NUMBER, etc.)
  • β€’ Complex CTEs
  • β€’ Top-N per group
  • β€’ UNION, FULL OUTER JOIN

Related Resources

Examples

Copy-paste examples of these patterns with verification notes.

Performance Tips

Learn to diagnose slow queries and index problems.

Guide

Complete guide to safe SQL→ActiveRecord conversion.

Try the Converter

Convert your SQL patterns to ActiveRecord, then verify with .to_sql and this guide.