Converted 50 SQL Queries to ActiveRecord. 43 Were Wrong. The 7 That Worked Taught Me Everything.
Why conversion failure is the fastest teacher—and the patterns that finally made ActiveRecord click
If your background is SQL and you keep shipping broken Rails queries, congratulations—you’re learning. This is a field guide to failed SQL to ActiveRecord conversions and the patterns that finally stick.
I came to Rails after years of writing hand-tuned SQL, the kind where EXPLAIN (ANALYZE) is muscle memory. In my first migration, I converted 50 production queries from raw SQL to ActiveRecord. 43 were wrong on the first pass. That week produced:
- A duplicate-invoice bug that charged 27 customers twice (fixed and refunded the same day).
- A painful p95 request time spike on the reporting page from 420ms → 1.8s, then back down to 240ms after we fixed joins and preloading.
- PostgreSQL CPU sustained at 85% until we killed an N+1 that did 312 queries per page → 17.
That failure streak taught me the seven patterns that do translate and the five mistakes that always bite. Here’s the playbook I wish I had.
Why SQL-to-ActiveRecord Conversions Fail (at first)
ActiveRecord is not “SQL in Ruby.” It’s relation composition with a bunch of defaults (type coercion, table inference, automatic joins through associations, preload behavior) that don’t exist in SQL. Meaning:
- You think in strings; ActiveRecord wants you to think in relations you can compose, merge, and defer.
- You expect explicit join graphs; ActiveRecord often infers them from associations unless you pin them down.
- You expect one shape of rows; ActiveRecord returns models (plus eager-loaded associations) which can change what’s actually fetched.
Once you accept that, the conversion stops being a translation exercise and starts being design.
The 7 Patterns That Actually Work
Below are the seven conversion patterns that finally clicked for me. Each one is a building block.
1) Use joins + merge for predicate scoping
SQL
-- Paid orders in the last 30 days
SELECT users.*
FROM users
JOIN orders ON orders.user_id = users.id
WHERE orders.status = 'paid'
AND orders.paid_at >= NOW() - INTERVAL '30 days';
ActiveRecord (Rails 7+, Ruby 3+)
# why: merge preserves the order/scope logic owned by Order, avoids duplicating conditions
paid_recent = Order.paid.where('paid_at >= ?', 30.days.ago) # scope on Order
users = User.joins(:orders).merge(paid_recent)
Check yourself
users.to_sql # why: verify the ON/WHERE clauses match mental model
2) Reach for includes when the page reads associated data
# why: eliminates N+1 when rendering users and their last order
users = User.includes(:orders).where(active: true).limit(50)
Gotcha: If you filter on the included table with a string condition, add .references(:orders) or use .joins instead; without it, Rails preloads the association in a separate query, so the condition either raises ActiveRecord::StatementInvalid or is silently ignored, depending on version. Hash conditions like where(orders: { status: 'paid' }) are detected automatically and force a single LEFT OUTER JOIN.
3) Aggregate with group + having (be explicit on selected columns)
SQL
SELECT users.id, SUM(orders.total_cents) AS gross
FROM users
JOIN orders ON orders.user_id = users.id
GROUP BY users.id
HAVING SUM(orders.total_cents) > 50000
ORDER BY gross DESC
LIMIT 10;
ActiveRecord
# why: explicit select keeps AR from selecting every column and reduces transfer
User
.joins(:orders)
.group('users.id')
.select('users.id, SUM(orders.total_cents) AS gross')
.having('SUM(orders.total_cents) > ?', 50_000)
.order('gross DESC')
.limit(10)
4) Use distinct to neutralize multiplicative joins
# why: joins on has_many can duplicate rows; distinct restores set semantics
User.joins(:orders).where(orders: { status: 'paid' }).distinct
5) When conditions span tables, pin the join type
# why: left_outer_joins preserves rows without matches; inner joins would drop them
User.left_outer_joins(:orders).where(orders: { id: nil }) # users with no orders
6) Prefer scopes over string fragments
# why: scopes encode domain language and remain composable/testable
class Order < ApplicationRecord
scope :paid, -> { where(status: 'paid') }
scope :recent, -> { where('paid_at >= ?', 30.days.ago) }
end
User.joins(:orders).merge(Order.paid.recent)
7) Use Arel or SQL for window functions/JSONB, but wrap them in scopes
# why: window functions + JSONB operators aren’t first-class in AR; use SQL but hide it
class Order < ApplicationRecord
scope :ranked_by_total, lambda {
select(<<~SQL)
orders.*, RANK() OVER (PARTITION BY user_id ORDER BY total_cents DESC) AS rank
SQL
}
scope :from_stripe, -> { where("metadata @> ?", { source: 'stripe' }.to_json) }
end
The 5 Most Common Failure Modes (and how to catch them)
1) Filtering on an included table without references
Symptom: the condition raises ActiveRecord::StatementInvalid, is silently ignored, or AR flips to a JOIN unexpectedly.
Fix: includes(:orders).references(:orders), or rewrite to joins.
2) Counting the wrong thing
Symptom: User.joins(:orders).count returns join rows, not users.
Fix: User.joins(:orders).distinct.count(:id) or count("DISTINCT users.id").
3) Boolean logic changes across scopes
Symptom: where(a).or(where(b)) raises on structurally incompatible relations, or the OR groups differently than you intended.
Fix: build both sides from the same base relation, group with merge or Arel, and test the generated SQL explicitly.
4) Multiplicative preload
Symptom: memory bloat after adding .includes(:big_has_many).
Fix: paginate (Pagy), preload selectively (preload(:assoc)), or use counter caches.
5) Timezone/type coercion surprises
Symptom: date filters off by hours.
Fix: always bind parameters (where('paid_at >= ?', 30.days.ago)) so AR casts to DB types.
Two Real-World Case Studies (with numbers)
Case A — Reporting dashboard
- Before: raw SQL with multiple JOINs rendered a table of customers and their last order. New features kept piling on JOINs; maintainers copied queries around. p95 was 1.4s at 50k MAU; DB CPU averaged 70%.
- After: switched to includes(:orders) for the view path, introduced Order.paid.recent scopes, added a covering index, and moved an aggregate to a materialized view refreshed nightly by Sidekiq. p95 dropped to 220ms, DB CPU to 55%.
- Guardrails: Bullet caught one N+1; Rack::MiniProfiler confirmed the query count; PgHero verified the index hit ratio.
Case B — Billing backfill
- Before: an eager conversion used joins instead of left_outer_joins, unintentionally skipping users with pending invoices. We under-billed ~3% of accounts for a week.
- After: fixed the join, wrapped the logic in an audited service object, and added an RSpec example asserting that relation.to_sql contains LEFT OUTER JOIN. Added a dashboard alert on “zero invoices created” using Skylight.

Mistake that stuck: we once shipped .includes(:orders).where(orders: { status: 'paid' }) without .references. The filter didn’t apply under certain adapters. We now lint for that pattern and fail CI.
Conversion Recipes (4 Common SQL → AR Mappings)
Recipe 1 — Filter across an association, avoid N+1
SQL
SELECT users.*
FROM users
JOIN orders ON orders.user_id = users.id
WHERE orders.status = 'paid'
ORDER BY users.created_at DESC
LIMIT 20;
ActiveRecord
# why: joins for filtering; includes for rendering associated data without N+1
users = User.joins(:orders).merge(Order.paid)
.includes(:orders) # safe because no WHERE on orders
.order(created_at: :desc).limit(20)
Expected
- users.to_a.size ≤ 20; query count ~2 (one for users, one to preload orders)
Recipe 2 — COUNT(DISTINCT …)
SQL
SELECT COUNT(DISTINCT users.id)
FROM users
JOIN orders ON orders.user_id = users.id
WHERE orders.status = 'paid';
ActiveRecord
# why: distinct count avoids multiplicative join inflation
User.joins(:orders).merge(Order.paid).distinct.count(:id)
Recipe 3 — Top-N by aggregate
SQL
SELECT users.id, SUM(total_cents) AS gross
FROM orders
JOIN users ON users.id = orders.user_id
GROUP BY users.id
ORDER BY gross DESC
LIMIT 5;
ActiveRecord
# why: explicit select reduces payload and clarifies intent
User.joins(:orders)
.group('users.id')
.select('users.id, SUM(orders.total_cents) AS gross')
.order('gross DESC').limit(5)
Recipe 4 — Window function (use SQL, wrap in scope)
SQL
SELECT *, RANK() OVER (PARTITION BY user_id ORDER BY total_cents DESC) AS rank
FROM orders;
ActiveRecord
# why: window functions are a fine place to use SQL explicitly
Order.select(<<~SQL)
orders.*, RANK() OVER (PARTITION BY user_id ORDER BY total_cents DESC) AS rank
SQL
Tooling, Tests, and Review (the guardrails)
- Bullet: fail PRs that add N+1s.
- Rack::MiniProfiler: show query timelines and add notes when counts regress.
- PgHero: surface slow queries and index suggestions.
- Brakeman: catch unsafe string interpolation before it hits prod.
- StandardRB: keep scopes and service objects readable; use it as a shared rubric for code-review comments.
- RSpec: assert on to_sql for safety-critical joins and distinct usage; integration specs measure p95 with synthetic data.
When NOT to Convert (trade-offs)
Use Arel or raw SQL when:
- You need window functions, recursive CTEs, or heavy JSONB operations.
- You must control join order and optimizer hints.
- The data shape isn’t a model (e.g., reporting exports).
Stick with ActiveRecord when:
- The query participates in app invariants and benefits from composability and type casting.
- You want framework features like strict_loading, automatic created_at/updated_at handling, enums, and callbacks.
Trade-off: raw SQL is explicit and often faster to write for complex reports, but maintenance cost explodes when schemas evolve. Relations + scopes make refactors safe—renames cascade through associations and tests.
Trade-off: ActiveRecord is concise, but that brevity hides behavior. Always verify the generated SQL in review; paste relation.to_sql into psql or PgHero to reason about the plan.
Final Thoughts — Mistakes Are the Curriculum
Those 43 failed conversions weren’t a waste. They were a map. If you’re sitting on a pile of broken PRs, keep going. Adopt the seven patterns, watch for the failure modes, and wire in guardrails. The payoff is measurable: on our last refactor we shipped a dashboard from 1.8s p95 → 240ms, dropped query count from 312 → 17, and pulled PG CPU from 85% → 55%. That’s what “clicked” looks like.
How We Verify Conversions
Every conversion shown on this site follows a strict verification process to ensure correctness:
- Compare results on same dataset — We run both SQL and ActiveRecord against identical test data and verify results match
- Check generated SQL with to_sql — We inspect the actual SQL Rails generates to catch semantic differences (INNER vs LEFT JOIN, WHERE vs ON, etc.)
- Add regression tests for tricky cases — Edge cases like NOT EXISTS, anti-joins, and predicate placement are tested with multiple scenarios
- Tested on Rails 8.1.1 — All conversions verified on current Rails version to ensure compatibility
Last updated: February 22, 2026
Raza Hussain
Full-stack developer specializing in Ruby on Rails, React, and modern JavaScript. 15+ years upgrading and maintaining production Rails apps. Led Rails 4/5 → 7 upgrades with 40% performance gains, migrated apps from Heroku to Render cutting costs by 35%, and built systems for StatusGator, CryptoZombies, and others. Available for Rails upgrades, performance work, and cloud migrations.