Skip to main content

Documentation

Everything you need to safely use the SQL to ActiveRecord converter: verification strategies, known limitations, testing approaches, and debugging production issues.

Core principle: This tool generates syntactically correct ActiveRecord code, but you must verify it produces the same query semantics as your original SQL before deploying to production.

How the Converter Works

The converter parses your SQL query and translates it into equivalent ActiveRecord method chains. It handles SELECT, INSERT, UPDATE, DELETE, JOINs, subqueries, aggregations, and more.

What Gets Converted

✓ Supported

  • • SELECT with WHERE, ORDER BY, LIMIT
  • • INNER JOIN, LEFT JOIN
  • • Subqueries (IN, EXISTS patterns)
  • • GROUP BY, HAVING
  • • Basic aggregations (COUNT, SUM, AVG)
  • • INSERT, UPDATE, DELETE
  • • DISTINCT
  • • CASE statements

✗ Limited/Not Supported

  • • FULL OUTER JOIN
  • • UNION / UNION ALL
  • • Window functions (ROW_NUMBER, etc.)
  • • CTEs (WITH clauses)
  • • Complex correlated subqueries
  • • Database-specific syntax
  • • PIVOT / UNPIVOT
  • • Recursive queries

⚠️ Important: Syntax ≠ Semantics

The converter produces valid Ruby code that will run without syntax errors. However, the generated ActiveRecord query may not return the same results as your SQL due to how ActiveRecord handles JOINs, NULL values, and query building. Always verify the output.

Verification Checklist

Use this checklist for every conversion before deploying to production:

Essential Verification Steps

1

Compare Generated SQL

Run .to_sql on your ActiveRecord relation and compare line-by-line with original SQL:

relation = User.joins(:posts).where(...)
puts relation.to_sql

# Check:
# - Is LEFT JOIN preserved or did it become INNER JOIN?
# - Are WHERE conditions in the right place (ON vs WHERE clause)?
# - Is DISTINCT added when it shouldn't be?
2

Validate Row Counts

Ensure both queries return the same number of rows. Use a subquery for accurate SQL row counts:

# Safe approach: wrap in SELECT COUNT(*) subquery
count_sql = "SELECT COUNT(*) FROM (#{original_sql}) subquery"
sql_count = ActiveRecord::Base.connection.select_value(count_sql)
ar_count = relation.count

raise "Count mismatch! SQL: #{sql_count}, AR: #{ar_count}" unless sql_count == ar_count

Why: .execute() returns adapter-specific result objects; .count on those objects may not reflect row count. Wrapping in SELECT COUNT(*) ensures an accurate integer result.

3

Compare Result Sets

Verify the actual IDs/data returned match. Use exec_query when you need result materialization:

# Safe: exec_query materializes results properly
result = ActiveRecord::Base.connection.exec_query(original_sql)
sql_ids = result.rows.map { |r| r[0] }.sort  # First column (id)
ar_ids = relation.pluck(:id).sort

missing = sql_ids - ar_ids  # In SQL but not ActiveRecord
extra = ar_ids - sql_ids    # In ActiveRecord but not SQL

puts "Missing IDs: #{missing.inspect}"
puts "Extra IDs: #{extra.inspect}"

Why: exec_query returns ActiveRecord::Result with properly materialized rows, unlike execute which returns raw adapter results.

4

Test Edge Cases

Specifically test with:

  • NULL values in joined tables
  • Empty result sets
  • Records with no associations (for LEFT JOINs)
  • Duplicate data (to catch missing DISTINCT)
5

Performance Check

Run EXPLAIN to ensure query plans are similar:

# PostgreSQL
puts relation.explain

# MySQL
puts relation.explain(:analyze)

Known Limitations

LEFT JOIN Condition Placement

Issue: ActiveRecord's .where() adds conditions to the WHERE clause, not the ON clause. This converts LEFT JOIN to INNER JOIN behavior.

Your SQL (correct):

LEFT JOIN posts ON users.id = posts.user_id AND posts.status = 'published'

Naive conversion (wrong):

.left_joins(:posts).where(posts: { status: 'published' })

Workaround:

.left_joins(:posts).where('posts.id IS NULL OR posts.status = ?', 'published')

Complex Subqueries

Issue: Correlated subqueries and nested subqueries may not convert correctly. The converter attempts basic subquery patterns but cannot handle all cases.

Recommendation: For complex subqueries, consider using raw SQL with ActiveRecord::Base.connection.execute or database views.

Database-Specific Features

Issue: PostgreSQL arrays, JSON operators, MySQL-specific functions, and other database-specific syntax cannot be converted to portable ActiveRecord code.

  • • PostgreSQL: array operators, jsonb, full-text search
  • • MySQL: JSON functions, FULLTEXT indexes
  • • SQL Server: PIVOT, specific date functions

Window Functions

Issue: SQL window functions (ROW_NUMBER, RANK, LAG, LEAD) have limited ActiveRecord support even in Rails 7+. The converter cannot reliably translate these.

Recommendation: Keep window functions as raw SQL or use database views.

GROUP BY Edge Cases

Issue: PostgreSQL requires all SELECT columns to appear in GROUP BY. MySQL (with ONLY_FULL_GROUP_BY off) does not. The converter may produce code that works in one database but not another.

Recommendation: Always test generated GROUP BY queries on your target database.

Testing Strategies

Writing Tests for Converted Queries

RSpec Example: Verify SQL Generation

RSpec.describe "User published posts query" do
  it "generates LEFT OUTER JOIN with conditions in ON clause" do
    relation = User.left_joins(:posts)
                   .where('posts.id IS NULL OR posts.published = true')

    sql = relation.to_sql

    # Verify JOIN type
    expect(sql).to include("LEFT OUTER JOIN")
    expect(sql).not_to include("INNER JOIN")

    # Verify condition placement
    # This is database-specific, adjust for your DB
    expect(sql).to match(/ON.*posts\.published/i)
  end

  it "returns correct result set" do
    # Setup test data
    user_with_post = create(:user)
    create(:post, user: user_with_post, published: true)

    user_without_post = create(:user)

    user_with_unpublished = create(:user)
    create(:post, user: user_with_unpublished, published: false)

    # Run query
    results = User.left_joins(:posts)
                  .where('posts.id IS NULL OR posts.published = true')

    # Verify
    expect(results).to include(user_with_post)
    expect(results).to include(user_without_post)
    expect(results).to include(user_with_unpublished)
  end
end

Minitest Example: Compare SQL and ActiveRecord

class UserQueriesTest < ActiveSupport::TestCase
  test "SQL and ActiveRecord return same IDs" do
    # Your original SQL
    original_sql = <<~SQL
      SELECT users.id FROM users
      LEFT JOIN posts ON users.id = posts.user_id
      WHERE posts.published = true OR posts.id IS NULL
    SQL

    # ActiveRecord version
    ar_relation = User.left_joins(:posts)
                      .where('posts.published = true OR posts.id IS NULL')

    # Compare IDs - use exec_query for safe result materialization
    result = ActiveRecord::Base.connection.exec_query(original_sql)
    sql_ids = result.rows.map { |r| r[0] }.sort

    ar_ids = ar_relation.pluck(:id).sort

    assert_equal sql_ids, ar_ids,
      "SQL and ActiveRecord queries return different results"
  end
end

Debugging Guide

Common Debugging Techniques

1. Inspect Generated SQL

# In rails console or your code
relation = User.joins(:posts).where(published: true)

# See the SQL
puts relation.to_sql

# See it formatted (if you have a formatter)
pp relation.to_sql

# Run EXPLAIN
puts relation.explain

2. Enable Query Logging

# In rails console
ActiveRecord::Base.logger = Logger.new(STDOUT)

# Or in config/environments/development.rb
config.log_level = :debug
config.active_record.verbose_query_logs = true

# Now all queries are logged with their source location

3. Compare Query Plans

# PostgreSQL
ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{original_sql}")
puts relation.explain

# Look for differences in:
# - Index usage (Seq Scan vs Index Scan)
# - JOIN algorithms (Hash Join vs Nested Loop)
# - Row estimates vs actual rows

4. Debug Data Differences

# Find records only in SQL result - use exec_query for safety
result = ActiveRecord::Base.connection.exec_query(original_sql)
sql_ids = result.rows.map { |r| r[0] }.to_set

ar_ids = relation.pluck(:id).to_set

# Records in SQL but not ActiveRecord
missing_ids = sql_ids - ar_ids
User.where(id: missing_ids).each do |user|
  puts "Missing user #{user.id}: #{user.inspect}"
  # Inspect associations to understand why it was filtered
  puts "Posts: #{user.posts.inspect}"
end

# Records in ActiveRecord but not SQL
extra_ids = ar_ids - sql_ids
User.where(id: extra_ids).each do |user|
  puts "Extra user #{user.id}: #{user.inspect}"
end

Common Issues & Solutions

Issue: Query Returns Fewer Rows Than Expected

Likely cause: LEFT JOIN converted to INNER JOIN, or WHERE filter applied too broadly.

Solution:

# Check the JOIN type in generated SQL
puts relation.to_sql  # Look for "LEFT OUTER JOIN" vs "INNER JOIN"

# If INNER JOIN appears where you expected LEFT:
# Move conditions from .where() to the ON clause
User.joins("LEFT JOIN posts ON users.id = posts.user_id AND posts.published = true")
    .where("posts.id IS NULL OR posts.published = true")

Issue: N+1 Queries in Production

Likely cause: Converted query doesn't include eager loading.

Solution:

# Add .includes() for associations you'll access
User.joins(:posts).where(...).includes(:posts)

# Or use .preload() if you don't need JOIN filtering
User.where(...).preload(:posts)

# Install bullet gem to detect N+1 in development
# gem 'bullet', group: :development

Issue: Performance Degradation

Likely cause: Missing indexes, inefficient JOIN order, or DISTINCT on large result sets.

Solution:

# Check EXPLAIN
puts relation.explain

# Add indexes for foreign keys and WHERE conditions
add_index :posts, :user_id
add_index :posts, [:user_id, :published]  # Composite index

# If .distinct causes issues, check if you need it
User.joins(:posts).count              # May return duplicates
User.joins(:posts).distinct.count     # Deduplicated but slower
User.joins(:posts).select(:id).distinct.count  # Faster

Best Practices

✓ Do

  • • Always run .to_sql and compare with original SQL
  • • Test with production-like data volumes and NULL values
  • • Write regression tests for complex conversions
  • • Use .explain to verify index usage
  • • Document why you chose ActiveRecord vs raw SQL
  • • Keep complex SQL as raw SQL or database views
  • • Monitor query performance in production

✗ Don't

  • • Deploy converted queries without verification
  • • Assume LEFT JOIN semantics are preserved
  • • Trust .count alone — check the actual records
  • • Convert database-specific features to ActiveRecord
  • • Use .includes when you mean .joins
  • • Skip testing edge cases (NULLs, empty sets)
  • • Ignore performance implications of DISTINCT

Ready to Convert Safely?

Use the converter with confidence by following these verification and testing strategies.

Production-safe conversions with verification built in

Related Resources

Documentation Unclear?

Found an error in the documentation? Need clarification on a specific feature?