Rails find_each vs each—Why Background Jobs Need Batch Processing

How find_each prevents memory explosions when processing 100K+ records, with real benchmarks and production gotchas

Raza Hussain

I once watched a background job process 100K user records and crash the server. Hard. Memory spiked to 8GB, the dyno ran out of RAM, and Heroku killed the process. The fix? One method swap: each to find_each. Memory dropped to 200MB and stayed flat.

If you’re iterating over large datasets in Rails, you need to understand the difference between each and find_each. One loads everything into memory at once. The other processes records in batches—keeping your memory footprint small no matter how many records you have.

Why each Will Kill Your Background Jobs

Here’s what happens when you use each on a large dataset:

# Load ALL users into memory at once
User.where(newsletter_subscribed: true).each do |user|
  NewsletterMailer.weekly_digest(user).deliver_later
end

SQL version:

SELECT * FROM users WHERE newsletter_subscribed = true
-- Returns ALL rows in one giant result set

Rails loads every single record into an array first. For 100K users with 20 columns each, that’s roughly 400MB of objects sitting in memory before your loop even starts.

Production impact: On our SaaS app with 120K users, this pattern consumed 6.2GB of memory per job run. The Sidekiq worker would OOM (out of memory) halfway through, fail the job, and retry—creating an infinite crash loop until we manually killed it.

How find_each Saves Your Memory

The find_each method processes records in batches, paging by primary key with WHERE id > ? ... ORDER BY id LIMIT n queries under the hood (keyset pagination—not OFFSET, so later batches don't get progressively slower):

# Process users in batches of 1000 (default batch size)
User.where(newsletter_subscribed: true).find_each do |user|
  NewsletterMailer.weekly_digest(user).deliver_later
end

ActiveRecord version (under the hood):

# Rails generates queries like this:
# SELECT * FROM users WHERE newsletter_subscribed = true ORDER BY id LIMIT 1000
# SELECT * FROM users WHERE newsletter_subscribed = true AND id > 1000 ORDER BY id LIMIT 1000
# SELECT * FROM users WHERE newsletter_subscribed = true AND id > 2000 ORDER BY id LIMIT 1000
# ... continues until all records processed

Rails fetches 1000 records, processes them, releases them from memory, then fetches the next 1000. Your memory usage stays constant regardless of total record count.
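The keyset loop itself is easy to sketch outside Rails. Here's a plain-Ruby simulation—the in-memory ROWS array and the fetch_batch helper are stand-ins for the table and the SQL query, not ActiveRecord APIs:

```ruby
# Simulate find_each-style keyset batching over an in-memory "table".
# In real Rails, fetch_batch is `WHERE id > last_id ORDER BY id LIMIT n`.
ROWS = (1..10_500).map { |id| { id: id, email: "user#{id}@example.com" } }

def fetch_batch(last_id, batch_size)
  ROWS.select { |row| row[:id] > last_id }.first(batch_size)
end

def find_each_sim(batch_size: 1000)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }  # only one batch is "live" at a time
    last_id = batch.last[:id]       # keyset cursor: remember the last id seen
  end
end

count = 0
find_each_sim { |row| count += 1 }
puts count  # => 10500, processed in 11 fetches (10 full batches + 1 partial)
```

Note the cursor is the last id actually seen, not last_id + batch_size—that's what keeps batching correct even when ids have gaps.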

Production impact: After switching to find_each, the same 120K user job ran with 180-220MB memory (90% reduction). Job completion time went from 8 minutes (before crashing) to 6 minutes (successful completion). Zero OOM errors in 4 months since the fix.

When to Use find_each vs each

Use find_each when:

  • Processing 1000+ records in background jobs
  • Iterating over entire tables or large scopes
  • Memory constraints matter (production jobs, large datasets)
  • You don’t need custom ordering (find_each forces ORDER BY id)

Use each when:

  • Working with small result sets (under 100 records)
  • You’ve already eager loaded associations with includes
  • You need custom ordering that conflicts with id ordering
  • You’re in a console debugging with .limit(10)

Trade-offs:

  • find_each pros: Constant memory usage, processes millions of records safely, prevents OOM crashes
  • find_each cons: Forces ORDER BY id (can’t use custom sort), slightly slower on small datasets due to query overhead, won’t work with custom primary keys without configuration

I use find_each by default in any job that touches user-generated content. The memory safety is worth the ordering constraint.

Customizing Batch Size for Performance

The default batch size is 1000, but you can tune it:

# Smaller batches: better for memory-constrained environments
User.find_each(batch_size: 500) do |user|
  ExpensiveService.new(user).process
end

# Larger batches: fewer queries, better for fast operations
User.find_each(batch_size: 5000) do |user|
  user.update_column(:last_processed_at, Time.current)
end

When to adjust batch size:

Use smaller batches (100-500) when:

  • Each iteration does heavy work (API calls, complex calculations)
  • Memory per record is high (large JSON columns, text fields)
  • You want more frequent progress updates in logs

Use larger batches (5000-10000) when:

  • Each iteration is lightweight (simple updates, logging)
  • Database query overhead matters (network latency to DB)
  • You’ve profiled and confirmed memory stays low

Production lesson: I once set batch_size to 10K for an API integration job. Each user record triggered an HTTP request to Stripe. Memory was fine, but we hit Stripe’s rate limit (100 requests/second) and got blocked. Dropped batch size to 500, added sleep(0.01) between records, problem solved. Lesson: tune batch size to your slowest dependency.
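That rate-limit fix can be factored into a tiny throttle instead of a bare sleep. A sketch of the idea in plain Ruby (Throttle is a hypothetical helper, not a Rails or Stripe API):

```ruby
# Caps calls at roughly `rate_per_sec` by sleeping off any excess speed.
class Throttle
  def initialize(rate_per_sec)
    @interval  = 1.0 / rate_per_sec
    @last_call = nil
  end

  def wait
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    if @last_call
      remaining = @interval - (now - @last_call)
      sleep(remaining) if remaining > 0
    end
    @last_call = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

throttle = Throttle.new(100)  # stay under a 100 req/s limit

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
20.times { throttle.wait }    # each wait would precede one API request
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
# 20 calls at 100/sec take at least ~0.19s
```

In the job itself you'd call throttle.wait once per record inside find_each, which decouples the rate limit from whatever batch size you picked.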

The Mistake I Made with find_each and includes

I tried to combine eager loading with batching and Rails silently broke my optimization:

# I thought this would batch AND eager load
User.includes(:subscriptions).find_each do |user|
  puts user.subscriptions.map(&:plan_name)
end

What broke: The eager load never took effect. Rails issued N+1 queries inside the batch loop—one query per user to fetch subscriptions. For 10K users, that's 10K subscription queries on top of the batched user queries.

Why it broke: find_each rebuilds the relation for batching—it reorders by primary key and applies its own LIMIT—and in our case the includes didn't survive that rewrite. Each batch loaded bare User rows, so every subscriptions access went back to the database.

The fix: Preload the association for each batch manually:

# Batch users, then eager load each batch (Preloader keyword API, Rails 7+)
User.find_in_batches(batch_size: 1000) do |batch|
  ActiveRecord::Associations::Preloader.new(
    records: batch,
    associations: :subscriptions
  ).call

  batch.each { |user| puts user.subscriptions.map(&:plan_name) }
end

Or process IDs first, then fetch batches with includes:

# Better: collect IDs in batches, then load full records with includes
def process_batch(user_ids)
  # includes works here because the query isn't going through find_each
  users = User.includes(:subscriptions).where(id: user_ids)
  users.each { |user| puts user.subscriptions.map(&:plan_name) }
end

batch_ids = []

User.where(active: true).find_each do |user|
  batch_ids << user.id

  if batch_ids.size >= 1000
    process_batch(batch_ids)
    batch_ids.clear
  end
end

process_batch(batch_ids) if batch_ids.any?  # flush the final partial batch
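Incidentally, that accumulate-and-flush pattern is exactly what Enumerable#each_slice does. Here's the batching logic in plain Ruby, with an in-memory ID list standing in for the query results:

```ruby
# each_slice groups a stream into fixed-size chunks, final partial chunk included
ids = (1..2_500).to_a

batch_sizes = []
ids.each_slice(1000) do |batch_ids|
  batch_sizes << batch_ids.size  # stand-in for process_batch(batch_ids)
end

p batch_sizes  # => [1000, 1000, 500]
```

In Rails, find_each called without a block returns an Enumerator, so User.where(active: true).find_each.each_slice(1000) yields the same fixed-size batches of already-loaded records.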

Lesson: I lost 2 days debugging why a job that “should” take 5 minutes was taking 45 minutes. The Bullet gem finally caught it—showing 10K N+1 queries I thought I’d prevented. Now I always check the Rails logs for query counts when using find_each with associations.

Using find_in_batches for Custom Batch Logic

If you need to process entire batches at once (not individual records), use find_in_batches:

# Process entire batches as arrays
User.find_in_batches(batch_size: 1000) do |batch|
  # batch is an array of 1000 User objects
  user_ids = batch.map(&:id)

  # Bulk insert or update operations work better here
  Notification.insert_all(
    batch.map { |user| { user_id: user.id, message: "Welcome!", created_at: Time.current } }
  )
end

Why this matters: find_in_batches gives you the full batch array, so you can use bulk operations like insert_all, update_all, or send data to external APIs in batches.

Production use case: We sync 200K users to our analytics warehouse nightly. Using find_in_batches(batch_size: 5000), we build CSV files of 5K users each and upload to S3. This approach takes 8 minutes vs 35 minutes when we were processing one user at a time with find_each.
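A stripped-down version of that export loop, using stdlib CSV and each_slice in place of find_in_batches (the field names and 12K-row dataset are made up, and the S3 upload is omitted):

```ruby
require "csv"

# Fake user rows standing in for one nightly find_in_batches pass
users = (1..12_000).map { |id| { id: id, email: "user#{id}@example.com" } }

csv_files = users.each_slice(5000).map do |batch|
  CSV.generate do |csv|
    csv << %w[id email]                            # header row
    batch.each { |u| csv << [u[:id], u[:email]] }
  end
end

puts csv_files.size  # => 3 files (5000 + 5000 + 2000 rows)
```

Each string in csv_files is one upload-ready file; in the real job each file goes to S3 before the next batch is built, so only one batch's worth of rows is in memory at a time.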

Real-World Batch Processing Benchmarks

Here’s what I measured on a 250K user table:

Processing task: Update a timestamp column on all users

# Approach 1: each (load everything)
users = User.all.to_a  # Loads all 250K records
Benchmark.measure { users.each { |u| u.update_column(:processed_at, Time.current) } }
# Memory: 4.2GB peak
# Time: 12m 30s (OOM killed before completion)

# Approach 2: find_each (default batch size 1000)
Benchmark.measure { User.find_each { |u| u.update_column(:processed_at, Time.current) } }
# Memory: 185MB constant
# Time: 8m 15s

# Approach 3: find_in_batches with bulk update
Benchmark.measure do
  User.find_in_batches(batch_size: 5000) do |batch|
    User.where(id: batch.map(&:id)).update_all(processed_at: Time.current)
  end
end
# Memory: 220MB constant
# Time: 2m 40s (fastest)

Key insight: For updates, find_in_batches + update_all beats find_each by 3x because you’re doing bulk SQL operations instead of individual record updates. For jobs that touch external APIs or need per-record logic, find_each is your best bet.
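The gap is mostly round trips. A back-of-the-envelope count in plain Ruby (the row and batch numbers mirror the benchmark above; the two-queries-per-batch figure assumes one SELECT to load the batch plus one bulk UPDATE):

```ruby
total_rows = 250_000
batch_size = 5_000

# find_each + update_column: one UPDATE statement per record
per_record_updates = total_rows

# find_in_batches + update_all: one SELECT and one UPDATE per batch
batch_queries = total_rows.fdiv(batch_size).ceil * 2

puts per_record_updates  # => 250000 statements
puts batch_queries       # => 100 statements
```

2,500x fewer statements doesn't translate into a 2,500x speedup—each bulk UPDATE still touches 5K rows—but it's why the wall-clock difference is a 3x win rather than a rounding error.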

Final Thoughts

Use find_each for any Rails job processing more than 1000 records. It prevents memory explosions and makes your jobs reliable at scale. Avoid combining find_each with includes—it silently breaks eager loading. For bulk operations, reach for find_in_batches with update_all or insert_all to get 3-5x faster processing.

Start with the default batch size (1000), profile with real data, then tune based on memory usage and query counts. The Bullet gem catches N+1s that sneak in. If you're new to ActiveRecord, start with "Added includes() Everywhere to Fix N+1. Made Everything Slower."


How We Verify Conversions

Every conversion shown on this site follows a strict verification process to ensure correctness:

  • Compare results on same dataset — We run both SQL and ActiveRecord against identical test data and verify results match
  • Check generated SQL with to_sql — We inspect the actual SQL Rails generates to catch semantic differences (INNER vs LEFT JOIN, WHERE vs ON, etc.)
  • Add regression tests for tricky cases — Edge cases like NOT EXISTS, anti-joins, and predicate placement are tested with multiple scenarios
  • Tested on Rails 8.1.1 — All conversions verified on current Rails version to ensure compatibility

Last updated: March 19, 2026



Raza Hussain

Full-stack developer specializing in Ruby on Rails, React, and modern JavaScript. 15+ years upgrading and maintaining production Rails apps. Led Rails 4/5 → 7 upgrades with 40% performance gains, migrated apps from Heroku to Render cutting costs by 35%, and built systems for StatusGator, CryptoZombies, and others. Available for Rails upgrades, performance work, and cloud migrations.
