5 Laravel Queue Failures That Only Show Up in Production
Your queue works perfectly in local. Every job dispatches, processes, and completes without a hitch. Then you deploy to production with real traffic, real concurrency, and real third-party APIs, and things start breaking in ways your test suite never predicted.
I've been running Laravel queues in production for years across multiple applications. Every failure on this list caught me off guard at least once. Not because the documentation doesn't cover them, but because you don't think about them until they bite you at 2 AM.
1. Your Job Runs Before the Data Exists
This one is subtle and maddening. You create a record, dispatch a job to process it, and the job fails with "model not found." The record is right there in the database when you check manually. So what happened?
You dispatched inside a database transaction.
DB::transaction(function () use ($user, $cart) {
    $order = Order::create([
        'user_id' => $user->id,
        'total' => $cart->total(),
    ]);

    ProcessOrder::dispatch($order);
});
The job gets pushed to Redis immediately, but the transaction hasn't committed yet. If the queue worker picks it up before the commit, the Order row doesn't exist. The job fails, retries a few times, and maybe succeeds on the third attempt when the transaction has finally landed. Or it exhausts retries and dies.
The fix is one method call:
ProcessOrder::dispatch($order)->afterCommit();
Or set it globally in config/queue.php:
'connections' => [
    'redis' => [
        'driver' => 'redis',
        'after_commit' => true,
        // ...
    ],
],
With after_commit enabled, Laravel holds the dispatch until the transaction commits. If the transaction rolls back, the job never enters the queue. I set this globally on every app now. I've never once wanted a job to fire before its transaction commits.
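If it helps to picture what "holding the dispatch" means, here is a minimal sketch in plain PHP. CommitAwareDispatcher and its methods are hypothetical names I made up for illustration. This is not Laravel's implementation, only the buffering idea: jobs dispatched inside a transaction wait in a pending list, get released on commit, and get thrown away on rollback.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a transaction-aware dispatcher.
// NOT Laravel's implementation, only the buffering idea behind after_commit.
final class CommitAwareDispatcher
{
    /** @var list<string> Jobs that workers can already see. */
    public array $queue = [];

    /** @var list<string> Jobs held back until the open transaction commits. */
    private array $pending = [];

    private bool $inTransaction = false;

    public function begin(): void
    {
        $this->inTransaction = true;
    }

    public function dispatch(string $job): void
    {
        if ($this->inTransaction) {
            // Hold the job: the row it needs may not be visible to workers yet.
            $this->pending[] = $job;

            return;
        }

        $this->queue[] = $job;
    }

    public function commit(): void
    {
        // The data is committed, so held jobs can be released to workers.
        $this->queue = array_merge($this->queue, $this->pending);
        $this->pending = [];
        $this->inTransaction = false;
    }

    public function rollBack(): void
    {
        // The data never existed, so jobs that referenced it are discarded.
        $this->pending = [];
        $this->inTransaction = false;
    }
}
```

Dispatching between begin() and commit() leaves the queue empty until commit() runs, and a rollBack() drops the held job entirely. That is the behavior you get from after_commit without writing any of this yourself.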
2. Workers Silently Eating All Your Memory
Queue workers are long-running PHP processes. Unlike HTTP requests that die after serving a response, workers persist for hours or days. Every job they process leaves a tiny memory footprint behind. Eloquent model caches, event listeners, service container bindings, log contexts. None of it gets fully cleaned up.
After a few thousand jobs, your worker is sitting at 500MB. Your server starts swapping. Other processes slow to a crawl. And because nothing "failed," there's no alert.
The fix is to let workers die on purpose:
php artisan queue:work redis --max-jobs=1000 --max-time=3600 --memory=128
--max-jobs=1000 means the worker exits after processing 1,000 jobs. --max-time=3600 tells it to exit once it has been running for an hour. --memory=128 exits if memory crosses 128MB. All three are checked between jobs, so a job in progress is never cut short by them. When the worker exits, Supervisor restarts it fresh. Clean slate.
Here's the Supervisor config that makes this work:
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/artisan queue:work redis --sleep=3 --tries=3 --max-jobs=1000 --max-time=3600 --memory=128
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
numprocs=4
stopwaitsecs=300
stdout_logfile=/var/www/app/storage/logs/worker.log
Two settings here matter more than people realize: stopwaitsecs=300 gives a running job up to 5 minutes to finish before Supervisor kills it. And stopasgroup=true sends the stop signal to the worker's whole process group, so child processes die with the parent instead of lingering as orphans that consume resources but don't process anything.
If you're using Horizon, it manages restarts and memory limits for you. But with raw Supervisor, these settings are the difference between a stable queue and a server that degrades over time.
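To make the "die on purpose" idea concrete, here is a rough sketch of the kind of check a long-running worker can make after each job. The function names, parameters, and return values are my own for illustration and are not Laravel internals. The point it shows is that exit limits are evaluated between jobs, so a running job is allowed to finish first.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a worker should exit after finishing a job.
 * Simplified sketch; names and defaults are hypothetical.
 *
 * @return string|null Reason to stop, or null to keep working.
 */
function stopReason(
    int $jobsProcessed,
    int $secondsRunning,
    int $memoryBytes,
    int $maxJobs = 1000,
    int $maxTime = 3600,
    int $memoryLimitMb = 128
): ?string {
    if ($memoryBytes / 1024 / 1024 >= $memoryLimitMb) {
        return 'memory';
    }
    if ($jobsProcessed >= $maxJobs) {
        return 'max-jobs';
    }
    if ($secondsRunning >= $maxTime) {
        return 'max-time';
    }

    return null;
}

/**
 * Run jobs until a limit is hit. Limits are only checked between jobs,
 * so a job that is already running is never interrupted by them.
 *
 * @param iterable<callable> $jobs
 */
function runWorker(iterable $jobs, int $maxJobs, int $maxTime, int $memoryLimitMb): string
{
    $startedAt = time();
    $processed = 0;

    foreach ($jobs as $job) {
        $job();
        $processed++;

        $reason = stopReason(
            $processed,
            time() - $startedAt,
            memory_get_usage(true),
            $maxJobs,
            $maxTime,
            $memoryLimitMb
        );

        if ($reason !== null) {
            // The process manager is expected to start a fresh worker.
            return $reason;
        }
    }

    return 'queue-empty';
}
```

The practical consequence is that a single very long job can overshoot --max-time or --memory by a wide margin, so these flags complement job timeouts rather than replace them.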
3. Deployments Killing Jobs Mid-Execution
You deploy new code. Supervisor restarts your workers. But one of those workers was halfway through a job: it charged the customer's credit card, but it hasn't recorded the payment in your database yet.
The worker dies. The job returns to the queue. It runs again with fresh code. The customer gets charged twice.
Two defenses prevent this.
First, give workers time to finish. That stopwaitsecs=300 from the previous section is critical. When Supervisor sends SIGTERM, Laravel's worker finishes its current job before exiting. But only if Supervisor waits long enough. Supervisor's default stopwaitsecs is 10 seconds; once that runs out it sends SIGKILL, and the job gets interrupted at whatever point it reached. Keep stopwaitsecs above the runtime of your longest job.
Second, make every job idempotent. Assume it will run more than once, because eventually it will:
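public function handle(): void
{
    // Fast path: an earlier run already recorded this payment.
    $existing = Payment::where('order_id', $this->order->id)
        ->where('idempotency_key', $this->idempotencyKey)
        ->first();

    if ($existing) {
        return;
    }

    $charge = $this->paymentGateway->charge(
        $this->order->total_in_cents,
        $this->order->payment_method_id,
        // Assumes the gateway client accepts an idempotency key and
        // deduplicates on it. The local check alone cannot cover a crash
        // between the charge and the insert below.
        $this->idempotencyKey,
    );

    Payment::create([
        'order_id' => $this->order->id,
        'charge_id' => $charge->id,
        'idempotency_key' => $this->idempotencyKey,
    ]);
}
The idempotency key gets generated when the job is dispatched, not when it runs, so every retry carries the same key. If the job runs again after the payment was recorded, the lookup finds it and exits cleanly. If the worker died between the charge and the insert, the lookup finds nothing, which is why the key also goes to the gateway: a gateway that supports idempotency keys returns the original charge instead of creating a second one. No double charges.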
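To see why a key fixed at dispatch time helps, here is a minimal sketch in plain PHP. FakeGateway and idempotencyKeyFor() are hypothetical names for illustration; real gateways have their own idempotency mechanisms, so check your provider's documentation. Deriving the key from the order ID is one possible choice that I'm assuming here, and a random key stored on the job at dispatch works just as well.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory gateway that deduplicates charges by idempotency key.
// Real gateways differ; this only demonstrates the contract.
final class FakeGateway
{
    /** @var array<string, array{id: string, amount: int}> */
    private array $chargesByKey = [];

    /** @return array{id: string, amount: int} */
    public function charge(int $amountInCents, string $idempotencyKey): array
    {
        // A repeated key returns the original charge instead of charging again.
        if (isset($this->chargesByKey[$idempotencyKey])) {
            return $this->chargesByKey[$idempotencyKey];
        }

        $charge = [
            'id' => 'ch_' . (count($this->chargesByKey) + 1),
            'amount' => $amountInCents,
        ];
        $this->chargesByKey[$idempotencyKey] = $charge;

        return $charge;
    }

    public function chargeCount(): int
    {
        return count($this->chargesByKey);
    }
}

// Derive the key once from stable business identifiers, so every retry
// of the same job carries the same key.
function idempotencyKeyFor(int $orderId, string $purpose = 'charge'): string
{
    return hash('sha256', "order:{$orderId}:{$purpose}");
}
```

Calling charge() twice with the same key produces one charge, which is exactly the situation after a worker dies between charging and recording: the retry asks again and gets the original charge back.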
4. Unique Job Locks That Expire Too Early
Laravel's ShouldBeUnique interface prevents duplicate jobs. Dispatch a job, and any identical dispatch is dropped until the first one completes.
Except when the lock expires before the job finishes.
class ProcessReport implements ShouldQueue, ShouldBeUnique
{
    public $uniqueFor = 60;

    public function handle(): void
    {
        // Takes 90 seconds on large datasets
        $this->generateReport();
        $this->emailReport();
    }
}
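The lock is taken when the job is dispatched, so the 60 seconds also cover any time the job spends waiting in the queue. If waiting plus report generation takes longer than 60 seconds, the lock expires while the first job is still running. A second ProcessReport gets accepted and starts. Both send the email. The user gets duplicate reports.
Set uniqueFor well above your worst-case queue wait plus execution time. If a job normally takes 30 seconds but occasionally takes 90, set the lock to 300. A lock that lasts too long is mostly harmless, because Laravel releases it as soon as the job completes or fails its final attempt. A lock that expires too early causes duplicate processing.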
Also watch your cache driver. Unique locks use your application's cache. If that's the file or array driver, locks don't work across multiple servers. If it's Redis and Redis restarts without persisted data, every lock vanishes and all pending unique jobs can be dispatched again while the originals are still queued.
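The expiry problem is easy to reproduce with a toy lock. The sketch below is plain PHP with a hypothetical ExpiringLock class and an injected clock. It mimics the shape of a cache lock with a TTL and is not Laravel's lock implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory lock with a TTL and an injectable clock.
// It mimics the shape of a cache lock; it is not Laravel's implementation.
final class ExpiringLock
{
    /** @var array<string, int> Lock key => unix time at which the lock expires. */
    private array $expiresAt = [];

    /** Returns true if the lock was acquired, false if it is still held. */
    public function acquire(string $key, int $ttlSeconds, int $now): bool
    {
        if (isset($this->expiresAt[$key]) && $this->expiresAt[$key] > $now) {
            return false;
        }

        $this->expiresAt[$key] = $now + $ttlSeconds;

        return true;
    }

    /** Called when the job finishes, regardless of how much TTL is left. */
    public function release(string $key): void
    {
        unset($this->expiresAt[$key]);
    }
}
```

With a 60-second TTL acquired at second 0, a second acquire at second 70 succeeds even though the 90-second job is still running, and that is the duplicate. With a 300-second TTL the same acquire fails, and the early release() on completion means the long TTL costs nothing once the job is done.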
5. Retry Storms Against External APIs
A third-party API goes down. Your jobs that call it start failing. Each failure triggers a retry. With 3 attempts per job and 50 failing jobs, you're suddenly sending 150 requests to a service that's already struggling. Add more workers and those requests arrive in an even tighter burst, and you're making things worse for everyone.
The default retry behavior is fast: unless you configure a backoff, a job that throws goes straight back onto the queue with no delay and gets picked up again as soon as a worker is free. Fine for transient blips. Terrible for sustained outages.
Use exponential backoff:
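public function backoff(): array
{
    return [30, 60, 300];
}
First retry waits 30 seconds. Second waits a minute. Third waits 5 minutes. This gives the external service breathing room instead of piling on. Keep in mind that tries counts total attempts: with --tries=3 a job only gets two retries, so the 5-minute delay is never reached. Allow at least 4 tries if you want the whole schedule to be used.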
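If you want to sanity-check a schedule against your tries setting, a small helper does it. This is a sketch in plain PHP with function names of my own. It assumes the delay before attempt N+1 comes from the Nth schedule entry and that the last entry is reused once the schedule runs out, which is my understanding of how an array backoff is applied.

```php
<?php
declare(strict_types=1);

/**
 * List the delay (in seconds) before each retry, given a backoff schedule
 * and the total number of attempts allowed.
 *
 * Assumption: attempt N+1 is delayed by the Nth schedule entry, and the
 * last entry is reused once the schedule runs out.
 *
 * @param non-empty-list<int> $backoff
 * @return list<int>
 */
function retryDelays(array $backoff, int $tries): array
{
    $delays = [];

    for ($failedAttempt = 1; $failedAttempt < $tries; $failedAttempt++) {
        $delays[] = $backoff[$failedAttempt - 1] ?? $backoff[count($backoff) - 1];
    }

    return $delays;
}

/**
 * Build a doubling schedule with a cap, for when you want a strictly
 * exponential curve instead of hand-picked values.
 *
 * @return list<int>
 */
function exponentialSchedule(int $baseSeconds, int $steps, int $capSeconds): array
{
    $schedule = [];

    for ($i = 0; $i < $steps; $i++) {
        $schedule[] = min($capSeconds, $baseSeconds * (2 ** $i));
    }

    return $schedule;
}
```

retryDelays([30, 60, 300], 3) gives [30, 60], which makes the unused third entry obvious, and exponentialSchedule(30, 5, 300) gives [30, 60, 120, 240, 300] if you'd rather return a generated schedule from backoff().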
For critical integrations, add a circuit breaker:
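public function handle(): void
{
    $failures = Cache::get('external-api:failures', 0);

    if ($failures > 10) {
        // Circuit is open: skip the call and try again in 5 minutes.
        $this->release(300);

        return;
    }

    try {
        $response = Http::timeout(10)
            ->throw()
            ->post('https://api.example.com/process', $this->payload);

        Cache::forget('external-api:failures');
    } catch (ConnectionException $e) {
        // Create the counter with a 5-minute TTL if it is missing, so an
        // open circuit eventually closes and lets a probe request through.
        Cache::add('external-api:failures', 0, now()->addMinutes(5));
        Cache::increment('external-api:failures');

        throw $e;
    }
}
When failures accumulate past a threshold, jobs stop attempting the call and release themselves back to the queue with a 5-minute delay. The counter is created with a 5-minute TTL so the circuit can't stay open forever: once it expires, the next job makes a real request, and the first successful request clears the counter. Two caveats. Each release() still counts as an attempt, so give these jobs a generous tries value or a retryUntil() deadline, or they will exhaust their attempts while the circuit is open. And this catch only counts connection failures; the 4xx and 5xx errors raised by throw() are a different exception and pass through uncounted. Simple, effective, and it prevents your queue from participating in someone else's outage.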
The Common Thread
Every failure here shares a root cause: production has concurrency, timing, and persistence characteristics that local development doesn't. One worker, one database, no real traffic, and no external services having a bad day. That's local. Production is the opposite of that.
The fix isn't more testing, though that helps. It's defensive design: assume jobs will run twice, assume workers will die mid-execution, assume external services will fail, and assume your deploy will happen at the worst possible moment. Build for those assumptions and your queues will survive the real world.