Skip to main content

Fixing when queue workers keep popping the same job off a queue

From Amazon Queue documentation
My (Laravel) project uses a queue system for importing because these jobs can take a fair amount of time (up to an hour) and I want to have them run asynchronously to prevent my users from having to sit and watch a spinning ball.

I created a cron job which would run my Laravel queue work command every 5 minutes.  PHP is not really the best language for long-running processes which is why I elected to rather run a task periodically instead of listening all the time.  This introduced some latency (which I could cut down) but this is acceptable in my use case (imports happen once a month and are only run manually if an automated import fails).

The problem that I faced was that my queue listener kept popping the same job off the queue.  I didn't try running multiple listeners but I'm pretty confident weirdness would have resulted in that case as well.

Fixing the problem turned out to be a matter of configuring the visibility time of my queue.  I was using the SQS provided default of 30 seconds.  Amazon defines the visibility timeout as:
The period of time that a message is invisible to the rest of your application after an application component gets it from the queue. During the visibility timeout, the component that received the message usually processes it, and then deletes it from the queue. This prevents multiple components from processing the same message.  Source: Amazon documentation
This concept is common to queues and exists in various names in Beanstalk and others.  In Beanstalk the setting is called time-to-run and IronMQ refers to it as timeout.  So if your run time exceeds your queue availability timeout then workers will pop off a job that is currently being run in another process.

Comments

Popular posts from this blog

Solving Doctrine - A new entity was found through the relationship

There are so many different problems that people have with the Doctrine error message: exception 'Doctrine\ORM\ORMInvalidArgumentException' with message 'A new entity was found through the relationship 'App\Lib\Domain\Datalayer\UnicodeLookups#lookupStatus' that was not configured to cascade persist operations for entity: Searching through the various online sources was a bit of a nightmare.  The best documentation I found was at  http://www.krueckeberg.org/  where there were a number of clearly explained examples of various associations. More useful information about association ownership was in the Doctrine manual , but I found a more succinct explanation in the answer to this question on StackOverflow . Now I understood better about associations and ownership and was able to identify exactly what sort I was using and the syntax that was required. I was implementing a uni-directional many to one relationship, which is supposedly one of the most simpl...

Grokking PHP monolog context into Elastic

An indexed and searchable centralized log is one of those tools that once you've had it you'll wonder how you managed without it.    I've experienced a couple of advantages to using a central log - debugging, monitoring performance, and catching unknown problems. Debugging Debugging becomes easier because instead of poking around grepping text logs on servers you're able to use a GUI to contrast and compare values between different time ranges. A ticket will often include sparse information about the problem and observed error, but if you know more or less when a problem occurred then you can check the logs of all your systems at that time. Problem behaviour in your application can occur as a result of the services you depend on.  A database fault will produce errors in your application, for example. If you log your database errors and your application errors in the same central platform then it's much more convenient to compare behaviour between...

Translating a bit of the idea behind domain driven design into code architecture

I've often participated in arguments discussions about whether thin models or thin controllers should be preferred.  The wisdom of a thin controller is that if you need to test your controller in isolation then you need to stub the dependencies of your request and response. It also violates the single responsibility principal because the controller could have multiple reasons to change.   Seemingly, the alternative is to settle on having fat models. This results in having domain logic right next to your persistence logic. If you ever want to change your persistence layer you're going to be in for a painful time. That's a bit of a cargo cult argument because honestly who does that, but it's also a violation of the single responsibility principal.   One way to decouple your domain logic from both persistence and controller is to use the "repository pattern".   Here we encapsulate domain logic into a data service. This layer deals exclusively with imple...