Rate limiting, global uniqueness, timeouts, memory limits, mix asyncio/processes/threads/sub-interpreters in the same worker, workflows, cron jobs, dashboard, metrics, django integrations, reprioritization, triggers, pruning, Windows support, queue tagging (ex: run this queue on all machines running Windows with a GPU, run this one on workers with py3.14 and this one on workers with py3.11), etc etc...
https://tkte.ch/chancy/ & https://github.com/tktech/chancy
The pending v0.26 includes stabilization of the HTTP API, dashboard improvements, workflow performance improvements for workflows with thousands of steps, and django-tasks integration.
We moved all our celery tasks to procrastinate at work for all our django backends almost two years ago, and it has been great.
Having tasks deferred in the same transaction as the business logic is something that helped us a lot to improve consistency and debuggability. Moreover, it's so nice to be able to inspect what's going on by just querying our database or looking at the django admin.
For those wondering, procrastinate has no built-in alternative to django-celery-beat, but you can easily build your own in a day: no need for an extra dependency for this :)
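To make the "build your own in a day" claim concrete, here is a minimal sketch of a beat-style scheduler loop. It assumes you store schedules yourself (here as plain mutable entries; in a Django project you'd likely use a model), and the `defer` callback stands in for actually deferring a procrastinate task — the names here are illustrative, not procrastinate's API.

```python
from datetime import datetime, timedelta, timezone

def run_beat(schedules, defer, now=None):
    """One scheduler tick. schedules: mutable [task_name, interval, next_run] entries."""
    now = now or datetime.now(timezone.utc)
    fired = []
    for entry in schedules:
        task_name, interval, next_run = entry
        if next_run <= now:
            defer(task_name)           # stand-in for the real task deferral
            entry[2] = now + interval  # schedule the next occurrence
            fired.append(task_name)
    return fired
```

Run this in a management command loop (or a small daemon) and each tick defers whatever is due; because the schedule lives in your own storage, it's trivially editable at runtime, which is most of what django-celery-beat buys you.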
Anyone here done the migration off of celery to another thing? Any wisdom?
In the first one we use Celery to run jobs that may last from a few seconds to several minutes. In the other we create a new VM, have it run the job, and have it self-destruct on job termination. The communication is over a shared database and SQS queues.
We have periodic problems with Celery: workers losing their connection to RabbitMQ, Celery itself getting stuck, and gevent issues maybe caused by C libraries, but we can't be sure (we use prefork for some workers but not for everything).
We had no problems with the EC2 VMs. By the way, we use VirtualBox to simulate EC2 locally: a Python class encapsulates the API to start the VMs, doing it with boto3 in production and with VBoxManage in development.
What I don't understand is: it's always Linux, amd64, and RabbitMQ, but my other customer using Rails and Sidekiq has no problems, and they run many more jobs. There is something in the concurrency stack inside Celery that is too fragile.
RabbitMQ and friends are just a pain to use.
This is because the code enqueuing the task needs to be aware of what happens next, which breaks separation of concerns. Why should the user sign-up code have to know that a report generation job now needs queuing?
Really, what starts to make more sense to me is to fire off events. Code can say "this thing just happened" and let other code decide if it wants to listen. Which then makes it an event stream rather than a queue, with consumer groups et al.
I made the (now unmaintained) project https://lightbus.org around this, and it did work really well for our use case. Hopefully someone has now created something better.
So I'd say this: before reaching for a task queue, take a moment to think about what you're actually modelling. But be careful of the event streaming rabbit-hole!
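The separation-of-concerns point above can be sketched with a toy in-process event bus: the sign-up code only announces what happened, and the report code decides on its own to listen. (Lightbus did this over a real transport with consumer groups; every name below is illustrative, not any library's API.)

```python
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event_name):
        """Decorator: register a listener for an event name."""
        def register(fn):
            self._listeners[event_name].append(fn)
            return fn
        return register

    def fire(self, event_name, **payload):
        """Announce that something happened; listeners may or may not exist."""
        for fn in self._listeners[event_name]:
            fn(**payload)

bus = EventBus()

@bus.on("user.signed_up")
def queue_welcome_report(user_id):
    print(f"queuing report for user {user_id}")

# The sign-up code knows nothing about reports; it just fires the event:
bus.fire("user.signed_up", user_id=42)  # prints "queuing report for user 42"
```

The key property is that adding a second consumer (say, analytics) requires no change to the sign-up code, which is exactly what the direct-enqueue style forces.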
For folks who’ve used Celery/Procrastinate/Chancy: how does retry/ACK behavior feel in real projects? Any rough edges?
What about observability — dashboards, tracing, metrics — good enough out of the box, or did you bolt on extra stuff?
Also, any gotchas with type hints or decorator-style tasks when refactoring? I’ve seen those bite before.
And lastly, does swapping backends for tests actually feel seamless, or is that more of a “works in the demo” thing?
One of the major complaints with Celery is observability. Database-backed options like Procrastinate and Chancy will never reach the peak throughput of Celery+RabbitMQ, but they're still sufficient to run millions upon millions of tasks per day even on a $14/month VPS. The tradeoff is excellent insight into what's going on: all state lives in the database, so you can just query it. Both Procrastinate and Chancy come with Django integrations, so you can even query it with the ORM.
For Chancy in particular, retries are a (very trivial) plugin (that's enabled by default) - https://github.com/TkTech/chancy/blob/main/chancy/plugins/re.... You can swap it out and add whatever complex retry strategies you'd like.
Chancy also comes with a "good enough" metrics plugin and a dashboard. It's not suitable for an incredibly busy instance with tens of thousands of distinct types of jobs, but it's good enough for most projects. You can see the new UI and some example screenshots in the upcoming 0.26 release - https://github.com/TkTech/chancy/pull/58 (and that dashboard is for a production app running ~600k jobs a day on what's practically a toaster). The dashboard can be run standalone locally and pointed at any database as needed, run inside a worker process, or embedded inside any existing ASGI app.
That and the rq backend sound promising to me.
You probably want something like pydantic’s @validate_call decorator.
Can you say more, maybe with an example, about a function which can't be typed? Are you talking about generating bytecode at runtime, defining functions with lambda expressions, or something else?
Does the API support progress reporting? ("30% done")
Of course one could build this manually when building the worker implementation, but I'd love to have it reflected in the API somewhere. Celery also seems to be missing an API for that.
Does anyone see a reason why that's missing? I don't think it complicates the API much, and it seems such an obvious thing for longer-running background tasks.
Though, I imagine you could have strategies to give an approximation of it, for example like keeping track of the past execution time of a given type of task in order to infer the progress of a currently running task of the same type.
No. You just need to know the total number of steps and what step you are currently on.
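A minimal sketch of that step-based approach: the worker knows its total step count and reports the current step to some shared store (here a plain dict; in a database-backed queue this would naturally be a column on the job row). All names are illustrative, not any queue's API.

```python
progress = {}  # job_id -> (current_step, total_steps); stand-in for shared state

def report(job_id, step, total):
    progress[job_id] = (step, total)

def percent_done(job_id):
    step, total = progress[job_id]
    return round(100 * step / total)

def import_records(job_id, records):
    """A worker task that reports progress after each unit of work."""
    total = len(records)
    for i, record in enumerate(records, start=1):
        # ... do the actual work on `record` here ...
        report(job_id, i, total)

import_records("job-1", ["a", "b", "c", "d", "e"])
```

Anything that can read the store (a dashboard, an API endpoint) can then render "30% done" without the queue itself needing to understand the task's semantics.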
Does anyone know a task queue library that implements it? I would be curious to look at it!
I first learned about it from Miguel Grinberg's Flask tutorial: https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial... But the same concept applies to Django.
Really want to give this a try though.
I've been handling this, so far, with separate standalone scripts that hook into Django's models and ORM. You have to use certain incantations in an explicit order at the top of the module to make this happen.
Django has management commands for that [1].
When you use Django over time, you experience this pleasant surprise over and over when you need something: "oh, Django already has that."
[1] https://docs.djangoproject.com/en/5.2/howto/custom-managemen...
The main thing when it comes to models is that you pass the IDs, not the model instances themselves, when passing arguments to celery jobs.
This is because the celery worker could be running somewhere else entirely.
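A small sketch of why IDs and not instances: the task's arguments must be serialized and may cross machines, so the worker re-fetches fresh state by ID. The stand-in "database" dict and function names below are illustrative, not Celery's API.

```python
import json

# Stand-in for a users table; in Django this would be User.objects.
DB = {1: {"email": "a@example.com", "active": True}}

def send_welcome_email(user_id):
    # In a real Celery task this would be @app.task and a fresh ORM fetch,
    # guaranteeing the worker sees current state, not a stale snapshot.
    user = DB[user_id]
    return f"sent to {user['email']}"

# The enqueued message only needs to carry the ID, which serializes cleanly:
message = json.dumps({"task": "send_welcome_email", "args": [1]})

# A model instance wouldn't survive the trip: json can't serialize it, and
# even if pickled, it could be stale or invalid by the time the worker runs.
```

On the worker side, deserializing the message and calling the task with just the ID is all that's needed.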
I haven't looked into this in any detail but I wonder if the API or defaults will shave off some of the rough edges in Celery, like early ACKs and retries.
https://procrastinate.readthedocs.io/en/stable/index.html
Core is not Django-specific, but it has an optional integration. Sync and async, retries/cancellation/etc., very extensible, and IMO super clean architecture and well tested.
IIRC the codebase is about one-tenth that of Celery.