Building a Reliable Background Worker System with Celery and Redis
How to offload heavy tasks in Django using Celery and Redis, including architectural diagrams and best practices.
When you're building web applications, you inevitably hit a wall where synchronous request/response cycles aren't enough. Generating a PDF, sending a batch of emails, or processing an uploaded CSV file will block your web workers and cause timeouts.
This is where Celery comes in.
The Architecture
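A standard Django + Celery architecture has three parts: the Django application, which enqueues tasks; Redis, which acts as the message broker; and one or more Celery workers, which pull messages from Redis and execute them.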
Notice that the Django application never actually executes the heavy work. It simply serializes the task arguments, pushes a message to Redis, and immediately returns a response to the user.
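To make that flow concrete, here is a minimal sketch that imitates it with only the standard library. It is not how Celery is implemented: the in-process `queue.Queue` stands in for Redis, and `enqueue`, `handle_request`, `worker_step`, and `generate_report` are hypothetical names invented for this illustration.

```python
import json
import queue
import uuid

# Stand-in for the Redis broker (assumption: a real setup uses Redis, not an in-process queue).
broker = queue.Queue()

# Hypothetical registry mapping task names to callables, as a worker process would hold.
TASKS = {}


def task(func):
    """Register a function under its name so a worker can look it up."""
    TASKS[func.__name__] = func
    return func


@task
def generate_report(user_id):
    # The "heavy" work; it only ever runs on the worker side.
    return f"report for user {user_id}"


def enqueue(task_name, *args):
    """Web side: serialize the arguments, push a message, return immediately."""
    task_id = str(uuid.uuid4())
    broker.put(json.dumps({"id": task_id, "task": task_name, "args": list(args)}))
    return task_id


def handle_request(user_id):
    """Hypothetical view: it does no heavy work, it just hands the job off."""
    task_id = enqueue("generate_report", user_id)
    return {"status": 202, "task_id": task_id}


def worker_step():
    """Worker side: pop one message, look up the task, and execute it."""
    message = json.loads(broker.get_nowait())
    return TASKS[message["task"]](*message["args"])
```

Calling `handle_request(42)` returns right away with a task ID while the message waits in the queue; the report is only produced when `worker_step()` runs. With Celery, the `enqueue` step roughly corresponds to calling `.delay()` on a task.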
Writing Resilient Tasks
The biggest mistake I see developers make with Celery is passing complex objects to tasks.
Bad Practice:
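from celery import shared_task

@shared_task
def process_user_upload(user_obj):
    # What if the user gets updated in the database before this task runs?
    # user_obj is stale, and save() writes the stale fields back!
    # (With Celery's default JSON serializer, a model instance will not even serialize.)
    user_obj.is_processed = True
    user_obj.save()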
Good Practice:
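from celery import shared_task
from django.contrib.auth import get_user_model
User = get_user_model()  # assumption: the project's active user model

@shared_task
def process_user_upload(user_id):
    # Always pass IDs and fetch fresh data from the database
    user = User.objects.get(id=user_id)
    user.is_processed = True
    # Write only the changed column so concurrent edits to other fields survive
    user.save(update_fields=["is_processed"])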
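To see why the ID-based version is safer, here is a small simulation that needs neither Django nor Celery. The `DATABASE` dict is a hypothetical stand-in for the users table, and the JSON round trip mimics a message waiting in the queue while the user changes their email.

```python
import json

# Hypothetical in-memory "users table"; a real project would query the database.
DATABASE = {}


def reset():
    DATABASE.clear()
    DATABASE[1] = {"id": 1, "email": "old@example.com", "is_processed": False}


def task_with_snapshot(user_snapshot):
    """Bad: mutates the copy captured at enqueue time and writes all of it back."""
    user_snapshot["is_processed"] = True
    DATABASE[user_snapshot["id"]] = user_snapshot


def task_with_id(user_id):
    """Good: reads the current row at execution time and changes one field."""
    user = DATABASE[user_id]
    user["is_processed"] = True


def simulate(pass_snapshot):
    """Enqueue, let the user change their email, then run the task."""
    reset()
    # Serializing mimics the message sitting in the queue.
    payload = json.dumps(DATABASE[1] if pass_snapshot else 1)
    DATABASE[1]["email"] = "new@example.com"  # update lands before the worker runs
    argument = json.loads(payload)
    if pass_snapshot:
        task_with_snapshot(argument)
    else:
        task_with_id(argument)
    return DATABASE[1]
```

`simulate(pass_snapshot=True)` ends with the old email restored, because the task wrote back the copy it was handed. `simulate(pass_snapshot=False)` keeps the new email and only flips `is_processed`.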
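Warning: Redis keeps its data in memory, and every task argument is serialized into a message that sits in Redis until a worker picks it up. If you pass a massive dictionary or raw file contents to a Celery task, you risk running your Redis instance out of memory. Always save files to disk or S3 (somewhere the workers can also read) and pass the URL or filepath instead.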
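A minimal sketch of that hand-off, using only the standard library. The function names are hypothetical, and the local temp directory is an assumption that only holds when the web process and the workers share a filesystem; with S3 you would upload the file and pass the object URL or key instead.

```python
import json
import os
import tempfile
import uuid


def stage_upload(contents, directory=None):
    """Write the upload to storage and return its path."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, f"{uuid.uuid4().hex}.csv")
    with open(path, "wb") as handle:
        handle.write(contents)
    return path


def build_message(path):
    """Serialize only the reference; this is all that would travel through the broker."""
    return json.dumps({"task": "process_csv", "args": [path]})


def process_csv(path):
    """Hypothetical task body: open the file on the worker side and count its lines."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)
```

However large the upload is, the message stays a short JSON string containing a path, so the memory Redis needs per queued task no longer depends on the file size.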
Monitoring
Once you have workers running, you must monitor them. I highly recommend using Flower. It gives you a dashboard to see active tasks, failure rates, and execution times. Combine that with a retry mechanism (declare the task with bind=True and max_retries=3 so it can call self.retry() when a transient error occurs), and you have a far more dependable background processing system.
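To show what a retry budget of three means in practice, here is a plain-Python sketch of the semantics: one initial attempt plus up to three retries, after which the error surfaces. It is not Celery's implementation. `TransientError` and `run_with_retries` are hypothetical names, and the exponential backoff between attempts is an assumption added for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (e.g. a timeout)."""


def run_with_retries(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func; on TransientError, retry up to max_retries times with backoff."""
    retries = 0
    while True:
        try:
            return func()
        except TransientError:
            if retries >= max_retries:
                raise  # retry budget exhausted: surface the failure
            # Wait longer after each failure: 1s, 2s, 4s with the defaults.
            sleep(base_delay * 2 ** retries)
            retries += 1
```

In a real Celery task, `bind=True` gives the function access to `self`, and calling `self.retry(exc=exc)` inside an `except` block asks Celery to reschedule the task until `max_retries` is exceeded. Only retry errors that are plausibly temporary; a bug in the task will fail the same way on every attempt.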