
AffinityRouter

Intelligent task routing with data locality for distributed systems, maximizing cache hits via consistent hashing.

Python Distributed Systems Consistent Hashing Redis Cache Optimization Middleware

🧪 Experimental PoC. Published on PyPI as an early-stage prototype. It is functional but actively evolving. Use in production environments is not recommended yet.

Screenshots

- AffinityRouter architecture: conceptual diagram of Load Balancer → Middleware → Workers
- Benchmark results showing cache hit rate improvements

Overview

AffinityRouter is a Python library for smart task routing in distributed systems. Instead of distributing work agnostically (e.g., round-robin), it prioritizes data locality using consistent hashing, ensuring tasks for the same resource always reach the same worker, which maximizes in-memory cache hits and minimizes database pressure.

The core insight is simple: if Worker A has already processed a task for customer_id=42, that customer's data is likely still in Worker A's memory. Routing the next task for customer_id=42 back to Worker A means zero cache misses, zero database round-trips, and significantly higher throughput at scale.

AffinityRouter ships with a Redis-backed middleware layer, so your task distribution strategy persists across restarts and is visible to all workers in the cluster. The consistent hashing ring automatically handles node additions and removals with minimal key remapping.
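The core insight can be sketched in a few lines of plain Python. This is an illustration of the idea only, not AffinityRouter's actual API: a deterministic hash of the affinity key always selects the same worker. (Note that this naive hash-modulo scheme remaps most keys when the worker count changes; minimizing that remapping is what the consistent hashing ring, described below, is for.)

```python
import hashlib

# Illustrative sketch (not AffinityRouter's API): deterministic routing
# by affinity key, so the same resource always lands on the same worker.
WORKERS = ["worker-a", "worker-b", "worker-c"]

def route(affinity_key: str) -> str:
    """Pick a worker deterministically from the affinity key."""
    digest = hashlib.sha256(affinity_key.encode()).hexdigest()
    return WORKERS[int(digest, 16) % len(WORKERS)]

# Every task for customer 42 lands on the same worker, so that
# customer's data stays hot in a single worker's in-memory cache.
target = route("customer_id=42")
assert target == route("customer_id=42")
```

Compare this with round-robin, where consecutive tasks for customer 42 would scatter across all three workers, each paying its own cache miss and database round-trip.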

Architecture

AffinityRouter sits between your load balancer and your workers as a thin middleware layer.

1. Task Submission: A client submits a task with an affinity key (e.g., customer_id, tenant_id, file_id). AffinityRouter hashes this key using consistent hashing to determine the target worker node.
2. Consistent Hashing Ring: The ring maps the key space to worker nodes. When a worker joins or leaves, only the keys in the affected arc are remapped, not the entire keyspace. This minimizes disruption during scaling events.
3. Redis Routing Table: The current hash ring state is persisted in Redis, so all routers in the cluster share the same view of which worker owns which keys.
4. Worker Processing: Workers receive tasks with guaranteed locality: if a worker already has the resource in memory, it serves the task directly; otherwise it fetches the data from the database and caches it for subsequent requests.
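The ring's minimal-remapping property can be demonstrated with a toy consistent-hash implementation using virtual nodes. This is a hedged sketch of the general technique, not AffinityRouter's internals; the `HashRing` class and its method names are hypothetical:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent hashing ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []        # sorted list of (hash, node) points
        self._vnodes = vnodes  # virtual nodes per worker, for even spread
        for node in nodes:
            self.add(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node: str) -> None:
        """Place the node's virtual points on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        """Drop all of the node's virtual points."""
        self._ring = [p for p in self._ring if p[1] != node]

    def route(self, key: str) -> str:
        """Walk clockwise to the first virtual node at or after the key's hash."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["worker-a", "worker-b", "worker-c"])
keys = [f"customer_id={n}" for n in range(1000)]
before = {k: ring.route(k) for k in keys}

ring.add("worker-d")  # scaling event: a new worker joins
after = {k: ring.route(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
# Only the arc handed to worker-d remaps; most keys keep their worker.
```

When worker-d joins, roughly a quarter of the keys move, and every moved key moves to worker-d; the rest keep their original worker, so those workers' in-memory caches stay warm through the scaling event.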