The Challenge
What Below The Crime Was Facing
Below The Crime aggregates public crime data from dozens of municipal sources and delivers a queryable analytics platform for journalists, researchers, and civic organisations. Each source publishes in a different format, on a different schedule, and with a different geographic reference system. The core infrastructure challenges were normalising geospatial data across inconsistent coordinate systems, handling large bulk ingestion when municipalities published monthly data dumps, and serving complex geospatial queries efficiently.
The Solution
What We Built
We built the ingestion layer as a set of municipality-specific ETL workers that normalised coordinates to WGS84, resolved address references to standardised geohash cells, and classified incident types against a canonical taxonomy. Normalised records were stored in PostGIS with spatial indexes optimised for bounding-box and radius queries. Bulk ingestion jobs ran on AWS Batch with S3 as the staging area, and incremental updates were processed via Lambda triggers on S3 object creation. Query results were cached in Redis with geospatial-aware cache key design. The entire pipeline was instrumented with OpenTelemetry traces exported to Jaeger for end-to-end request visibility.
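To illustrate how a WGS84 coordinate maps to a geohash cell, and how that cell can feed a geospatial-aware cache key, here is a minimal Python sketch. It is not the production code: the function names, the key layout, the 250 m radius bucket, and the precision of 6 characters are all assumptions made for the example. The geohash encoding itself follows the standard public algorithm (interleaved longitude and latitude bits, base-32 alphabet).

```python
import math

# Standard geohash base-32 alphabet (omits a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a WGS84 latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


def radius_cache_key(lat: float, lon: float, radius_m: float,
                     incident_type: str, precision: int = 6) -> str:
    """Build a hypothetical cache key for a radius query.

    The centre is snapped to its geohash cell and the radius is rounded up
    to the next 250 m bucket, so similar queries map to the same key.
    """
    cell = geohash_encode(lat, lon, precision)
    bucket = int(math.ceil(radius_m / 250.0)) * 250
    return f"crime:radius:{cell}:{bucket}:{incident_type}"


if __name__ == "__main__":
    print(geohash_encode(57.64911, 10.40744))
    print(radius_cache_key(57.64911, 10.40744, 400, "burglary"))
```

The trade-off in a scheme like this is that a cached result is only valid for the snapped cell and bucketed radius, so the underlying query would need to be run against those snapped values rather than the caller's exact inputs. Coarser precision raises the cache hit rate at the cost of spatial accuracy.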

Results
