The Challenge
What PriceZ Was Facing
Building a reliable price comparison platform meant solving a fundamentally hard data engineering problem: ingesting live price feeds from 200+ retailers — each with different schemas, rate limits, and reliability characteristics — and normalising them into a single queryable product catalogue. The existing pull-based scraping approach could not scale, frequently missed updates, and offered no visibility into pipeline failures.
The Solution
What We Built
We replaced the polling architecture with an event-driven ingestion system built on Apache Kafka. Each retailer source published price-change events to a dedicated topic, consumed by a normalisation worker that resolved product identity via barcode lookup and fuzzy-matching algorithms. The serving layer used Redis for hot-path caching and PostgreSQL with a read replica for analytical queries. Prometheus and Grafana dashboards gave the team full visibility into per-source ingestion lag, error rates, and data-freshness SLAs.

Results
