The Challenge
What PurpleX Was Facing
PurpleX aggregates campaign performance data from 12 advertising platforms, including Google, Meta, LinkedIn, and TikTok, each with its own API rate limits, data schema, and attribution windows. The platform had to unify this data into a consistent attribution model, refresh quickly enough for campaign managers to act on it, and scale to enterprise clients running hundreds of concurrent campaigns.
The Solution
What We Built
We built the data collection layer as a set of source-specific connector services, each responsible for a single ad platform and deployed as an independently scalable AWS Lambda function running on a polling schedule. Raw data landed in S3, was transformed by a dbt-based pipeline running on Airflow, and was served from a Redshift data warehouse optimised with materialised views for the most common attribution query patterns. A change data capture layer pushed incremental updates to a Redis cache serving the live dashboard. The entire dbt transformation layer was covered by unit tests and contract tests, so that schema changes in upstream APIs were caught before they reached production.
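
To illustrate what a contract test guards against, here is a minimal sketch in plain Python rather than dbt. It is not the project's actual test suite: the field names in EXPECTED_FIELDS and the function contract_violations are hypothetical, chosen only to show how a raw record from one connector could be checked against an agreed schema before it moves downstream.

```python
# Hypothetical contract for one connector's raw records.
# Field names and types are illustrative assumptions, not PurpleX's real schema.
EXPECTED_FIELDS = {
    "campaign_id": str,
    "date": str,
    "impressions": int,
    "clicks": int,
    "spend": float,
}


def contract_violations(record, expected=EXPECTED_FIELDS):
    """Return a list of human-readable violations; an empty list means the record conforms."""
    problems = []
    # Every expected field must be present and carry the agreed type.
    for name, expected_type in expected.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}, got {actual}"
            )
    # Fields the contract does not know about usually signal an upstream rename or addition.
    for name in sorted(record.keys() - expected.keys()):
        problems.append(f"unexpected field: {name}")
    return problems


if __name__ == "__main__":
    # Simulates an upstream API that renamed "spend" to "cost" and started sending strings.
    drifted = {
        "campaign_id": "c1",
        "date": "2024-01-01",
        "impressions": "10",
        "clicks": 1,
        "cost": 2.5,
    }
    for problem in contract_violations(drifted):
        print(problem)
```

Returning every violation at once, rather than failing on the first, makes it easier to see the full extent of an upstream schema change in a single run.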

Results
