ClickHouse for Application Analytics: Fast Aggregations Without Spark

The requirement: an internal analytics dashboard showing trading activity metrics — volume, trade count, latency distributions, error rates — sliced by instrument, venue, time window, and a dozen other dimensions. Data volume: about 4 billion events per day, 90-day retention. Query pattern: ad-hoc OLAP — arbitrary group-bys, time ranges, filters. We evaluated TimescaleDB (Postgres extension), Apache Druid, ClickHouse, and “just use BigQuery.” We chose ClickHouse. After a year in production, I’d make the same choice. ...

May 17, 2023 · 5 min · MW
Available for consulting Distributed systems · Low-latency architecture · Go · LLM integration & RAG · Technical leadership
hello@turboawesome.win