ClickHouse for Application Analytics: Fast Aggregations Without Spark

The requirement: an internal analytics dashboard showing trading activity metrics — volume, trade count, latency distributions, error rates — sliced by instrument, venue, time window, and a dozen other dimensions. Data volume: about 4 billion events per day, 90-day retention. Query pattern: ad-hoc OLAP — arbitrary group-bys, time ranges, filters. We evaluated TimescaleDB (Postgres extension), Apache Druid, ClickHouse, and “just use BigQuery.” We chose ClickHouse. After a year in production, I’d make the same choice. ...

May 17, 2023 · 5 min · MW

Choosing a Time-Series Database in 2020

The fintech startup had three different time-series storage problems at the same time. After evaluating the options available in 2020, we ended up running two different systems. Here’s the decision framework and why the landscape fragmented the way it did. ...

November 18, 2020 · 5 min · MW
Available for consulting Distributed systems · Low-latency architecture · Go · LLM integration & RAG · Technical leadership
hello@turboawesome.win