blog / designing-scalable-analytics-postgres-clickhouse / overview.md

Designing Scalable Analytics: From Postgres to ClickHouse

How we outgrew PostgreSQL for our blog's analytics engine, the pain of counting billions of rows, and how migrating to ClickHouse solved our aggregate query nightmares.

The Postgres Honeymoon Phase

When you build an analytics system—say, tracking page views, unique visitors, and referrers for a blogging platform—PostgreSQL is the logical place to start.

You create a simple table:

CREATE TABLE page_views (
    id UUID PRIMARY KEY,
    post_id UUID REFERENCES posts(id),
    visitor_id TEXT,
    referrer TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_pv_post_date ON page_views(post_id, created_at);

For the first million rows, a SELECT COUNT(*) or COUNT(DISTINCT visitor_id) grouped by day takes milliseconds.

Then you hit 50 million rows.

Suddenly, rendering the admin dashboard takes 4 seconds. Your database CPU spikes to 100% every time a user loads their stats. You try materialized views, but refreshing them locks tables or takes minutes, leaving users with stale data.

Postgres is a row-oriented database (OLTP). It is designed to find, insert, and update specific records incredibly fast. It is not designed to scan and aggregate tens of millions of rows on the fly (OLAP).

Enter ClickHouse

ClickHouse is an open-source, columnar database designed specifically for lightning-fast analytics.

Because it stores data in columns rather than rows, a COUNT(*) or SUM() operation only needs to read the specific columns involved in the query from disk, bypassing the rest of the data. Furthermore, data is heavily compressed, and queries are vectorized (executed in batches using CPU SIMD instructions).

The Migration Strategy

We couldn't just swap Postgres for ClickHouse. Postgres remained our source of truth for relational data (Users, Posts, Comments). We only needed ClickHouse for the high-volume event data.

We adopted a Dual-Write with Message Queue architecture:

When a page view occurs, the Next.js API route pushes a tiny JSON payload to a Redis Stream (or Kafka).
A background worker consumes the stream.
The worker writes batches of 10,000 events to ClickHouse every few seconds.

ClickHouse thrives on large batch inserts, not single-row inserts.

Designing the ClickHouse Schema

In ClickHouse, you don't use UUID primary keys the way you do in Postgres. You define an ORDER BY clause which determines how data is physically sorted on disk.

CREATE TABLE page_views (
    post_id UUID,
    visitor_hash UInt64,
    referrer String,
    created_at DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (post_id, created_at);

Because the data is ordered by post_id and then created_at, looking up the analytics for a specific post over a specific date range requires reading a tiny, contiguous block of disk. It is blisteringly fast.

Solving the "Unique Visitors" Problem

Calculating unique visitors (COUNT(DISTINCT visitor_hash)) is computationally expensive over large datasets.

ClickHouse offers specialized aggregate functions like uniqCombined, which uses a HyperLogLog data structure to estimate unique counts with >99% accuracy but a fraction of the memory and CPU cost.

-- Postgres: 14 seconds on 100M rows
SELECT date_trunc('day', created_at), COUNT(DISTINCT visitor_id)
FROM page_views 
WHERE post_id = '...' 
GROUP BY 1;

-- ClickHouse: 40 milliseconds on 100M rows
SELECT toDate(created_at), uniqCombined(visitor_hash)
FROM page_views
WHERE post_id = '...'
GROUP BY 1;

Data Synchronization and Aggregating Engines

To make dashboards load instantly, you don't even query the raw page_views table. You use ClickHouse's AggregatingMergeTree engine paired with a Materialized View.

Unlike Postgres materialized views, ClickHouse materialized views act as insert triggers. As data flows into the page_views table, ClickHouse automatically aggregates it and writes it into a daily summary table.

CREATE MATERIALIZED VIEW daily_post_stats
ENGINE = AggregatingMergeTree()
ORDER BY (post_id, day)
AS SELECT
    post_id,
    toDate(created_at) AS day,
    countState() AS total_views,
    uniqCombinedState(visitor_hash) AS unique_visitors
FROM page_views
GROUP BY post_id, day;

When a user loads their dashboard, we query daily_post_stats. The query touches a few hundred rows instead of millions, responding in under 5ms regardless of how much traffic the site receives.

Conclusion

Postgres is incredible. But knowing when to step outside of it is the mark of a maturing architecture. By splitting our OLTP workloads (CMS, Auth) in Postgres, and our OLAP workloads (Analytics) in ClickHouse, we achieved a system that is both rock-solid and infinitely scalable.

Related Blogs

database