A Better Way to Handle Postgres Partitioned Tables for CDC

Robin Tang

Updated on

November 4, 2025

Data know-how

‍TL;DR

Partitioned tables in Postgres introduce complexity when using logical replication or CDC. You often need to maintain regex patterns to detect partitions, ensure each partition is included in publications, handle cases where partitions live in different schemas, and deal with downstream breakage when a table transitions from non-partitioned to partitioned.

Enabling publish_via_partition_root on your publication removes most of this complexity. It causes Postgres to publish changes from all partitions under the root table name, eliminating the need for partition discovery and preventing replication gaps.

Replicating Partitioned Tables in Postgres

Partitioned tables are common in Postgres as they help large tables scale more efficiently - But when it comes to CDC replication, they introduce friction.

Here are some challenges teams frequently run into:

Partitions following a particular regex pattern
Partitions stored in a separate schema from the root table
Partitioned tables stored in the publication
The desire to fan all of these partitions into a single table downstream
A regular table changed to a partitioned table and breaking downstream dependencies

Fortunately, you are able to create a publication in Postgres with publish_via_partition_root that makes these problems easier to deal with.

CREATE PUBLICATION my_pub FOR ALL TABLES WITH (publish_via_partition_root = true);

What does publish_via_partition_root do?

When enabled, all logical replication events from partitions are published with the root table name instead of the partition table name.

Partitions following a particular regex pattern

First off, why do we care?

There are a couple of reasons why we need to understand the layout:

We'll need to know this if we're fanning partitions into a single topic
We'll need to know this to know whether we should process or ignore this when we see this CDC message.

For the most part, most companies follow some sort of layout like:

{{tableName}}_YYYY_MM
{{tableName}}_default

And if this were the case, we can easily solve this with a regular expression pattern like this: {{tableName}}_((default)|([0-9]{4})_(0[1-9]|1[012]))$

What if your layout was less deterministic?
What if you weren't partitioning your table based on timestamp and instead was partitioning off a region identifier?

For example, if the partitions were something like orders_sf, orders_oakland, and there is a non-partitioned table called orders_total.

By creating too wide of a regular expression, we may accidentally match on the non-partitioned table. However, if it was too narrow, then we risk not picking up new partitions. This process then becomes error prone, but also laborious.

Partitions stored in a separate schema from the root table

Some organizations keep the root table in one schema and partitions in another. Pipeline logic must then account for schema differences, increasing complexity.

Partitioned tables stored in the publication

This is a common pitfall when the publication is not created with for all tables. In this case, each partition must be added manually; any partition that is not included in the publication will not emit logical replication events, resulting in missing change data.

A regular table changed to a partitioned table and breaking downstream dependencies

A previously non-partitioned table may later be converted to a partitioned table. If downstream logic expects a single table identity, this transition can cause breakages unless handled explicitly.

So how does `publish_via_root` fix these issues?

Problem	Comment
Partitions following a particular regex pattern	You do not need to specify a regular expression. All CDC events will be labeled with the root table instead of the partition table.
Partitions stored in a separate schema from the root table	It does not matter since data is published to the root. We only care about the schema of the root table.
Partitioned tables stored in the publication	We only need the root table to be added to the publication if `FOR ALL TABLES` is not specified.
The desire to fan all of these partitions into a single table downstream	This happens out of the box by Postgres as it’s already fanned into the root table.
A regular table changed to a partitioned table and breaking downstream dependencies	This does not matter since `publish_via_root` treats root partition table the same as regular tables.

Looking for a seamless way to stream data out of Postgres?

Artie is a fully-managed CDC streaming platform that replicates database changes into warehouses and lakes without the complexity of DIY pipelines. It delivers production-grade reliability without ongoing maintenance.

‍

AUTHOR

Robin Tang