Artie is a real-time database replication platform that streams changes from databases like Postgres and MySQL into warehouses and lakes. We handle the full pipeline – change data capture (CDC), merges, schema evolution, backfills, and observability – so your data team doesn't have to.
Under sustained production-level writes, AWS DMS fell 33 minutes behind the source database - and the gap was still growing. Artie averaged under 30 seconds.
We ran head-to-head benchmarks comparing Artie and AWS Database Migration Service (DMS) for replicating data from Postgres to Snowflake. We tested initial snapshots, ongoing CDC replication, history mode, and impact on the source database. Every configuration and query is documented below, and all code is published at github.com/artie-labs/benchmarks.

At a Glance
Why This Benchmark Matters
If your stack is on AWS and you need to replicate from Postgres into Snowflake, DMS is probably the first thing you'll try. It's the default. It's already in your AWS console. And for one-time migrations, it works fine – because that's what it was built for.
But CDC isn't a one-time migration. It's a continuous stream of changes flowing from your production database into your warehouse, ideally in near real-time. As write volumes go up, the architectural differences between a migration tool and a purpose-built streaming platform start to compound. We wanted to put real numbers on exactly how much they compound.
The question: Under sustained, production-level writes (~22,400 events/sec for 10 minutes straight), how do Artie and DMS compare on latency, throughput, and operational impact to the source database?
Setup

To ensure a fair comparison, both tools replicated from the same AWS RDS Postgres instance to the same Snowflake warehouse in the same region.
All DMS task settings (batch sizes, buffer configs, error handling) are published in the benchmarks repo.
How the Pipelines Work
Before we get into results, it's worth understanding how these two pipelines are architected. This is where most of the performance gap comes from.

AWS DMS Pipeline (4 hops)
DMS reads from the Postgres WAL (Write-Ahead Log - the append-only log that Postgres uses internally for crash recovery and replication) and routes data through four stages before it lands in Snowflake:
- DMS Instance - buffers changes and serializes them into Parquet files. At 22,400 events/sec, the 2 vCPU instance becomes CPU-bound.
- S3 - file write latency plus batch accumulation delay.
- Snowpipe - an async queue with unpredictable notification delay.
- Merge Task - a separate consolidation step into the final table.
Artie Pipeline (2 hops)
Artie Reader reads directly from the Postgres logical replication stream and queues changes in Kafka. Artie Transfer then buffers those changes in memory and flushes micro-batches to Snowflake using optimized MERGE statements. No intermediate file storage, no S3, no Snowpipe.
Under low write volumes, the extra hops in the DMS pipeline may not matter much. Under sustained production load, the buffering at each stage compounds. That's the core of what these benchmarks measure.
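To make the "compounding" point concrete, here is a minimal back-of-the-envelope sketch of how backlog builds at a bottleneck stage. The function name and the service rates are hypothetical and chosen only for illustration; they are not measured throughput numbers for DMS or Artie. Only the ~22,400 events/sec arrival rate comes from this benchmark.

```python
def lag_seconds(arrival_rate: float, service_rate: float, elapsed_s: float) -> float:
    """Seconds of queued work at a stage after elapsed_s of sustained load.

    If the stage drains slower than events arrive, the backlog grows
    linearly with time; otherwise there is no steady-state backlog.
    """
    backlog_events = max(0.0, (arrival_rate - service_rate) * elapsed_s)
    return backlog_events / service_rate


ARRIVAL = 22_400  # events/sec, from the benchmark load

# Hypothetical service rates (events/sec), NOT measurements.
for service in (25_000, 20_000, 10_000):
    print(service, lag_seconds(ARRIVAL, service, 600))
```

The point of the sketch: a stage that keeps up adds no growing lag, while a stage that is even slightly too slow accumulates lag for as long as the load lasts.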
Methodology
We used pgbench - PostgreSQL's built-in benchmarking tool - for all data generation and load testing. Nothing custom, nothing proprietary.
Populating the table (100M rows):
pgbench -i -I dtGvp -s 1000 postgres
Before each snapshot run:
-- Optimize page layout
VACUUM FULL pgbench_accounts;
-- Warm the RAM cache
SELECT COUNT(*) FROM pgbench_accounts;
Generating sustained CDC load:
We ran a custom TPC-B-like script that performs UPDATE, INSERT, and transaction operations across 100,000 distinct rows:
pgbench -f "custom_tpcb.sql" -c 50 -j 8 -T 600 -M prepared -P 5 -n postgres
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
-- pipeline mode to reduce round trips
-- https://www.postgresql.org/docs/current/libpq-pipeline-mode.html
\startpipeline
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid,:bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
\endpipeline
This generated 13,460,453 events in 600 seconds - an average of ~22,400 events per second.
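As a quick sanity check on that rate, the arithmetic can be reproduced in a few lines. This is only a worked calculation from the two figures above; the variable and function names are ours.

```python
# Figures reported for the sustained CDC run.
TOTAL_EVENTS = 13_460_453
DURATION_S = 600


def events_per_second(events: int, seconds: int) -> float:
    """Average event rate over the whole run."""
    return events / seconds


rate = events_per_second(TOTAL_EVENTS, DURATION_S)
print(f"{rate:,.0f} events/sec")  # prints "22,434 events/sec"
```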
Measuring End-to-End Latency
We embedded source timestamps into the filler column at write time. On the Snowflake side, we compared this against Snowflake's METADATA$ROW_LAST_COMMIT_TIME - the row-level commit timestamp - to compute true end-to-end latency:
SELECT
  DATE_TRUNC('minute', METADATA$ROW_LAST_COMMIT_TIME) AS minute,
  COUNT(*) AS row_count,
  AVG(DATEDIFF('second', FILLER::TIMESTAMP_TZ, METADATA$ROW_LAST_COMMIT_TIME)) AS avg_latency_s,
  MAX(DATEDIFF('second', FILLER::TIMESTAMP_TZ, METADATA$ROW_LAST_COMMIT_TIME)) AS max_latency_s,
  MIN(DATEDIFF('second', FILLER::TIMESTAMP_TZ, METADATA$ROW_LAST_COMMIT_TIME)) AS min_latency_s
FROM target_table
GROUP BY 1
ORDER BY 1;
The full custom SQL script and all measurement queries are in the benchmarks repo.
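For readers who want to see the aggregation logic outside of Snowflake, here is a rough Python equivalent of the per-minute latency query. It is a sketch, not the code from the benchmarks repo: the function name and sample rows are made up, and it uses exact elapsed seconds, whereas Snowflake's DATEDIFF counts second boundaries crossed, so values can differ by up to a second.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def latency_by_minute(rows):
    """Group (source_ts, commit_ts) pairs by commit minute.

    Returns {minute: (row_count, avg_s, max_s, min_s)}, mirroring the
    columns of the SQL query.
    """
    buckets = defaultdict(list)
    for source_ts, commit_ts in rows:
        minute = commit_ts.replace(second=0, microsecond=0)
        buckets[minute].append((commit_ts - source_ts).total_seconds())
    return {
        minute: (len(vals), mean(vals), max(vals), min(vals))
        for minute, vals in sorted(buckets.items())
    }


def ts(minute, second):
    """Helper for building sample timestamps (illustrative only)."""
    return datetime(2025, 1, 1, 22, minute, second, tzinfo=timezone.utc)


# Made-up sample rows, not benchmark data.
sample = [
    (ts(25, 0), ts(25, 10)),
    (ts(25, 5), ts(25, 35)),
    (ts(25, 50), ts(26, 20)),
]
for minute, stats in latency_by_minute(sample).items():
    print(minute.isoformat(), stats)
```

Bucketing by commit time rather than source time matches the query: each minute shows how stale the rows landing in that minute were.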
Results
1. Snapshot: Loading 100 Million Rows
Snapshot performance measures how fast each tool performs the initial backfill from Postgres into Snowflake. We ran multiple iterations of each tool, tuning parallelism where possible.
With parallelism tuned, Artie and DMS land in the same ballpark - both around 10–12 minutes for 100M rows. The key difference: Artie's snapshot speed scales with parallelism, while DMS's S3-target path doesn't expose that knob. At higher volumes, that difference compounds.
2. CDC Replication Mode (Merge/Upsert)
Replication mode maintains a 1:1 copy of the source table in Snowflake. This is the most common CDC use case - keeping your warehouse in sync with production.
600 seconds of sustained writes at ~22,400 events/sec:
Artie is 68x faster than DMS.
To put this in perspective: the test ran for 10 minutes. By the end, DMS was 33 minutes behind the source database. And the gap was still widening - it wasn't catching up.
If your dashboards, reverse ETL workflows, or downstream ML pipelines depend on fresh data in Snowflake, this is the difference between "near real-time" and "we're looking at data from half an hour ago."
3. CDC History Mode (SCD Type 4)
History mode creates a separate table that records every single data mutation - every insert, update, and delete - along with a timestamp and operation type. Think of it as a full changelog for your data. (In data warehousing, this pattern is called SCD Type 4, or Slowly Changing Dimension Type 4 - basically a history table that sits alongside your current-state table.)
This is what you need for audit trails, temporal analytics, and ML feature stores where point-in-time correctness isn't optional.

600 seconds of sustained writes at ~22,400 events/sec:
Artie is 23x faster than DMS.
History mode is inherently more expensive than replication mode - every source event generates writes to both the current table and the history table. So latency is higher across the board. But Artie still kept it under 2 minutes. DMS blew past 30 minutes.
Under a short 10-second burst:
Even under a brief burst, Artie was 3.5–4x faster than DMS. The gap widens dramatically under sustained load because each hop in DMS's pipeline adds a bit more buffering delay, and those delays stack up as throughput increases.
4. WAL Lag: The Hidden Cost to Your Source Database
This is probably the most operationally important result in the whole benchmark.
Here's the deal: every log-based CDC tool creates a replication slot on Postgres. A replication slot tells Postgres "don't throw away any WAL segments until I've confirmed I've processed them." If your CDC tool falls behind, WAL piles up on disk. And that's a problem.
High WAL retention means:
- Disk fills up on your source Postgres instance
- Query performance degrades as Postgres manages bloated WAL
- In extreme cases, the database goes down - a replication-slot-induced outage that takes your production database offline
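Retained WAL is just the distance between the server's current WAL position and the oldest position a slot still needs. Postgres prints those positions (LSNs) as two hexadecimal numbers separated by a slash, so the distance can be computed by hand. The sketch below shows that calculation; the function names and the sample LSNs are ours. Inside Postgres, the same number comes from calling pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) against pg_replication_slots.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a pg_lsn string such as '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current WAL position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Made-up LSNs for illustration, not values from the benchmark.
retained = retained_wal_bytes("1/0", "0/F0000000")
print(f"{retained / 1024**2:.0f} MiB retained")  # prints "256 MiB retained"
```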
We measured retained WAL at three checkpoints during the 600-second CDC run:

At the end of the test, DMS had accumulated 6.1 GB of retained WAL. Artie held at 385 MB.
But the trend matters more than the absolute number. Between 450s and 600s, Artie's WAL retention barely moved (381 MB → 385 MB). It was processing changes about as fast as they were being produced. DMS's WAL was still growing linearly (4.7 GB → 6.1 GB) with no sign of leveling off.
If this test had run for an hour instead of 10 minutes, DMS would have retained tens of gigabytes of WAL on the source Postgres instance. If you're running CDC against a production database, that's the kind of thing that pages your on-call engineer at 3 AM.
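The "tens of gigabytes" estimate is a straight-line extrapolation, and it's easy to reproduce. The sketch below uses the 4.7 GB reading at 450s and the 6.1 GB end-of-test figure, and assumes the growth rate stays constant for the full hour, which is an assumption rather than something we measured.

```python
def extrapolate_wal_gb(wal_start_gb, wal_end_gb, t_start_s, t_end_s, t_target_s):
    """Linearly project retained WAL out to t_target_s from two checkpoints."""
    rate_gb_per_s = (wal_end_gb - wal_start_gb) / (t_end_s - t_start_s)
    return wal_end_gb + rate_gb_per_s * (t_target_s - t_end_s)


# DMS checkpoints from the 600-second run, projected to one hour (3600s).
projected = extrapolate_wal_gb(4.7, 6.1, 450, 600, 3600)
print(f"{projected:.1f} GB")  # prints "34.1 GB"
```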
5. Latency Over Time: The Full Picture
The summary numbers are dramatic, but the minute-by-minute data is what really tells the story. These charts show average end-to-end latency measured every minute during the sustained CDC test.
DMS Latency Over Time - Latency starts at ~85s and climbs to nearly 2,000s (33 minutes). The line never flattens. DMS was still falling further behind when the test ended.
Artie Latency Over Time - Artie latency stays between 10s and 97s. It drifts up gradually under load but stays in the same ballpark.
Side-by-side at the same timestamp (22:25):
Nine minutes in, DMS was already 8.5x slower than Artie. By the end of the test, DMS hit 1,980s while Artie's last reading was 97s - a 20x gap, and still growing.
Why the Gap Is So Large
You might be wondering: can you just tune DMS to close this gap? Throw it a bigger instance, tweak the batch sizes, adjust the buffer settings?
Short answer: no. The performance difference is architectural. You can optimize DMS's settings (and we tried - all our task configs are in the benchmarks repo), but you can't eliminate the fundamental overhead of routing data through four intermediate stages.
AWS DMS routes data through four stages:
- Postgres → DMS Instance - The DMS instance reads from the WAL and buffers changes.
- DMS Instance → S3 - Changes get batched into Parquet files and written to S3. Batching adds inherent delay.
- S3 → Snowpipe - Snowpipe is notified of the new files.
- Snowpipe → Snowflake - Files land in staging tables. A separate merge task consolidates them into the final table.
Artie collapses this into two stages:
- Postgres → Artie - Reads directly from the logical replication stream into Kafka.
- Artie → Snowflake - Two parallel writes, no intermediate file storage in either path:
  - History table - Events are written directly, preserving the full changelog of every insert, update, and delete.
  - Target table - A staging table is merged into the target using optimized MERGE statements, maintaining a 1:1 copy of the source.
Fewer hops, no disk I/O in the middle, and a merge engine that was purpose-built for continuous streaming - not one-time migrations.
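One reason micro-batched merges can keep up is that a batch only needs the latest change per primary key before it is merged into the target table. The sketch below illustrates that general idea. It is not Artie's actual implementation, and the event shape (`pk`, `op`, `abalance`) is hypothetical.

```python
from typing import Dict, Iterable


def compact_batch(events: Iterable[dict]) -> Dict[int, dict]:
    """Keep only the most recent change per primary key within one micro-batch.

    Events are assumed to arrive in log order, so a later event for the
    same key simply overwrites the earlier one.
    """
    latest: Dict[int, dict] = {}
    for event in events:
        latest[event["pk"]] = event
    return latest


# Hypothetical events: two changes to row 1, one to row 2.
batch = [
    {"pk": 1, "op": "c", "abalance": 0},
    {"pk": 2, "op": "c", "abalance": 5},
    {"pk": 1, "op": "u", "abalance": 42},
]
compacted = compact_batch(batch)
print(len(batch), "events ->", len(compacted), "rows to merge")
```

This kind of compaction only applies to the target table. A history table needs every event, which is consistent with history mode being the more expensive of the two.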
The Schema Evolution Problem
Benchmarks tend to focus on throughput and latency under stable conditions. But production databases don't stay stable. Schemas change constantly - new columns get added, types get altered, columns get dropped or renamed. How a CDC pipeline handles those changes determines whether it's a tool you can trust in production or one that creates incidents.
With AWS DMS, all schema evolution is manual. When a column changes on the source, you have to pause the entire pipeline and then update the schema in every place it's referenced:
- The COPY statement that loads data from S3
- The history table definition
- The MERGE statement that reconciles staging with the target table
- The replication task configuration itself
If you miss any one of these, the pipeline either fails or - worse - silently writes incorrect data. And if you don't pause the pipeline before making these changes, events accumulate during the window where source and destination schemas are out of sync. At that point, the only recovery path is a full backfill.
This is what the process looks like for common schema changes:
For teams running DMS against databases with frequent schema changes, this becomes a recurring operational burden. Each change carries the risk of a failed pipeline, corrupted data, or a forced backfill - and the manual steps multiply across every table under replication.
Artie handles schema evolution automatically. When the source schema changes, the platform detects the change in the CDC stream and applies it to the destination without pausing replication or requiring manual DDL updates. No events are lost, no backfill is needed, and the pipeline continues without interruption.
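Conceptually, automatic schema evolution comes down to comparing the columns seen in the change stream with the columns in the destination and generating the DDL to reconcile them. The sketch below shows that comparison in its simplest form. It is a simplified illustration under our own assumptions, not Artie's code: the function name is hypothetical, the `updated_at` column is invented, and real systems also have to handle type changes, renames, and type mapping between Postgres and Snowflake.

```python
from typing import Dict, List


def diff_columns(dest: Dict[str, str], source: Dict[str, str]) -> List[str]:
    """Return ALTER TABLE-style actions that bring dest in line with source.

    Both arguments map column name -> column type.
    """
    actions = []
    for name, col_type in source.items():
        if name not in dest:
            actions.append(f"ADD COLUMN {name} {col_type}")
    for name in dest:
        if name not in source:
            actions.append(f"DROP COLUMN {name}")
    return actions


# Destination still has the old shape; the source gained a column.
dest_cols = {"aid": "INTEGER", "bid": "INTEGER", "abalance": "INTEGER", "filler": "TEXT"}
source_cols = dict(dest_cols, updated_at="TIMESTAMP_TZ")
print(diff_columns(dest_cols, source_cols))
```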
This wasn't part of our quantitative benchmark, but in practice it's one of the largest differences in operational cost between the two systems.
If you're evaluating CDC tools for Postgres to Snowflake and want to see how Artie handles your workload, book a demo or try Artie.
Reproducing This Benchmark
We're publishing everything. No cherry-picked configs, no hidden settings. All configurations, scripts, and queries are in the open:
- Benchmark code and configs: github.com/artie-labs/benchmarks
- pgbench: Bundled with PostgreSQL
- Latency measurement: Snowflake's METADATA$ROW_LAST_COMMIT_TIME
- DMS task settings: Published in the benchmarks repo
If you run these benchmarks yourself and get different results, we genuinely want to hear about it - reach out.
Conclusion
For snapshots, both tools land in the same ballpark - around 10-12 minutes for 100M rows.
The real story is CDC. Under sustained production-level writes:
- Replication mode: Artie averaged 29.7 seconds. DMS averaged 33+ minutes and was still falling behind. (68x)
- History mode: Artie averaged 84.8 seconds. DMS averaged 33 minutes. (23x)
- Source database impact: Artie retained 385 MB of WAL. DMS retained 6.1 GB and was still growing. (16x)
This isn't about configuration. We published every DMS setting we used. The gap is architectural - four intermediate stages vs. a direct stream.
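The multipliers above can be roughly re-derived from the rounded figures in this post. The sketch below does that arithmetic. Because the inputs are rounded (DMS's replication latency is quoted as "33+ minutes", and we treat 6.1 GB as 6,100 MB), the results land near the headline multipliers rather than exactly on them.

```python
def ratio(slower: float, faster: float) -> float:
    """How many times larger the slower figure is than the faster one."""
    return slower / faster


DMS_LATENCY_S = 33 * 60  # 33 minutes, the rounded figure quoted above

replication = ratio(DMS_LATENCY_S, 29.7)  # Artie replication-mode average
history = ratio(DMS_LATENCY_S, 84.8)      # Artie history-mode average
wal = ratio(6.1 * 1000, 385)              # retained WAL in MB, decimal units

print(f"replication ~{replication:.1f}x, history ~{history:.1f}x, WAL ~{wal:.1f}x")
```

With exactly 33 minutes the replication ratio comes out just under 67x, so the 68x headline corresponds to a DMS average slightly above 33 minutes.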
Appendix: Latency Over Time
The raw data and the scripts used to generate these charts are available in the GitHub repo: github.com/artie-labs/benchmarks.




