At work, we recently had to move a few tables from one PostgreSQL instance to another. In my previous post, I discussed how to use Google’s Database Migration Service (DMS) to migrate data from one instance to another. Unfortunately, that option was not available here, since DMS only allows the migration of an entire database, not specific tables within a database.
We chose the native logical replication option. It’s a much more involved process compared to using DMS, but it provides greater flexibility and allows replication of specific tables only.
Grant access to user accounts
Let’s assume you already have both the source and destination PostgreSQL instances ready. You need to grant replication access to the user accounts on both the source and destination databases. In the case of Cloud SQL, we had to grant the REPLICATION role to the user account; this may vary for other instances.
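For reference, granting the role is a single statement (migration_user is a placeholder for your own account):

```sql
-- Grant replication privileges to the account used for the migration
ALTER USER migration_user WITH REPLICATION;
```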
Copy over schema
The next step is to copy over the schema. The table schema needs to be identical on both the source and destination. We used pg_dump to dump and restore the schema, as we had to move more than 50 tables.
There are a few nuances related to constraints as well, which we will cover soon. Before that, let’s try to understand a bit more about how logical replication works.
Logical replication runs in two modes:
1) Initial dump: It copies the data from the source to the destination.
2) CDC: Once the initial dump is done, it switches to CDC mode and applies changes to the destination in real time.
The key point here is that during the initial dump, some constraints can’t be enforced. For example, if two tables are linked by a foreign key, copying one of them will fail with a constraint violation if the referenced table’s rows haven’t been copied yet.
To solve this problem and to speed up the initial dump process, we first create tables without constraints and indexes. Indexes slow down the dump stage, and it’s easier to rebuild them once the initial dump is complete.
pg_dump provides a useful flag to dump only the table definitions without indexes and constraints.
The --section flag allows us to control what is dumped. pre-data dumps only the table definitions, while post-data dumps all constraints and indexes.
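For example, the two dumps might look like this (host, database, and table names are placeholders; the output files pre-data.sql and post-data.sql are referenced in the following steps):

```sh
# Table definitions only – no constraints or indexes
pg_dump -h {source_host} -U {user} -d {dbname} -t orders -t customers \
  --section=pre-data -f pre-data.sql

# Constraints and indexes, to be restored later
pg_dump -h {source_host} -U {user} -d {dbname} -t orders -t customers \
  --section=post-data -f post-data.sql
```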
Restore table definition
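A sketch of the restore, assuming the pre-data.sql file from the previous step and placeholder connection details:

```sh
# Create the bare tables on the destination instance
psql -h {destination_host} -U {user} -d {dbname} -f pre-data.sql
```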
There’s one more catch here: logical replication depends on the primary key, so you need to create the primary key constraint in addition to the table definition. However, primary key constraints are in the post-data.sql file. Open it in an editor and remove everything except the primary key constraints.
Restore primary key constraint
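Assuming the primary-key-only statements were saved to a separate file (say, primary-keys.sql), the restore looks the same:

```sh
# Add only the primary key constraints on the destination instance
psql -h {destination_host} -U {user} -d {dbname} -f primary-keys.sql
```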
Functions and others
If you use PostgreSQL functions, enums, or anything else that isn’t tied directly to a table, you might have to handle those manually. When you specify the -t table option, pg_dump only copies objects directly related to those tables; enums and functions don’t fall under that.
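One way to see which enum types exist on the source so they can be recreated manually (a catalog query, not part of the pg_dump output):

```sql
-- List enum types and their labels in definition order
SELECT t.typname AS enum_type, e.enumlabel AS label
FROM pg_type t
JOIN pg_enum e ON e.enumtypid = t.oid
ORDER BY t.typname, e.enumsortorder;
```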
Set up publication and subscription
At this point, the table structure should be identical on both instances for the tables that are going to be migrated. The publication should be created on the source instance.
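For example, with placeholder table names and a publication called my_publication:

```sql
-- On the source instance: publish only the tables being migrated
CREATE PUBLICATION my_publication FOR TABLE orders, customers;
```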
A corresponding subscription needs to be created on the destination instance.
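A sketch of the subscription, matching the publication name above ({variable} placeholders stand in for your connection details):

```sql
-- On the destination instance
CREATE SUBSCRIPTION my_subscription
  CONNECTION 'host={source_host} port=5432 dbname={dbname} user={user} password={password} sslmode=require'
  PUBLICATION my_publication;
```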
Replace {variable} with the respective values. If the verify CA option is enabled on the source instance, you need to disable it and enable only SSL mode.
Once the subscription is created, PostgreSQL starts copying data from the source to the destination instance. It performs an initial data dump for each table and then switches to CDC mode.
You need to wait until the initial dump is complete and it moves to the CDC state with near-zero lag. PostgreSQL exposes this information through several tables such as pg_replication_slots, pg_stat_replication, and pg_stat_subscription.
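For example, these queries give a rough picture of the progress (column names assume PostgreSQL 10 or later):

```sql
-- On the source instance: per-subscriber state and lag in bytes
SELECT application_name, state,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes
FROM pg_stat_replication;

-- On the destination instance: subscription worker status
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;
```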
Add indexes and foreign keys
Once the replication moves to the CDC state, you can create indexes and foreign keys. The post-data.sql file contains all indexes and foreign keys. Remove the primary key constraints and keep the rest.
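Restoring the edited file is the same psql invocation as before (placeholder connection details):

```sh
# Build indexes and add foreign keys on the destination instance
psql -h {destination_host} -U {user} -d {dbname} -f post-data.sql
```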
This will take quite some time if you have a lot of data.
Analyze
ANALYZE is one of the most often overlooked steps when moving or upgrading PostgreSQL instances. PostgreSQL depends on the statistics generated by this command to build an efficient query plan. Without these statistics, it might choose an inefficient plan, and a query that used to take 50 ms could turn into a 1-second query on your new instance. So make sure to run ANALYZE once the indexes are created. If you have more time, you can run VACUUM as well.
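The command itself is simple (run it on the destination database):

```sql
-- Rebuild planner statistics for all tables
ANALYZE;

-- Or, if time permits, vacuum and analyze together
VACUUM ANALYZE;
```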
Switchover
Sequence
Your data is now available on both systems, and you are nearly ready to switch your traffic from the source to the destination.
PostgreSQL copies the data and keeps the indexes in sync, but it doesn’t sync the sequences. You have to do that manually.
You can view the current value on the source instance, and then set it to a higher value on the destination instance.
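A sketch of such a snippet, assuming PostgreSQL 10+ (for the pg_sequences view) and an arbitrary buffer of 100000:

```sql
-- Run on the source instance: generates one setval() statement per sequence
SELECT 'SELECT setval(''' || schemaname || '.' || sequencename || ''', '
       || (COALESCE(last_value, 1) + 100000) || ');'
FROM pg_sequences;
```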
You can run the snippet above on the source instance; the output is a SQL query that you can run on the destination instance. The buffer value is up to you. The key point is that after running the query on the destination instance, you need to perform the switchover before the sequence values on the source instance exceed those on the destination instance.
Disable writes
Once the sequence is updated, stop sending writes to the source PostgreSQL instance. Wait for the replication lag to reach zero, and then switch all writes to the destination PostgreSQL instance. You can monitor the replication lag using pg_replication_slots.
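One way to check the lag from the source side (lag_bytes should approach zero before the switchover):

```sql
-- On the source instance: bytes of WAL not yet confirmed by the subscriber
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;
```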
PgBouncer
The amount of downtime depends on how your app is architected, whether your application can run in read-only mode, and other factors. PgBouncer can help significantly in this regard, and it’s what we used to achieve near-zero downtime.
PgBouncer is a PostgreSQL proxy. The key feature relevant to our situation is that it allows configuration changes without requiring a restart. Assume you have a database named myapp that’s configured to connect to the source instance. You can edit the PgBouncer config file to update the connection details to point to the destination instance. Then, connect to the PgBouncer admin console and run the following commands.
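Roughly, the config change and the admin commands look like this (the database name myapp matches the example above; the host is a placeholder):

```ini
; pgbouncer.ini
[databases]
myapp = host={destination_host} port=5432 dbname=myapp
```

```sql
-- In the PgBouncer admin console
PAUSE myapp;
RELOAD;
SHOW DATABASES;
RESUME myapp;
```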
The first command pauses all connections from PgBouncer to the source PostgreSQL instance. The command blocks until all in-flight queries are completed, and new queries are queued. RELOAD reloads the configuration from disk. You can run SHOW DATABASES to quickly verify that the new configuration has been loaded. RESUME then resumes connectivity, now to the new destination PostgreSQL instance.
If you don’t have any long-running queries, this process can result in near-zero downtime, as no queries are dropped. As long as RESUME is executed quickly, users will notice at most a slight increase in latency.
You can use the following query to check for long-running queries and terminate them if needed.
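A sketch of such a check (the one-minute threshold is arbitrary):

```sql
-- Queries that have been running for more than a minute
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND now() - query_start > interval '1 minute'
ORDER BY duration DESC;

-- Terminate a specific backend if necessary
-- SELECT pg_terminate_backend({pid});
```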
In the worst-case scenario, if the PAUSE command hangs because of a long-running query, you can forcefully restart PgBouncer. However, this will result in errors for any active connections.
Cleanup
Once you’re confident everything is working correctly, you can clean up the logical replication setup.
Drop the subscription on the destination instance.
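Assuming the subscription name from earlier:

```sql
-- On the destination instance
DROP SUBSCRIPTION my_subscription;
```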
Drop the publication on the source instance.
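And the matching publication:

```sql
-- On the source instance
DROP PUBLICATION my_publication;
```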