Azure Database for PostgreSQL Part 2 — Data Migration

Create Sample Data in Source PostgreSQL Server

If everything was set up correctly in Part 1, the PostgreSQL server can be accessed without issues. Follow this article to create a new PostgreSQL database inside the server and insert sample data. The output should be similar to below. This PostgreSQL server is without CMK data encryption.

# log in to the postgresql server
- psql "sslmode=verify-full sslrootcert=./BaltimoreCyberTrustRoot.crt host=jonwpostgresqlsrv2.postgres.database.azure.com port=5432 dbname=postgres user=jonw@jonwpostgresqlsrv2 password=xxxx"
# show all databases in the server
- \l
# change to the right database
- \c <database name>
# show all tables in the target database
- \dt
# show all the data inside the target table
- select * from <table name>;
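
The commands above only inspect existing data; to actually create the sample data, something like the following could be run after connecting to the new database (a sketch — the "users" table mirrors the schema used on the destination side, and the sample email values are illustrative):

# create a sample table
- CREATE TABLE users(
id SERIAL PRIMARY KEY,
email VARCHAR(40) NOT NULL UNIQUE
);
# insert a few sample rows
- INSERT INTO users(email) VALUES ('alice@example.com'), ('bob@example.com');
# verify the inserted data
- SELECT * FROM users;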

Create the Same Table Schema in Destination PostgreSQL Server / Database

We need to ensure the data coming from the source PostgreSQL server/DB can land somewhere in the destination PostgreSQL server/DB with the same table schema. This PostgreSQL server is with CMK data encryption.

# log in to the postgresql server
- psql "sslmode=verify-full sslrootcert=./BaltimoreCyberTrustRoot.crt host=jonwpostgresqlsrv2.postgres.database.azure.com port=5432 dbname=postgres user=jonw@jonwpostgresqlsrv2 password=xxxx"
# create a new database
- create database <database name>;
# show all databases in the server
- \l
# change to the right database
- \c <database name>
# create a new table
- CREATE TABLE users(
id SERIAL PRIMARY KEY,
email VARCHAR(40) NOT NULL UNIQUE
);
# show all tables in the database
- \dt
# show data in the table
- SELECT * FROM users;

Create Azure Data Factory (ADF)

ADF serves as the middleman of the data migration process, as it can link to different Azure services and handle data operations. If you need step-by-step guidance on creating the service, please check here. After everything completes, the service should appear as below.
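
If you prefer the CLI over the portal, the factory can also be created with the Azure CLI's datafactory extension (a sketch; the resource group, factory name, and region are placeholders, and the extension must be installed first):

# install the Data Factory extension for the Azure CLI
- az extension add --name datafactory
# create the data factory in the same region as the PostgreSQL servers
- az datafactory create --resource-group <resource group> --name <factory name> --location <region>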

Data Migration

As you read through the article above, you will notice that, within ADF, we need to create:

  • Source PostgreSQL server linked service and source dataset
  • Destination PostgreSQL server linked service and destination dataset
  • Copy data pipeline
  • In ADF, click on the pencil icon (author) → ellipsis on the side of datasets → New dataset.
  • Search for “postgresql” → select “Azure Database for PostgreSQL”
  • Give the new linked service an easily identifiable name, such as “linkedservice-<postgreSQL server name>”. Click on “+ New” to start the configuration.
  • Make sure the integration runtime (IR) is in the same Azure region as your PostgreSQL server. This ensures the managed compute resource (IR) has the smallest communication latency to the target PostgreSQL resource. For more information on what an IR is, please check here.
  • Continue to set up a managed private endpoint for the PostgreSQL server. This gives ADF a way to communicate with the PostgreSQL server over a private IP address in the same Azure virtual network. If no other public access is needed, “Deny public access” can be set to “Yes”. For more information on managed private endpoints, please check here.
  • At first, the managed private endpoint state will show “Pending”. It will not change to “Approved” until you head over to the PostgreSQL server’s private endpoint section and approve the connection. Make sure the state shows “Approved” in both the PostgreSQL server and ADF.
  • Make sure the server and database are the ones you are targeting.
  • Input the username and password and select SSL. Click on “Test connection” to verify that the managed private endpoint to the target PostgreSQL server works.
  • “Test connection” can still be performed in the panel below. Please ensure you select the right table name.
  • After confirming all the settings, you can safely publish the newly created linked service and dataset.
  • Head back to the pencil icon (author) → click the ellipsis next to pipelines → select “New pipeline”
  • Expand “Move & transform” → drag “Copy data” and drop it on the canvas on the right
  • Select the source PostgreSQL dataset as the “Source”
  • Select the destination PostgreSQL dataset as the “Sink”
  • Validate the pipeline
  • Debug the pipeline and check the result
  • If everything goes well, we should see the same data from the source PostgreSQL server’s table also appear in the destination PostgreSQL server’s table.
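
A quick way to confirm the copy succeeded is to compare both sides (a sketch; run each query while connected to the corresponding server and database, assuming the "users" table from the earlier sections):

# on the source server/database, count the rows
- SELECT count(*) FROM users;
# on the destination server/database, the count should match
- SELECT count(*) FROM users;
# spot-check the copied rows for identical data
- SELECT * FROM users ORDER BY id;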

Jonathan