AWS Database Blog
Use the tds_fdw extension to migrate data from SQL Server to PostgreSQL
PostgreSQL has become the preferred open-source relational database for many enterprise companies and startups, powering leading business and mobile applications. Many customers choose PostgreSQL as the target database when migrating from commercial engines such as Oracle and Microsoft SQL Server. AWS offers Amazon Relational Database Service (Amazon RDS) for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition as fully managed PostgreSQL database services.
You can migrate from SQL Server to PostgreSQL using the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS). AWS SCT can convert both the source database schema and the majority of the code objects into the target database's compatible format. With AWS DMS, you can migrate the data from the source database to the target database with minimal downtime. Another database migration method is to use Babelfish for Aurora PostgreSQL. With Babelfish, Amazon Aurora PostgreSQL supports commonly used T-SQL language and semantics, which reduces the amount of code changes related to database calls in an application. As a result, the amount of application code you need to rewrite is minimized, reducing the risk of new application errors.
Some customers with simpler migration requirements, such as a one-time migration of a few tables, look for a quick way to migrate schema and data from SQL Server to PostgreSQL. In this post, we demonstrate an alternative data migration method using the PostgreSQL extension tds_fdw. PostgreSQL is designed to be easily extensible, and extensions are add-on modules that enhance the functionality of the database. The tds_fdw extension allows PostgreSQL to access databases that support the tabular data stream (TDS) protocol (used by Sybase and SQL Server databases) through a PostgreSQL feature called a foreign data wrapper. The foreign data wrapper is responsible for fetching data from the remote data source and returning it to the PostgreSQL engine. To learn more about federated query support using PostgreSQL foreign data wrappers, see Federated query support for Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL.
Solution overview
The high-level steps to implement this solution are as follows:
- Configure the target PostgreSQL database.
- Import the schema from the SQL Server database.
- Create the target tables based on the source table definition.
- Reconcile DATE-related column issues due to locale setting differences.
- Perform data validation.
- Clean up the intermediate objects used for migration.
Prerequisites
This post assumes you have a SQL Server instance created and running. For this post, we use an RDS for SQL Server database with the following configuration. This solution also works with SQL Server set up in a non-Amazon RDS environment.
- SQL Server version: SQL Server Standard Edition 15.00.4236.7.v1
- Database endpoint: sql-source.xx.<region>.rds.amazonaws.com
- Database name: dms_sample
- Admin user: admin
- Admin user password: your_complex_password
- Application user: dms_user
- Application user password: your_complex_password
- Database client tool: SQL Server Management Studio (SSMS)
We also assume you have a PostgreSQL database created and running. In this post, we use the following target database. Although we use Amazon Aurora PostgreSQL, this solution works with Amazon RDS for PostgreSQL as long as the required tds_fdw extension is supported.
- Aurora PostgreSQL version: 15.2
- Database endpoint: pgsql-target.cluster-xx.<region>.rds.amazonaws.com
- Database name: fdwdemo
- Admin user: postgres
- Admin user password: your_complex_password
- Database client tool: psql
The following diagram illustrates our database configuration.
We use the AWS DMS Sample Database for SQL Server schema to demonstrate the migration process.
This solution involves creating and using AWS resources and therefore incurs costs on your account, including but not limited to RDS for SQL Server instance and storage costs and Aurora instance and storage costs. Refer to AWS Pricing for more information.
Configure the target PostgreSQL database
Perform the following procedures to set up the tds_fdw extension and create an empty schema to store the foreign tables and target tables to be migrated:
- Connect to the target database fdwdemo using an account that has rds_superuser permissions. You can use the postgres user, which is the admin user created during RDS instance creation and has this permission. Enter the password when prompted.
- Create an empty schema. Here, we use the fdw schema to store foreign tables that reference the SQL Server database directly, and the dms_sample schema to store objects migrated from SQL Server. Although an empty schema is not compulsory, it makes cleanup and troubleshooting easier than working with existing schemas that contain other objects.
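The following is a minimal sketch of these two steps; the endpoint and password are the placeholders from the prerequisites, and the schema names match those used throughout this post:

psql -h pgsql-target.cluster-xx.<region>.rds.amazonaws.com -U postgres -d fdwdemo

CREATE SCHEMA fdw;
CREATE SCHEMA dms_sample;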
The output should look like the following:
The following architecture illustrates our setup between databases.
- Run the following commands to create the user dms_user, which is used as the least-privilege user to carry out the data migration:
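A minimal sketch, assuming the password placeholder from the prerequisites; the grants allow dms_user to create and use objects in both schemas:

CREATE USER dms_user WITH PASSWORD 'your_complex_password';
GRANT ALL ON SCHEMA fdw TO dms_user;
GRANT ALL ON SCHEMA dms_sample TO dms_user;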
The output should look like the following:
- Run the following command to install the tds_fdw extension:
CREATE EXTENSION tds_fdw;
The output should look like the following:
- Create a foreign server (SQL Server) in the target database (PostgreSQL) and grant permission for dms_user to use it. The foreign server represents the remote database you want to connect to from PostgreSQL. In this case, it's the dms_sample database on the sql-source.xx.<region>.rds.amazonaws.com instance. Refer to TDS Foreign data wrapper for advanced options such as the TDS version and optimizer-related parameters used for query planning.
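The following is a minimal sketch; the server name sqlserver_src is an assumed label, and the endpoint and port are the placeholders from the prerequisites:

CREATE SERVER sqlserver_src
FOREIGN DATA WRAPPER tds_fdw
OPTIONS (servername 'sql-source.xx.<region>.rds.amazonaws.com',
         port '1433',
         database 'dms_sample');
GRANT USAGE ON FOREIGN SERVER sqlserver_src TO dms_user;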
The output should look like the following:
- Specify a user name and password pair for the user dms_user to log in to the remote database specified in the foreign server. The dms_user inside the OPTIONS clause refers to the SQL login created on the source SQL Server.
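A minimal sketch, assuming the foreign server name sqlserver_src from the previous step and the SQL login credentials from the prerequisites:

CREATE USER MAPPING FOR dms_user
SERVER sqlserver_src
OPTIONS (username 'dms_user', password 'your_complex_password');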
The output should look like the following:
Import the schema from the SQL Server database
Complete the following steps to import your schema:
- Connect to the target database fdwdemo using the user dms_user created previously. Enter the password when prompted.
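For example, with the endpoint placeholder from the prerequisites:

psql -h pgsql-target.cluster-xx.<region>.rds.amazonaws.com -U dms_user -d fdwdemo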
- Instead of using CREATE FOREIGN TABLE one by one for each table required for migration, you can use the IMPORT FOREIGN SCHEMA command to create foreign tables in bulk based on the import options. In the following SQL commands, we exclude the view getNewID during foreign table creation and include column DEFAULT expressions in the definitions of the foreign tables:
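The following is a minimal sketch, assuming the source tables live in the SQL Server dbo schema and the foreign server is named sqlserver_src:

IMPORT FOREIGN SCHEMA dbo
EXCEPT (getNewID)
FROM SERVER sqlserver_src
INTO fdw
OPTIONS (import_default 'true');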
The output should look like the following:
- After you import the foreign tables, run the following command to list them in the schema fdw from the target database:
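One way is psql's \det meta-command, which lists foreign tables matching a pattern:

\det fdw.*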
- Optionally, run the following query to count the number of rows in the source database via the foreign data wrapper. The SELECT statement generates a SELECT COUNT(*) statement for each foreign table in the schema fdw, and the \gexec command runs the generated statements one by one:
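A sketch of this pattern:

SELECT format('SELECT %L AS table_name, count(*) FROM %I.%I',
              foreign_table_name, foreign_table_schema, foreign_table_name)
FROM information_schema.foreign_tables
WHERE foreign_table_schema = 'fdw';
\gexec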
Create target tables based on the source table definition
Run the following script to create target tables based on the foreign tables imported in the previous steps and populate the data after table creation using CREATE TABLE ... AS SELECT ... (CTAS) SQL syntax. Based on the size of the source and destination, it could take several minutes to create the tables and transfer the data.
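A minimal sketch of such a script, generating one CTAS statement per foreign table and running the generated statements with \gexec:

SELECT format('CREATE TABLE dms_sample.%I AS SELECT * FROM fdw.%I',
              foreign_table_name, foreign_table_name)
FROM information_schema.foreign_tables
WHERE foreign_table_schema = 'fdw';
\gexec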
If you have already created a target table using other methods, such as AWS SCT or Liquibase, generate the script with the syntax INSERT INTO .. SELECT .. FROM .. instead. For example, INSERT INTO dms_sample.ticket_purchase_hist SELECT * FROM fdw.ticket_purchase_hist.
The generated script and execution output are as follows, with some output omitted for readability.
Ignore the ERROR: invalid input syntax message for the SQL Server DATE data type at this stage. We explain the causes and share workarounds in the next section.
Handle DATE-related column issues caused by locale setting differences
As part of troubleshooting, let's examine the foreign table structure in PostgreSQL and the source table structure in SQL Server for the table sporting_event.
To display the foreign table column definition, run \d fdw.sporting_event in your psql terminal:
To check the source table definition, script the table definition for sporting_event using SQL Server Management Studio (SSMS) connected to the source SQL Server database. For instructions, refer to Script tables. The following SQL script is the output of scripting the table definition with SSMS:
To confirm the root cause of the ERROR: invalid input syntax error that occurred previously, query start_date alone. The same error should appear as expected:
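For example:

SELECT start_date FROM fdw.sporting_event LIMIT 1;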
As part of the foreign table access, data in the SQL Server DATE data type is first converted into TEXT format and later converted to the PostgreSQL DATE data type. The default date format of MM DD YYYY HH12:MI:SS:AM returned through the foreign server is not compatible with the PostgreSQL default setting, which is YYYY-MM-DD. A similar problem applies to SQL Server DATETIME2 columns as well.
Workaround 1: Create a view to map the source table data type from DATE to DATETIME
In this workaround, we convert the DATE data type to DATETIME, which maps to TIMESTAMP in PostgreSQL. To minimize the impact on the source database, we create a view for this change instead of making a direct modification in the source SQL Server database. Use SSMS to connect to the source SQL Server database dms_sample with the SQL login dms_user and run the following SQL to create the view:
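A minimal T-SQL sketch of the view; the column list is assumed from the AWS DMS sample database schema, with only start_date cast to DATETIME:

CREATE VIEW dbo.sporting_event_v
AS
SELECT id, sport_type_name, home_team_id, away_team_id, location_id,
       start_date_time,
       CAST(start_date AS DATETIME) AS start_date,  -- DATE mapped to DATETIME
       sold_out
FROM dbo.sporting_event;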
In the target PostgreSQL database, we import the view sporting_event_v as a new foreign table. The LIMIT TO clause is useful here to restrict importing to only this table.
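A minimal sketch, assuming the dbo source schema and the sqlserver_src foreign server used earlier:

IMPORT FOREIGN SCHEMA dbo
LIMIT TO (sporting_event_v)
FROM SERVER sqlserver_src
INTO fdw;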
The output should look like the following:
Let's create a target table without data based on fdw.sporting_event. It completes successfully because no actual data conversion is involved:
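For example, using the WITH NO DATA clause so no rows are fetched from the source:

CREATE TABLE dms_sample.sporting_event AS
SELECT * FROM fdw.sporting_event
WITH NO DATA;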
The output should look like the following:
Insert the data based on the sporting_event_v view and convert the data type from TIMESTAMP to DATE as part of the INSERT .. SELECT .. statement:
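A sketch of the statement; the column list is assumed from the AWS DMS sample database schema:

INSERT INTO dms_sample.sporting_event
SELECT id, sport_type_name, home_team_id, away_team_id, location_id,
       start_date_time,
       CAST(start_date AS DATE),  -- TIMESTAMP back to DATE
       sold_out
FROM fdw.sporting_event_v;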
The output should look like the following:
Workaround 2: Use the TEXT data type instead of DATE in the PostgreSQL foreign table
Because PostgreSQL can't convert the data into the DATE data type due to the improper format, we can use the TEXT data type instead.
Drop the start_date column and add it back using the TEXT data type:
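For example (note that re-adding the column moves it to the end of the foreign table definition; tds_fdw matches columns by name):

ALTER FOREIGN TABLE fdw.sporting_event DROP COLUMN start_date;
ALTER FOREIGN TABLE fdw.sporting_event ADD COLUMN start_date text;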
The output should look like the following:
Create the target table dms_sample.sporting_event based on the foreign table fdw.sporting_event, with the data type conversion from TEXT to DATE for the column start_date:
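A sketch using to_date() with the format string quoted earlier; the column list is assumed, and any table left over from workaround 1 is dropped first:

DROP TABLE IF EXISTS dms_sample.sporting_event;  -- remove the table if workaround 1 created it
CREATE TABLE dms_sample.sporting_event AS
SELECT id, sport_type_name, home_team_id, away_team_id, location_id,
       start_date_time,
       to_date(start_date, 'MM DD YYYY HH12:MI:SS:AM') AS start_date,  -- TEXT to DATE
       sold_out
FROM fdw.sporting_event;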
The output should look like the following:
Perform data validation
Data validation is critical to ensure that your data was migrated accurately from the source to the target. In this section, we demonstrate how to perform data validation by examining the row count and then an actual data comparison. Row count is faster, but may not be able to catch data value discrepancies.
The following is an example of performing data validation based on row count for the table sporting_event:
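For example:

SELECT (SELECT count(*) FROM fdw.sporting_event)        AS source_count,
       (SELECT count(*) FROM dms_sample.sporting_event) AS target_count;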
The output should look like the following:
The warning message Table definition mismatch: Foreign source has column named, but target table does not. Column will be ignored. is related to how the tds_fdw extension interprets the value in the COUNT function. If we perform COUNT on an existing column such as id instead of *, there won't be a warning message. However, COUNT(*) returns the count of all rows, whereas COUNT(id) only returns the count of rows for which id is NOT NULL.
Generate the row count comparison SQL statements and run them using \gexec in psql:
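A sketch of the generator query; it assumes every foreign table in fdw has a matching target table in dms_sample:

SELECT format('SELECT %1$L AS table_name,
               (SELECT count(*) FROM fdw.%1$I) AS source_count,
               (SELECT count(*) FROM dms_sample.%1$I) AS target_count',
              foreign_table_name)
FROM information_schema.foreign_tables
WHERE foreign_table_schema = 'fdw';
\gexec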
The following is an example of data comparison for the table sporting_event:
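One way to compare actual data is a two-way EXCEPT; an empty result means the rows match. This sketch assumes matching column order and types between the foreign and target tables (cast columns such as start_date where the definitions differ):

(SELECT * FROM fdw.sporting_event
 EXCEPT
 SELECT * FROM dms_sample.sporting_event)
UNION ALL
(SELECT * FROM dms_sample.sporting_event
 EXCEPT
 SELECT * FROM fdw.sporting_event);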
The output should look like the following:
Generate the data comparison statements and run them using \gexec in psql:
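A sketch of the generator, counting differing rows per table under the same assumptions as above:

SELECT format('SELECT %1$L AS table_name, count(*) AS diff_rows FROM
               ((SELECT * FROM fdw.%1$I EXCEPT SELECT * FROM dms_sample.%1$I)
                UNION ALL
                (SELECT * FROM dms_sample.%1$I EXCEPT SELECT * FROM fdw.%1$I)) d',
              foreign_table_name)
FROM information_schema.foreign_tables
WHERE foreign_table_schema = 'fdw';
\gexec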
Clean up intermediate objects used for migration
These foreign tables are no longer required after migration, so you can drop them programmatically:
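A sketch that generates one DROP FOREIGN TABLE statement per table and runs them with \gexec:

SELECT format('DROP FOREIGN TABLE %I.%I', foreign_table_schema, foreign_table_name)
FROM information_schema.foreign_tables
WHERE foreign_table_schema = 'fdw';
\gexec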
Similarly, drop the foreign server and related objects. The CASCADE keyword drops the user mapping for the foreign server as well:
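Run this as a user with the required privileges; sqlserver_src is the assumed server name from earlier:

DROP SERVER sqlserver_src CASCADE;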
Additional use case: Use an interim RDS for SQL Server database when direct connection is not feasible
Using tds_fdw to migrate data requires direct network connectivity between the target PostgreSQL database and the source SQL Server database. In real-world migration scenarios, there are use cases where direct network connectivity is not feasible. One workaround is to take a SQL Server native backup, upload the backup to an Amazon Simple Storage Service (Amazon S3) bucket, and then restore it to an interim SQL Server database. See Importing and exporting SQL Server databases using native backup and restore for steps to restore a native backup from on-premises SQL Server to Amazon RDS for SQL Server. The following diagram illustrates this architecture.
Clean up
To avoid ongoing costs, delete the resources that you created as part of this post, such as the RDS for SQL Server instance and the Aurora PostgreSQL cluster.
Conclusion
In this post, we showed you how to use the tds_fdw extension to migrate data from SQL Server to PostgreSQL, as well as how to address known issues related to the DATE data type in SQL Server. This approach works well for customers who have simple data migration requirements. In this case, you can migrate your data directly over the network via the tds_fdw extension without using AWS DMS.
If you have any comments or questions, leave them in the comment section.
About the Authors
Donghua Luo is a Senior RDS Specialist Solutions Architect. He works with AWS customers to migrate and achieve higher flexibility, scale, and resiliency with database services in the AWS Cloud.
Bill Baldwin is a Principal Specialist Solutions Architect at AWS. Bill has over 20 years of experience working with data. He is passionate about providing technical guidance to our global customers on their AWS database, analytics, and ML projects.
Venkataramanan Govindarajan is a Senior Database Specialist Solutions Architect at Amazon Web Services. He works with our customers to provide architecture guidance and database solutions, helping them innovate using AWS services to improve business value.