Обсуждение: MS SQL to Postgres
Hi,
I am in search of the best practices to migrate from an MS SQL database server to PostgreSQL 16
Existing DB server MSSQL with 6 Million records and 3.5 TB with 424 Tables running from 2019 onwards.
Each table has 5 to 16 columns ( basically text, numbers, lat long coordinates , time stamps, and images/voice file (stored in archive folders)reference links, etc. ).
I am in need to port / migrate all this data from this MS SQL server to Postgres16 .
1. What are the best methods and practices folks employ to do this kind of data porting operations?
2. what are the tools and techniques to explored / employed for this
3. How much time is consumed by employing the right tools, the entire porting of 6 million records of 3.5 TB size to Postgres 16 takes
4. Any hurdles or challenges or risks
Kindly enlighten me with the best practices and reference materials / links or tutorials to perform these operations successfully.
Thank you,
Krishane
On 8/20/25 10:25, KK CHN wrote: > Hi, > > I am in search of the best practices to migrate from an MS SQL database > server to PostgreSQL 16 > > Existing DB server MSSQL with 6 Million records and 3.5 TB with 424 > Tables running from 2019 onwards. > > Each table has 5 to 16 columns ( basically text, numbers, lat long > coordinates , time stamps, and images/voice file (stored in archive > folders)reference links, etc. ). > > I am in need to port / migrate all this data from this MS SQL server to > Postgres16 . > Up front, I have not moved data from MSSQL to Postgres. What follows will be generic. > 1. What are the best methods and practices folks employ to do this kind > of data porting operations? Planning and patience. It is a matter of drawing a map/diagram of where you are now(MSSQL) and where you want to end up(Postgres) with steps to get from A to B. > 2. what are the tools and techniques to explored / employed for this One that I know of: https://pgloader.io/ > 3. How much time is consumed by employing the right tools, the entire > porting of 6 million records of 3.5 TB size to Postgres 16 takes Unknowable at this point. I will say it depends on how many MSSQL specific features you use and whether there are Postgres direct equivalents or you whether you will need to do extensive modifications. > 4. Any hurdles or challenges or risks Changing database vendors. > > Kindly enlighten me with the best practices and reference materials / > links or tutorials to perform these operations successfully. > > Thank you, > Krishane > -- Adrian Klaver adrian.klaver@aklaver.com
> On 20 Aug 2025, at 19:25, KK CHN <kkchn.in@gmail.com> wrote: (…) > 4. Any hurdles or challenges or risks MS SQL defaults to case insensitive string comparisons, trimming trailing white-space. PostgreSQL defaults to case sensitive string comparisons, so incorrectly cased strings in queries that match in MS SQL willnot match in PostgreSQL. The trailing spaces bit is not going to matter while moving the data to Postgres, as you will not get any trailing spacesfrom MS SQL to be stored in PostgreSQL (they’ve been trimmed already, after all) - but it could trigger some applicationbugs where people have assumed that trailing spaces get trimmed. Also, time zone names are wildly different between the two. MS SQL uses Microsoft Windows time zone names, Postgres (andmost other RDBMSes) use IANA names. Alban Hertroys -- There is always an exception to always.
On Wed, Aug 20, 2025 at 4:15 PM Alban Hertroys <haramrae@gmail.com> wrote:
> On 20 Aug 2025, at 19:25, KK CHN <kkchn.in@gmail.com> wrote:
(…)
> 4. Any hurdles or challenges or risks
MS SQL defaults to case insensitive string comparisons, trimming trailing white-space.
PostgreSQL defaults to case sensitive string comparisons, so incorrectly cased strings in queries that match in MS SQL will not match in PostgreSQL.
The trailing spaces bit is not going to matter while moving the data to Postgres, as you will not get any trailing spaces from MS SQL to be stored in PostgreSQL (they’ve been trimmed already, after all) - but it could trigger some application bugs where people have assumed that trailing spaces get trimmed.
Also, time zone names are wildly different between the two. MS SQL uses Microsoft Windows time zone names, Postgres (and most other RDBMSes) use IANA names.
Alban Hertroys
--
There is always an exception to always.
Moving the data and schema are the easy part, it's all minor differences in the SQL implementation that bite big time.
CASE SENSITIVE vs CASE INSENSITIVE for searching
CASE SENSITIVE vs CASE INSENSITIVE for searching
Sessions/Connections can't jump databases in PostgreSQL have to create a new connection while in MSSQL if the user has permissions can connect to any database using fully qualified names database.schema.table. This is not possible in PostgreSQL there are workarounds using FDW, which is hackish.
PostgreSQL object names are case insensitive unless using double quotes. example MyTable == mytable to make case sensitive have to use double quote like so SELECT * FROM "MyTable"
Name of common functions differ LEN() == LENGTH() there are lots of these..
LIMIT OFFSET are completely different structure
How Transactions are handled you need to read up on PostgreSQL MVCC vs the MSSQL default transaction handling and Isolation level. MSSQL can be made to work like MVCC via SNAPSHOT isolation; it has to be turned on as its off by default.
Depending on how MSSQL is being used the locking behavior can be very different. Read up on pessimistic vs optimistic locking, PostgreSQL operates in optimistic locking mode by default, while MSSQL operates in a pessimistic locking mode by default.
PostgreSQL can not read rows/transactions that have NOT been committed, this is possible in MSSQL with "TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" PostgreSQL will ignore that command....
Postgresql operates in Implicit Transaction mode means every command is treated as a separate transaction unless it sees a BEGIN. While MSSQL does not operate that way it expects to see a BEGIN. MSSQL can automatically add the BEGIN using the SET IMPLICIT_TRANSACTIONS ON
There are a bunch of gotchas like this that are not found during testing unless you are looking for them..
Thank you
Justin
Hi Krishane,
On Wed, Aug 20, 2025 at 1:26 PM KK CHN <kkchn.in@gmail.com> wrote:
Hi,I am in search of the best practices to migrate from an MS SQL database server to PostgreSQL 16Existing DB server MSSQL with 6 Million records and 3.5 TB with 424 Tables running from 2019 onwards.
This is definitely not a problem, I have seen hundreds of migrations with more than 2k Tables and 10TB data from MSSQL 2019.
Each table has 5 to 16 columns ( basically text, numbers, lat long coordinates , time stamps, and images/voice file (stored in archive folders)reference links, etc. ).
This is not a problem either.
I am in need to port / migrate all this data from this MS SQL server to Postgres16 .1. What are the best methods and practices folks employ to do this kind of data porting operations?
You could use Open Source migration tools like: pgloader for schema migration (excluding any code objects like procedures or functions).
Or you could also use tools like HexaRocket: www.hexarocket.com
One more extension you could try is: tds_fdw, using which you could directly query your MSSQL database and load data to PostgreSQL, but be prepared to see some surprises.
2. what are the tools and techniques to explored / employed for this
Already answered in the previous question.
3. How much time is consumed by employing the right tools, the entire porting of 6 million records of 3.5 TB size to Postgres 16 takes
While there cannot always be a direct answer, I can talk about the tool:HexaRocket for some of such migrations.
It took around 12 Hours, but remember, this can be more or even lesser depending on your Infrastructure.
4. Any hurdles or challenges or risksKindly enlighten me with the best practices and reference materials / links or tutorials to perform these operations successfully.
There are several differences you need to be aware of between MSSQL and PostgreSQL.
- Start with the data type mapping to begin with.
- If PostGIS is enabled, use types like geometry, geography. Can use text for fallback support.
- PostgreSQL supports composite types, arrays at the column level natively, while SQL Server cannot.
- SQL Server often auto-generates constraint names, while PostgreSQL typically requires explicit
names. - There is a good amount of difference between Clustered Indexes in SQL Server vs PostgreSQL
- Spatial Indexes (Geometry/Geography) requires PostGIS extension in PostgreSQL.
- Using PostgreSQL's native IDENTITY feature instead of legacy SERIAL/BIGSERIAL, as IDENTITY matches SQL
Server's behavior with clear syntax. - In SQL Server, RANGE partitioning is the only natively supported partitioning method. But, during migration, partition boundaries must be carefully adjusted to match PostgreSQL's behavior.
- User Defined Table Types of SQL Server are migrated to PostgreSQL as composite types which can
encapsulate multiple columns under single type. - There is a much more bigger list for every category, so I will share with you a Slide deck from one of my talks on MSSQL to PostgreSQL.
On 8/20/25 15:52, Avinash Vallarapu wrote: > Hi Krishane, > > * SQL Server often auto-generates constraint names, while PostgreSQL > typically requires explicit > names. You will need to give more detail on above as: create table constraint_test (id integer primary key, fld_1 varchar, fld_2 varchar check (fld_2 != ''), UNIQUE(fld_1, fld_2)); constraint_test Table "public.constraint_test" Column | Type | Collation | Nullable | Default --------+-------------------+-----------+----------+--------- id | integer | | not null | fld_1 | character varying | | | fld_2 | character varying | | | Indexes: "constraint_test_pkey" PRIMARY KEY, btree (id) "constraint_test_fld_1_fld_2_key" UNIQUE CONSTRAINT, btree (fld_1, fld_2) Check constraints: "constraint_test_fld_2_check" CHECK (fld_2::text <> ''::text) > Regards, > Avinash Vallarapu > +1-902-221-5976 > www.hexarocket.com <http://www.hexarocket.com> -- Adrian Klaver adrian.klaver@aklaver.com