Discussion: The quickest way to migrate database within the same cluster
Hi PgSql experts,
I'm working on a task to move tables from one database to another within the same cluster, on the same server. I tried the pg_dump and pg_restore commands (with the -Fc option) and found that the process takes around half an hour, which is not acceptable.

I then tried moving the data files directly: I created the same tables in the new DB, found the directory that stores the data files, located each table's data file, and moved the corresponding file from the old DB to the new one. This is very quick, and the DB still appears to work. But I'm concerned: could there be any other problems if I do this?
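To be concrete, the dump/restore attempt above was roughly along these lines (the database names olddb and newdb are placeholders for illustration):

pg_dump -Fc olddb -f olddb.dump    # -t can restrict the dump to specific tables
pg_restore -d newdb olddb.dump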
Another workaround is to access the old DB from the new DB with dblink: I created a view using dblink and query it as if the table were in the new DB. But we soon noticed that even though we specify criteria in the SELECT command, dblink still retrieves all the records first and only then applies the search criteria. This gives poor performance. Do you have any better ideas for how I can handle this?
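The dblink view is defined roughly like this (a sketch; the connection string, view name, and column list are simplified placeholders):

psql -d newdb <<'SQL'
CREATE EXTENSION IF NOT EXISTS dblink;
-- dblink() returns SETOF record, so the column list must be spelled out
CREATE VIEW user_information_remote AS
SELECT * FROM dblink('dbname=olddb',
                     'SELECT id, name FROM user_information')
       AS t(id integer, name text);
SQL

Any WHERE clause applied to this view runs locally, only after dblink has already fetched every row from olddb.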
Thanks,
Wendy
On 15/01/04 15:52, Zheng, Wendy wrote:
> Hi PgSql experts,
>
> I'm working on a task to move tables from one database to another within
> the same cluster, on the same server. I tried the pg_dump and pg_restore
> commands (with the -Fc option) and found that the process takes around
> half an hour, which is not acceptable.
>
> I then tried moving the data files directly: I created the same tables
> in the new DB, found the directory that stores the data files, located
> each table's data file, and moved the corresponding file from the old DB
> to the new one. This is very quick, and the DB still appears to work.
> But I'm concerned: could there be any other problems if I do this?

Yes, you'll experience massive data corruption. You can't just manually
copy data files around, even though it might seem to work.

> Another workaround is to access the old DB from the new DB with dblink:
> I created a view using dblink and query it as if the table were in the
> new DB. But we soon noticed that even though we specify criteria in the
> SELECT command, dblink still retrieves all the records first and only
> then applies the search criteria. This gives poor performance. Do you
> have any better ideas for how I can handle this?

Which PostgreSQL version are you using? If 9.3 or later you can use a
foreign data wrapper (postgres_fdw) to access data in another database
(including on the same cluster).

Regards

Ian Barwick

--
 Ian Barwick                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
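For anyone following along, a minimal postgres_fdw setup on 9.3 or later would look roughly like this (the server name, credentials, and column definitions are illustrative placeholders):

psql -d newdb <<'SQL'
CREATE EXTENSION postgres_fdw;
CREATE SERVER olddb_srv FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (dbname 'olddb');
CREATE USER MAPPING FOR CURRENT_USER SERVER olddb_srv
       OPTIONS (user 'wendy', password 'secret');
-- the column list must match the remote table's definition
CREATE FOREIGN TABLE user_information_remote (
    id   integer,
    name text
) SERVER olddb_srv OPTIONS (table_name 'user_information');
SQL

Unlike a dblink view, postgres_fdw can push simple WHERE conditions down to the remote side.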
I'm using 9.1.

Will the foreign data wrapper have a similar performance issue as dblink?

Thanks,
Wendy

-----Original Message-----
From: Ian Barwick [mailto:ian@2ndquadrant.com]
Sent: Sunday, January 04, 2015 3:13 PM
To: Zheng, Wendy; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] The quickest way to migrate database within the same cluster

On 15/01/04 15:52, Zheng, Wendy wrote:
> Hi PgSql experts,
>
> I'm working on a task to move tables from one database to another within
> the same cluster, on the same server. [...]

Yes, you'll experience massive data corruption. You can't just manually
copy data files around, even though it might seem to work.

> Another workaround is to access the old DB from the new DB with dblink:
> [...] Do you have any better ideas for how I can handle this?

Which PostgreSQL version are you using? If 9.3 or later you can use a
foreign data wrapper (postgres_fdw) to access data in another database
(including on the same cluster).

Regards

Ian Barwick

--
 Ian Barwick                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
The tables have the same schema, but they are in different DBs. Because I created the tables in the new DB myself, tables of the same name in the two DBs have different filenodes too. For example, the data files of the old DB are under /pgsql/data/base/16384/, while the data files of the new DB are under /dbdata/uemcp_tblspc/PG_9.1_201105231/16481 (I created the new tables in a tablespace). The filenode for my table user_information is 34576 in the old DB and 45663 in the new DB. I then perform the following to swap the data files:
# swap the main data fork of user_information between the two databases
mv /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663 /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_bak
mv /pgsql/data/base/16384/34576 /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663
mv /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_bak /pgsql/data/base/16384/34576

# swap the free space map (_fsm) fork the same way
mv /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_fsm /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_fsm_bak
mv /pgsql/data/base/16384/34576_fsm /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_fsm
mv /dbdata/uemcp_tblspc/PG_9.1_201105231/16481/45663_fsm_bak /pgsql/data/base/16384/34576_fsm
I did that for all of my tables, and it looks like I can still query and insert records.
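For what it's worth, the on-disk path (and hence the filenode) of a table can be checked from SQL with pg_relation_filepath, e.g.:

psql -d olddb -c "SELECT pg_relation_filepath('user_information')"
psql -d newdb -c "SELECT pg_relation_filepath('user_information')"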
Thanks,
Wendy
From: Rosser Schwarz [mailto:rosser.schwarz@gmail.com]
Sent: Sunday, January 04, 2015 3:19 PM
To: Zheng, Wendy
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] The quickest way to migrate database within the same cluster
On Sat, Jan 3, 2015 at 10:52 PM, Zheng, Wendy <wendy.zheng@emc.com> wrote:
> I'm working on a task to move tables from one database to another within the same cluster, on the same server...
Your second idea, of copying the underlying disk files, probably won't work unless you're also copying all the relevant records from all of the system catalog tables (a non-trivial task, itself), and likely not even then. (For instance, if two tables in the different databases have the same 'relfilenode', you're going to overwrite that table in the second when you perform the copy, and because those tables probably have different structures, you'll then see "unexpected" behavior.)
You might instead try piping the output of a "COPY ($query) TO STDOUT" in a psql session against the source database into a "COPY table FROM STDIN" command in a psql session against the destination. That would filter on the source side, but would require you to have empty tables of the expected structure on the destination to receive the data.
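Such a pipeline might look something like this (table name and the filter are placeholders):

psql -d olddb -c "COPY (SELECT * FROM user_information WHERE id < 1000) TO STDOUT" \
  | psql -d newdb -c "COPY user_information FROM STDIN"

The WHERE clause is evaluated inside the source database, so only matching rows cross the pipe.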
rls
--
:wq
On 15/01/04 16:14, Zheng, Wendy wrote:
> On 15/01/04 15:52, Zheng, Wendy wrote:
>> Hi PgSql experts,
>>
>> I'm working on a task to move tables from one database to another
>> within the same cluster, on the same server. [...]
>
> Yes, you'll experience massive data corruption. You can't just manually
> copy data files around, even though it might seem to work.
>
>> Another workaround is to access the old DB from the new DB with dblink:
>> [...] Do you have any better ideas for how I can handle this?
>
> Which PostgreSQL version are you using? If 9.3 or later you can use a
> foreign data wrapper (postgres_fdw) to access data in another database
> (including on the same cluster).
>
> I'm using 9.1.
>
> Will the foreign data wrapper have a similar performance issue as dblink?

It won't be slower, and depending on your use case it should be more
efficient; however, you'll need 9.3 or later.

If your aim is to move tables between databases, you should be able to
use dblink to copy the data.

Regards

Ian Barwick

--
 Ian Barwick                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
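A dblink-based copy along those lines might look roughly like this (the column list and filter are placeholders; the target table must already exist in the destination DB):

psql -d newdb <<'SQL'
CREATE EXTENSION IF NOT EXISTS dblink;
-- the WHERE clause travels inside the query string handed to dblink,
-- so it is evaluated in olddb rather than after the rows arrive
INSERT INTO user_information
SELECT * FROM dblink('dbname=olddb',
                     'SELECT id, name FROM user_information WHERE id < 1000')
       AS t(id integer, name text);
SQL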