Discussion: pg_restore failing with "ERROR: out of memory"
I’m attempting to do something that should be a trivially simple task. I want to do a data-only dump of my production data in the public schema and restore it on another machine. Both machines are running PostgreSQL 8.2.5; they are virtually identical and each has 16GB of RAM.
I created an archive with the following command:
pg_dump -v -Fc -a -n public -f 20080318-data-archive
The archive is roughly 2GB. An uncompressed SQL dump is ~9GB.
...and am trying to restore it with:
pg_restore -av -d m16test 20080318-data-archive
I have a copy of the public schema loaded prior to doing the pg_restore.
Several small tables import fine, but when it gets to answerselectinstance, a table with > 107 million rows, it chokes:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3709; 0 1170915411 TABLE DATA answerselectinstance bzzagent
pg_restore: [archiver (db)] COPY failed: ERROR: out of memory
DETAIL: Failed on request of size 32.
CONTEXT: COPY answerselectinstance, line 61944061: "180097 63 0"
I have also tried running a straight data-only SQL file through psql and get a similar error.
The answerselectinstance table is defined as follows:
Table "public.answerselectinstance"
Column | Type | Modifiers
----------------+---------+--------------------
memberid | integer | not null
answerid | integer | not null
taskinstanceid | integer | not null default 0
Indexes:
"answerselectinstance_pkey" PRIMARY KEY, btree (memberid, answerid, taskinstanceid)
"asi_answerid_idx" btree (answerid)
"asi_memberid_idx" btree (memberid)
"asi_taskinstanceid_idx" btree (taskinstanceid)
Triggers:
"RI_ConstraintTrigger_1170917112" AFTER INSERT OR UPDATE ON answerselectinstance FROM answer NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('<unnamed>', 'answerselectinstance', 'answer', 'UNSPECIFIED', 'answerid', 'id')
_bzz_cluster_logtrigger_301 AFTER INSERT OR DELETE OR UPDATE ON answerselectinstance FOR EACH ROW EXECUTE PROCEDURE _bzz_cluster.logtrigger('_bzz_cluster', '301', 'kkk')
As you can see, this table is part of a Slony cluster. I created the schema using slony1_extract_schema.sh and am importing the data into a standalone server.
Our DBA solved this problem a few months ago, but he recently left the company and did not document the procedure. I’m no PostgreSQL expert...
Let me know what other information would be helpful.
Thanks,
Aaron
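For reference, the triggers that will fire for every row the COPY inserts into answerselectinstance can also be listed straight from the system catalogs; on 8.2 the foreign key shows up there as an old-style RI_ConstraintTrigger entry. A minimal catalog query, shown only for illustration (it is not from the original post):

-- List every trigger attached to the problem table; each of these
-- fires once per row during the data-only restore.
SELECT tgname
FROM pg_trigger
WHERE tgrelid = 'public.answerselectinstance'::regclass;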
Aaron Brown <abrown@bzzagent.com> writes:
> I’m attempting to do something that should be a trivially simple task. I
> want to do a data-only dump of my production data in the public schema and
> restore it on another machine.
Does it really need to be data-only? A regular schema+data dump usually
restores a lot faster.
Your immediate problem is probably that it's running out of memory for
pending foreign-key triggers. Even if it didn't run out of memory, the
ensuing one-tuple-at-a-time checks would take forever. You'd be better
off dropping the FK constraint, loading the data, and re-creating the
constraint.
There's further discussion of bulk-loading tricks in the manual:
http://www.postgresql.org/docs/8.2/static/populate.html
regards, tom lane
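To make that concrete, here is a rough sketch of the drop-load-recreate sequence (not from the thread): the trigger name comes from the \d output above, the new constraint name is made up, and the maintenance_work_mem bump is one of the bulk-loading tips on the populate page Tom links.

-- Before loading: drop the FK machinery so COPY doesn't queue a pending
-- check event for every row. On this 8.2 schema the foreign key exists
-- as an old-style constraint trigger, so DROP TRIGGER removes it.
DROP TRIGGER "RI_ConstraintTrigger_1170917112" ON answerselectinstance;

-- ... run the data-only pg_restore / COPY here ...

-- After loading: give the validation scan and index builds more memory,
-- then re-create the FK as a normal named constraint. Column and
-- referenced table are taken from the RI_FKey_check_ins arguments;
-- the constraint name is hypothetical.
SET maintenance_work_mem = '512MB';
ALTER TABLE answerselectinstance
  ADD CONSTRAINT answerselectinstance_answerid_fkey
  FOREIGN KEY (answerid) REFERENCES answer (id);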
Yes, it kind of needs to be data only since I am pulling from a slonized database. My experience has been if you don’t load the schema first with a schema from slony1_extract_schema.sh, we end up with all the slony triggers and crap in the dump. If there is a better way of doing this, I’m definitely all ears.
Aaron
On 3/19/08 3:06 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
-------------------------------------------------------
Aaron Brown, Systems Engineer
BzzAgent, Inc. | www.bzzagent.com
abrown@bzzagent.com | 617.451.2280
-------------------------------------------------------
Aaron Brown <abrown@bzzagent.com> writes:
> Yes, it kind of needs to be data only since I am pulling from a slonized
> database. My experience has been if you don’t load the schema first with a
> schema from slony1_extract_schema.sh, we end up with all the slony triggers
> and crap in the dump. If there is a better way of doing this, I’m
> definitely all ears.
Well, you'd need to take that up on the Slony lists, but I'd say that
Slony definitely ought to provide a better export mechanism than forcing
you to use a standard data-only dump. At minimum they should teach that
script to emit two separate files: one to load before the data and one
after. Building the indexes and FK constraints on already-loaded data
is a heck of a lot faster.
regards, tom lane
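For illustration, the pre/post split Tom describes could look roughly like this for the table above (a sketch, not anything slony1_extract_schema.sh actually produces; the table and index names come from the \d output, the FK constraint name is invented):

-- pre-load.sql: bare table only, so the 107M-row COPY has no indexes
-- to maintain and no FK checks to queue
CREATE TABLE answerselectinstance (
    memberid       integer NOT NULL,
    answerid       integer NOT NULL,
    taskinstanceid integer NOT NULL DEFAULT 0
);

-- post-load.sql: build indexes and constraints against the loaded data
ALTER TABLE answerselectinstance
  ADD CONSTRAINT answerselectinstance_pkey
  PRIMARY KEY (memberid, answerid, taskinstanceid);
CREATE INDEX asi_answerid_idx ON answerselectinstance (answerid);
CREATE INDEX asi_memberid_idx ON answerselectinstance (memberid);
CREATE INDEX asi_taskinstanceid_idx ON answerselectinstance (taskinstanceid);
ALTER TABLE answerselectinstance
  ADD CONSTRAINT answerselectinstance_answerid_fkey
  FOREIGN KEY (answerid) REFERENCES answer (id);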
I agree that they “should”. Unfortunately, nothing about slony is simple, in my experience.
On 3/19/08 3:17 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
-------------------------------------------------------
Aaron Brown, Systems Engineer
BzzAgent, Inc. | www.bzzagent.com
abrown@bzzagent.com | 617.451.2280
-------------------------------------------------------