BUG #17846: pg_dump doesn't properly dump with paused WAL replay
От | PG Bug reporting form |
---|---|
Тема | BUG #17846: pg_dump doesn't properly dump with paused WAL replay |
Дата | |
Msg-id | 17846-1a0e5ce976f4c01a@postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #17846: pg_dump doesn't properly dump with paused WAL replay
|
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17846 Logged by: Francisco Reinolds Email address: francisco.reinolds@channable.com PostgreSQL version: 13.8 Operating system: Ubuntu 22.04.2 LTS Description: In our setup, we have primary and secondary databases, taking advantage of physical streaming replication to keep the secondary up to date, so that we can perform backups from it, lessening the load on the primary. For backups, we use pg_dump to perform a full database dump. Before we start a backup, we pause the WAL replay on the secondary, unpausing it after it is concluded. This was done since we previously encountered problems with pg_dump failing when an AccessExclusiveLock was held on a table that pg_dump was going to dump. For some time we faced no problems with this setup, but starting some months ago, we started witnessing sporadic failures when we attempted to restore the dumps of one of our databases, to verify the dump's integrity. These restore failures would occur due to a key not being present in a table: ``` pg_restore: while PROCESSING TOC: pg_restore: from TOC entry 3279; 2606 203326243 FK CONSTRAINT table_2 table_2_table_1_id_fkey realtime_alembic pg_restore: error: could not execute query: ERROR: insert or update on table "table_2" violates foreign key constraint "table_2_table_1_id_fkey" DETAIL: Key (table_1_id)=(34553066) is not present in table "table_1". Command was: ALTER TABLE ONLY public.table_2 ADD CONSTRAINT table_2_table_1_id_fkey FOREIGN KEY (table_1_id) REFERENCES public.table_1(id); ``` This has been puzzling us for some time since: - this is the only database of our infrastructure where this occurs, despite this setup being replicated across it; - the failures started occurring sporadically, but have become more consistent; We have managed, with some help from the Postgres IRC channel (special thanks to user nickb), to work around the problem. The solution was to begin a transaction, and extract a snapshot that'd be passed as a pg_dump argument, and only then pause WAL replay. From our understanding, pg_dump should already implicitly pick a suitable point to start the dump but it apparently is not the case, hence the bug report. Thanks for the attention, Francisco
В списке pgsql-bugs по дате отправления: