Re: [HACKERS] Logical replication existing data copy
From | Erik Rijkers |
---|---|
Subject | Re: [HACKERS] Logical replication existing data copy |
Date | |
Msg-id | b0dbcb2a1066d6728cbf62e391e7edf4@xs4all.nl |
In response to | Re: [HACKERS] Logical replication existing data copy (Erik Rijkers <er@xs4all.nl>) |
Responses | Re: [HACKERS] Logical replication existing data copy |
List | pgsql-hackers |

On 2017-02-22 14:48, Erik Rijkers wrote:
> On 2017-02-22 13:03, Petr Jelinek wrote:
>
>> 0001-Skip-unnecessary-snapshot-builds.patch
>> 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch
>> 0003-Fix-xl_running_xacts-usage-in-snapshot-builder.patch
>> 0001-Use-asynchronous-connect-API-in-libpqwalreceiver.patch
>> 0002-Fix-after-trigger-execution-in-logical-replication.patch
>> 0003-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION.patch
>> 0001-Logical-replication-support-for-initial-data-copy-v5.patch
>
> It works well now, or at least my particular test case seems now
> solved.

Cried victory too early, I'm afraid.

The logical replication is now certainly much more stable, but there are still errors, just less often.

The rare 'hang' error that I mentioned a few emails back I have not yet encountered; I am beginning to trust that it is indeed solved.

But there is still sometimes incorrect replication. The symptoms are the ones I mentioned earlier:

- incorrect number of rows in one of (mostly) pgbench_accounts or
  pgbench_history. The numbers are always off by a very small amount,
  say less than 20, often only 1 row.

- incorrect content in one of pgbench_accounts or pgbench_history
  (detected via md5). Also mostly the two tables named above.

I sometimes see primary key violations on the replica. That should not be possible if I have understood the intent of logical replication correctly.

( ERROR: duplicate key value violates unique constraint "pgbench_tellers_pkey" )

mostly *_tellers, also seen *_branches

Understandably, the errors become more frequent with higher client counts: a 25x repeat with 1 client yielded only 1 failed run, whereas a 25x repeat with 16 clients gave 16 failures.

I attach once more the current incarnation of my test-bash pgbench runner, pgbench_derail2.sh. Easiest to run it yourself, I guess.

I also attach the output (of pgbench_derail2.sh) of those two 25x repeats:

  d2_scale__1_client__1_25x.txt
  d2_scale__1_client_16_25x.txt

I worry a bit about the correctness of that test program (pgbench_derail2.sh). I especially wonder if it should look around better at startup (e.g., at stuff left over from previous iterations). If you see any incorrect/dumb things there, or a better way to monitor (aka pre-flight checks), please let me know.

But the current state is certainly a big step forward -- I guess it's just your bad luck that I had the afternoon off ;)

thanks,

Erik Rijkers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
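
For readers who want to reproduce the kind of check described in the message (row counts first, then an md5 over the table contents), here is a minimal sketch in Python. It is not the attached pgbench_derail2.sh, whose contents are not shown here. The function names `table_fingerprint` and `compare_tables` and the helper `demo_db` are hypothetical, and the demo uses in-memory SQLite databases as stand-ins for the primary and the replica, with a cut-down pgbench_accounts table. Against PostgreSQL one would presumably pass connections from a DB-API driver instead, and for a table without a primary key, such as pgbench_history, the ordering would have to cover enough columns to be deterministic.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, order_by):
    """Return (row_count, md5_hex) for a table, reading rows in a fixed order.

    `table` and `order_by` are interpolated into the SQL text, so they must
    come from a trusted, hard-coded list (assumption of this sketch).
    """
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table} ORDER BY {order_by}")
    digest = hashlib.md5()
    count = 0
    for row in cur:
        # Hash a textual form of each row; both sides must use the same driver
        # so that identical data produces identical text.
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
        count += 1
    return count, digest.hexdigest()


def compare_tables(primary, replica, tables):
    """Return {table: description} for every table that differs."""
    problems = {}
    for table, order_by in tables.items():
        p_count, p_md5 = table_fingerprint(primary, table, order_by)
        r_count, r_md5 = table_fingerprint(replica, table, order_by)
        if p_count != r_count:
            problems[table] = f"row count {p_count} vs {r_count}"
        elif p_md5 != r_md5:
            problems[table] = "same row count, different content (md5)"
    return problems


def demo_db(rows):
    """Build an in-memory stand-in database (hypothetical helper for the demo)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pgbench_accounts (aid INTEGER PRIMARY KEY, abalance INTEGER)"
    )
    conn.executemany("INSERT INTO pgbench_accounts VALUES (?, ?)", rows)
    return conn


# The stand-in replica is missing one row, mimicking the off-by-one symptom.
primary = demo_db([(1, 10), (2, -5), (3, 0)])
replica = demo_db([(1, 10), (2, -5)])
print(compare_tables(primary, replica, {"pgbench_accounts": "aid"}))
# prints {'pgbench_accounts': 'row count 3 vs 2'}
```

Checking counts before checksums keeps the two symptoms apart: a missing or extra row is reported as a count mismatch, while an md5 mismatch is only reported when the counts agree, which points at wrong content rather than lost rows. Such a comparison is only meaningful once the replica has had time to catch up with the primary.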