Re: [HACKERS] Logical replication existing data copy
From | Erik Rijkers |
---|---|
Subject | Re: [HACKERS] Logical replication existing data copy |
Date | |
Msg-id | 0a4418aff31920c92c1a446ad20d89f3@xs4all.nl |
In response to | Re: [HACKERS] Logical replication existing data copy (Petr Jelinek <petr.jelinek@2ndquadrant.com>) |
Responses | Re: [HACKERS] Logical replication existing data copy |
List | pgsql-hackers |
On 2017-02-25 00:40, Petr Jelinek wrote:

> 0001-Use-asynchronous-connect-API-in-libpqwalreceiver.patch
> 0002-Fix-after-trigger-execution-in-logical-replication.patch
> 0003-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION.patch
> snapbuild-v3-0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch
> snapbuild-v3-0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch
> snapbuild-v3-0003-Fix-xl_running_xacts-usage-in-snapshot-builder.patch
> snapbuild-v3-0004-Skip-unnecessary-snapshot-builds.patch
> 0001-Logical-replication-support-for-initial-data-copy-v6.patch

Here are some results. There is improvement, although it's not an unqualified success.

Several repeat runs of pgbench_derail2.sh, with different parameters for the number of clients, yielded an output file each. Those show that logrep is now pretty stable when there is only 1 client (pgbench -c 1). But it starts making mistakes with 4, 8, 16 clients. I'll just show a grep of the output files; I think it is self-explanatory:

Output files (lines counted with grep | sort | uniq -c):

-- out_20170225_0129.txt
    250 -- pgbench -c 1 -j 8 -T 10 -P 5 -n
    250 -- All is well.

-- out_20170225_0654.txt
     25 -- pgbench -c 4 -j 8 -T 10 -P 5 -n
     24 -- All is well.
      1 -- Not good, but breakingout of wait (waited more than 60s)

-- out_20170225_0711.txt
     25 -- pgbench -c 8 -j 8 -T 10 -P 5 -n
     23 -- All is well.
      2 -- Not good, but breakingout of wait (waited more than 60s)

-- out_20170225_0803.txt
     25 -- pgbench -c 16 -j 8 -T 10 -P 5 -n
     11 -- All is well.
     14 -- Not good, but breakingout of wait (waited more than 60s)

So, that says:

 1 client:   250x success, zero fail (250 not a typo, ran this overnight)
 4 clients:   24x success,  1 fail
 8 clients:   23x success,  2 fail
16 clients:   11x success, 14 fail

I want to repeat what I said a few emails back: problems seem to disappear when a short wait state is introduced (directly after the 'alter subscription sub1 enable' line) to give the logrep machinery time to 'settle'. It makes one think of a timing error somewhere (now don't ask me where..).

To show that, here is pgbench_derail2.sh output from a run that waited 10 seconds (INIT_WAIT in the script) as such a 'settle' period; it works faultlessly (with 16 clients):

-- out_20170225_0852.txt
     25 -- pgbench -c 16 -j 8 -T 10 -P 5 -n
     25 -- All is well.

QED.

(By the way, no hung sessions so far, so that's good)

thanks

Erik Rijkers
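
For anyone who wants to redo the tally without grep, here is a minimal Python sketch of the same counting, plus the failure rate per client count. The marker strings are copied from the grep output above; the function names `tally` and `failure_rate` are made up for illustration, and the assumption that each run leaves exactly one such marker line in the output file is mine, not something taken from pgbench_derail2.sh itself.

```python
from collections import Counter

# Marker lines as they appear in the grep output quoted in this message.
OK_MARKER = "-- All is well."
FAIL_MARKER = "-- Not good, but breakingout of wait"


def tally(lines):
    """Count successful and failed runs among the lines of one output file.

    Assumes (hypothetically) one marker line per run.
    """
    counts = Counter()
    for line in lines:
        text = line.strip()
        if text.startswith(OK_MARKER):
            counts["ok"] += 1
        elif text.startswith(FAIL_MARKER):
            counts["fail"] += 1
    return counts["ok"], counts["fail"]


def failure_rate(ok, fail):
    """Fraction of runs that failed; 0.0 when there were no runs at all."""
    total = ok + fail
    return fail / total if total else 0.0


# Counts reported in this message, keyed by pgbench client count (-c).
reported = {1: (250, 0), 4: (24, 1), 8: (23, 2), 16: (11, 14)}

for clients, (ok, fail) in reported.items():
    rate = failure_rate(ok, fail)
    print(f"{clients:>2} clients: {ok} ok, {fail} fail, {rate:.0%} failed")
```

To use it on a real output file, something like `tally(open("out_20170225_0803.txt"))` should return the (success, fail) pair for that file, provided the file contains the marker lines in the form shown above.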