Обсуждение: NOTIFY and pg_notify performance when deduplicating notifications
Hi, Back in 2016 a patch was proposed to fix the O(N^2) performance on transactions that generate many notifications. The performanceissue is caused by the check for duplicate notifications. I have a feature built around LISTEN / NOTIFY that works perfectly well, except for the enormous performance impact to transactionsthat emit large numbers of notifications. It’s very hard to work around the problem on the application side andtransactions that could take just a few seconds end up taking over 30 minutes. The patch that was proposed was nearly finalized, but ended up being abandoned. The latest patch is here: https://www.postgresql.org/message-id/CAP_rwwmKjO_p3kYB4jYceqcvcyRYjBQdji1GSCyqvLK%3D5nZzWQ%40mail.gmail.com. I understand that the only work left to be done on the patch was to address comments made on the proposed syntax. I’m attachingan updated patch that changes the syntax to allow for a variable number of modes. The new syntax would be NOTIFYchannel [ , payload [ , collapse_mode ] ] ; where collapse_mode can be 'never' or 'maybe'. I hope this patch can be reviewed and included in PostgreSQL. Best regards. -- Julien Demoor
Вложения
On Tue, Oct 2, 2018 at 7:20 PM <julien@jdemoor.com> wrote: > I have a feature built around LISTEN / NOTIFY that works perfectly well, except for the enormous performance impact totransactions that emit large numbers of notifications. Indeed, I have the same and am very interested in this. > I hope this patch can be reviewed and included in PostgreSQL. I added this to the next Commitfest and added myself as a reviewer. Will try to a review beginning of next week. https://commitfest.postgresql.org/20/1820/
> Indeed, I have the same and am very interested in this. > > > I hope this patch can be reviewed and included in PostgreSQL. > > I added this to the next Commitfest and added myself as a reviewer. > Will try to a review beginning of next week. > https://commitfest.postgresql.org/20/1820/ Thank you for reviewing. I just caught an error in my patch, it's fixed in the attachment. The 'never' and 'maybe' collapse modes were mixed up inone location. I can't find a reasonable way to build a regression test that checks whether notifications are effectively deduplicated.The output of the LISTEN command lists the PID of the notifying backend for each notification, e.g. : 'Asynchronousnotification "foobar" received from server process with PID 24917'. I can't just add this to async.out. I didtest manually for all eight combinations : four collapse mode values (missing, empty string, 'maybe' and 'never'), bothwith NOTIFY and pg_notify().
Вложения
On Tue, Oct 9, 2018 at 2:17 PM <julien@jdemoor.com> wrote: > I just caught an error in my patch, it's fixed in the attachment. The 'never' and 'maybe' collapse modes were mixed upin one location. Here's a partial review of this version, did not read the doc part very carefully. First of all, I agree that this is a desirable feature as, for a large number of notiifications, the O(n^2) overhead quickly becomes very noticeable. I would expect the collapse mode to be an enum which is created from the string early on during parsing and used for the rest of the code. Instead the string is used all the way leading to string comparisons in the notification dispatcher and to the need of hardcoding special strings in various places, including the contrib module. This comment in the beginning of async.c should also be updated: * Duplicate notifications from the same transaction are sent out as one * notification only. This is done to save work when for example a trigger pg_notify_3args duplicates pg_notify, I would expect a helper function to be extracted and called from both. There are braces placed on the same line as the if, for example if (strlen(collapse_mode) != 0) { which seems to not be the project's style. > > I can't find a reasonable way to build a regression test that checks whether notifications are effectively deduplicated.The output of the LISTEN command lists the PID of the notifying backend for each notification, e.g. : 'Asynchronousnotification "foobar" received from server process with PID 24917'. I can't just add this to async.out. I didtest manually for all eight combinations : four collapse mode values (missing, empty string, 'maybe' and 'never'), bothwith NOTIFY and pg_notify(). One way could be to take inspiration from src/test/isolation/specs/async-notify.spec and check that pg_notification_queue_usage() does grow when repeating the same payload with collapse_mode='never' (while for always it would grow). But I'm not sure it's worth the effort.
On Wed, Oct 10, 2018 at 5:42 PM Catalin Iacob <iacobcatalin@gmail.com> wrote: > One way could be to take inspiration from > src/test/isolation/specs/async-notify.spec and check that > pg_notification_queue_usage() does grow when repeating the same > payload with collapse_mode='never' (while for always it would grow). Sorry, the last part should be "(while for *maybe* it would *not* grow)".
On 10/10/2018 19:42, Catalin Iacob wrote: > On Tue, Oct 9, 2018 at 2:17 PM <julien@jdemoor.com> wrote: >> I just caught an error in my patch, it's fixed in the attachment. The >> 'never' and 'maybe' collapse modes were mixed up in one location. > > Here's a partial review of this version, did not read the doc part > very carefully. > > First of all, I agree that this is a desirable feature as, for a large > number of notiifications, the O(n^2) overhead quickly becomes very > noticeable. > > I would expect the collapse mode to be an enum which is created from > the string early on during parsing and used for the rest of the code. > Instead the string is used all the way leading to string comparisons > in the notification dispatcher and to the need of hardcoding special > strings in various places, including the contrib module. > > This comment in the beginning of async.c should also be updated: > * Duplicate notifications from the same transaction are sent out as one > * notification only. This is done to save work when for example a trigger > > pg_notify_3args duplicates pg_notify, I would expect a helper function > to be extracted and called from both. > > There are braces placed on the same line as the if, for example if > (strlen(collapse_mode) != 0) { which seems to not be the project's > style. Thank you for the review. I've addressed all your points in the attached patch. The patch was made against release 11.1. I couldn't find a way to make a good helper function for pg_notify_3args and pg_notify, I hope my proposed solution is acceptable.
Вложения
> On Mon, Nov 19, 2018 at 8:30 AM Julien Demoor <julien@jdemoor.com> wrote: > > Thank you for the review. I've addressed all your points in the attached > patch. The patch was made against release 11.1. I've noticed, that cfbot complains about this patch [1], since: Duplicate OIDs detected: 3423 found 1 duplicate OID(s) in catalog data At the same time it's quite minor problem, so I'm moving it to the nexc CF as "Needs review". Also since it's performance related patch, and the latest tests I see in the history of this topic were posted around the time of the initial thread (which was quite some time ago), could we expect to see some new benchmarks with this patch and without? Probably the positive difference would be obvious, but still. [1]: https://travis-ci.org/postgresql-cfbot/postgresql/builds/461617098
Is there a particular format that is needed for the benchmark? Here's a quick benchmark below. Unpatched, generating N notifications takes t milliseconds: N= 10000, t= 348 N= 20000, t=1419 N= 30000, t=3253 N= 40000, t=6170 Patched, with the 'never' collapse mode, on another (much faster) machine: N= 10000, t= 6 N= 20000, t= 32 N= 30000, t= 11 N= 40000, t= 14 N=100000, t= 37 N=200000, t= 86 The benchmark shows that the time to send notifications grows quadratically with the number of notifications per transaction without the patch while it grows linearly with the patch applied and notification collapsing disabled. The actual psql output is below. -- Unpatched jdemoor=# listen my_channel; LISTEN Time: 0.100 ms jdemoor=# jdemoor=# jdemoor=# begin; BEGIN Time: 0.034 ms jdemoor=# select count(*) from (select pg_notify('my_channel', g::text) from generate_series(1, 10000) g) x; count ------- 10000 (1 row) Time: 348.214 ms jdemoor=# rollback; ROLLBACK Time: 0.054 ms jdemoor=# jdemoor=# begin; BEGIN Time: 0.050 ms jdemoor=# select count(*) from (select pg_notify('my_channel', g::text) from generate_series(1, 20000) g) x; count ------- 20000 (1 row) Time: 1419.723 ms (00:01.420) jdemoor=# rollback; ROLLBACK Time: 0.062 ms jdemoor=# jdemoor=# begin; BEGIN Time: 0.020 ms jdemoor=# select count(*) from (select pg_notify('my_channel', g::text) from generate_series(1, 30000) g) x; count ------- 30000 (1 row) Time: 3253.845 ms (00:03.254) jdemoor=# rollback; ROLLBACK Time: 0.064 ms jdemoor=# jdemoor=# begin; BEGIN Time: 0.020 ms jdemoor=# select count(*) from (select pg_notify('my_channel', g::text) from generate_series(1, 40000) g) x; count ------- 40000 (1 row) Time: 6170.646 ms (00:06.171) jdemoor=# rollback; ROLLBACK Time: 0.063 ms jdemoor=# -- Patched postgres=# listen my_channel; LISTEN Time: 0.164 ms postgres=# postgres=# postgres=# begin; BEGIN Time: 0.099 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 10000) g) x; count ------- 10000 (1 row) Time: 6.092 ms postgres=# rollback; ROLLBACK Time: 0.112 ms postgres=# postgres=# begin; BEGIN Time: 0.032 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 20000) g) x; count ------- 20000 (1 row) Time: 7.378 ms postgres=# rollback; ROLLBACK Time: 0.258 ms postgres=# postgres=# begin; BEGIN Time: 0.070 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 30000) g) x; count ------- 30000 (1 row) Time: 11.782 ms postgres=# rollback; ROLLBACK Time: 0.256 ms postgres=# postgres=# begin; BEGIN Time: 0.073 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 40000) g) x; count ------- 40000 (1 row) Time: 14.269 ms postgres=# rollback; ROLLBACK Time: 0.144 ms postgres=# begin; BEGIN Time: 0.204 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 100000) g) x; count -------- 100000 (1 row) Time: 37.199 ms postgres=# rollback; ROLLBACK Time: 0.864 ms postgres=# postgres=# postgres=# begin; BEGIN Time: 0.126 ms postgres=# select count(*) from (select pg_notify('my_channel', g::text, 'never') from generate_series(1, 200000) g) x; count -------- 200000 (1 row) Time: 86.477 ms postgres=# rollback; ROLLBACK Time: 1.292 ms On 01/12/2018 02:35, Dmitry Dolgov wrote: >> On Mon, Nov 19, 2018 at 8:30 AM Julien Demoor <julien@jdemoor.com> wrote: >> >> Thank you for the review. I've addressed all your points in the attached >> patch. The patch was made against release 11.1. > > I've noticed, that cfbot complains about this patch [1], since: > > Duplicate OIDs detected: > 3423 > found 1 duplicate OID(s) in catalog data > > At the same time it's quite minor problem, so I'm moving it to the nexc CF > as > "Needs review". > > Also since it's performance related patch, and the latest tests I see in the > history of this topic were posted around the time of the initial thread > (which > was quite some time ago), could we expect to see some new benchmarks with > this > patch and without? Probably the positive difference would be obvious, but > still. > > [1]: https://travis-ci.org/postgresql-cfbot/postgresql/builds/461617098 >
Julien Demoor <julien@jdemoor.com> writes: > [ postgres-notify-all-v8.patch ] I took a quick look through this. A few comments: * I find the proposed syntax extension for NOTIFY to be fairly bizarre; it's unlike the way that we handle options for any other utility statement. It's also non-orthogonal, since you can't specify a collapse mode without giving a payload string. I think possibly a better answer is the way that we've been adding optional parameters to VACUUM and suchlike recently: NOTIFY [ (collapse = off/on) ] channel [ , payload ] This'd be more extensible if we ever find a need for other options, too. * I'm also unimpressed with the idea of having a "maybe" collapse mode. What's that supposed to do? It doesn't appear to be different from "always", so why not just reduce this to a boolean collapse-or-not flag? * The documentation doesn't agree with the code, since it fails to mention "always" mode. * I was kind of disappointed to find that the patch doesn't really do anything to fix the performance problem for duplicate notifies. The thread title had led me to hope for more ;-). I wonder if we couldn't do something involving hashing. OTOH, it's certainly arguable that that would be an independent patch, and that this one should be seen as a feature patch ("make NOTIFY's behavior for duplicate notifies more flexible and more clearly specified") rather than a performance patch. regards, tom lane
Hi, On 2019-01-10 15:56:18 -0500, Tom Lane wrote: > Julien Demoor <julien@jdemoor.com> writes: > > [ postgres-notify-all-v8.patch ] > > I took a quick look through this. A few comments: > > * I find the proposed syntax extension for NOTIFY to be fairly bizarre; > it's unlike the way that we handle options for any other utility > statement. It's also non-orthogonal, since you can't specify a collapse > mode without giving a payload string. I think possibly a better answer > is the way that we've been adding optional parameters to VACUUM and > suchlike recently: > > NOTIFY [ (collapse = off/on) ] channel [ , payload ] > > This'd be more extensible if we ever find a need for other options, > too. > > * I'm also unimpressed with the idea of having a "maybe" collapse mode. > What's that supposed to do? It doesn't appear to be different from > "always", so why not just reduce this to a boolean collapse-or-not > flag? > > * The documentation doesn't agree with the code, since it fails to > mention "always" mode. > > * I was kind of disappointed to find that the patch doesn't really > do anything to fix the performance problem for duplicate notifies. > The thread title had led me to hope for more ;-). I wonder if we > couldn't do something involving hashing. OTOH, it's certainly > arguable that that would be an independent patch, and that this > one should be seen as a feature patch ("make NOTIFY's behavior > for duplicate notifies more flexible and more clearly specified") > rather than a performance patch. Given there's been no movement since this review, I'm marking this patch as returned with feedback. Please resubmit once updated. Greetings, Andres Freund