Обсуждение: Re: Re: Refresh Publication takes hours and doesn´t finish
[ redirecting to pgsql-hackers as the more relevant list ] I wrote: > PegoraroF10 <marcos@f10.com.br> writes: >> I tried sometime ago ... but with no responses, I ask you again. >> pg_publication_tables is a view that is used to refresh publication, but as >> we have 15.000 tables, it takes hours and doesn't complete. If I change that >> view I can have an immediate result. The question is: Can I change that view >> ? There is some trouble changing those system views ? > Hmm ... given that pg_get_publication_tables() shouldn't return any > duplicate OIDs, it does seem unnecessarily inefficient to put it in > an IN-subselect condition. Peter, is there a reason why this isn't > a straight lateral join? I get a much saner-looking plan from > FROM pg_publication P, pg_class C > - JOIN pg_namespace N ON (N.oid = C.relnamespace) > - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); > + JOIN pg_namespace N ON (N.oid = C.relnamespace), > + LATERAL pg_get_publication_tables(P.pubname) > + WHERE C.oid = pg_get_publication_tables.relid; For the record, the attached seems like what to do here. It's easy to show that there's a big performance gain even for normal numbers of tables, eg if you do CREATE PUBLICATION mypub FOR ALL TABLES; SELECT * FROM pg_publication_tables; in the regression database, the time for the select drops from ~360ms to ~6ms on my machine. The existing view's performance will drop as O(N^2) the more publishable tables you have ... Given that this change impacts the regression test results, project rules say that it should come with a catversion bump. Since we are certainly going to have a catversion bump before beta2 because of the pg_statistic_ext permissions business, that doesn't seem like a reason not to push it into v12 --- any objections? regards, tom lane diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 566100d..52a6c31 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -258,9 +258,10 @@ CREATE VIEW pg_publication_tables AS P.pubname AS pubname, N.nspname AS schemaname, C.relname AS tablename - FROM pg_publication P, pg_class C - JOIN pg_namespace N ON (N.oid = C.relnamespace) - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); + FROM pg_publication P, + LATERAL pg_get_publication_tables(P.pubname) GPT, + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace) + WHERE C.oid = GPT.relid; CREATE VIEW pg_locks AS SELECT * FROM pg_lock_status() AS L; diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 0c392e5..4363ca1 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1441,10 +1441,10 @@ pg_publication_tables| SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename FROM pg_publication p, + LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid), (pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace))) - WHERE (c.oid IN ( SELECT pg_get_publication_tables.relid - FROM pg_get_publication_tables((p.pubname)::text) pg_get_publication_tables(relid))); + WHERE (c.oid = gpt.relid); pg_replication_origin_status| SELECT pg_show_replication_origin_status.local_id, pg_show_replication_origin_status.external_id, pg_show_replication_origin_status.remote_lsn,
On Tue, May 21, 2019 at 4:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> [ redirecting to pgsql-hackers as the more relevant list ]
>
> I wrote:
> > PegoraroF10 <marcos@f10.com.br> writes:
> >> I tried sometime ago ... but with no responses, I ask you again.
> >> pg_publication_tables is a view that is used to refresh publication, but as
> >> we have 15.000 tables, it takes hours and doesn't complete. If I change that
> >> view I can have an immediate result. The question is: Can I change that view
> >> ? There is some trouble changing those system views ?
>
> > Hmm ... given that pg_get_publication_tables() shouldn't return any
> > duplicate OIDs, it does seem unnecessarily inefficient to put it in
> > an IN-subselect condition. Peter, is there a reason why this isn't
> > a straight lateral join? I get a much saner-looking plan from
>
> > FROM pg_publication P, pg_class C
> > - JOIN pg_namespace N ON (N.oid = C.relnamespace)
> > - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
> > + JOIN pg_namespace N ON (N.oid = C.relnamespace),
> > + LATERAL pg_get_publication_tables(P.pubname)
> > + WHERE C.oid = pg_get_publication_tables.relid;
>
> For the record, the attached seems like what to do here. It's easy
> to show that there's a big performance gain even for normal numbers
> of tables, eg if you do
>
> CREATE PUBLICATION mypub FOR ALL TABLES;
> SELECT * FROM pg_publication_tables;
>
> in the regression database, the time for the select drops from ~360ms
> to ~6ms on my machine. The existing view's performance will drop as
> O(N^2) the more publishable tables you have ...
>
> Given that this change impacts the regression test results, project
> rules say that it should come with a catversion bump. Since we are
> certainly going to have a catversion bump before beta2 because of
> the pg_statistic_ext permissions business, that doesn't seem like
> a reason not to push it into v12 --- any objections?
>
I completely agree to push it into v12.
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento