Re: Make deparsing of column defaults faster

From: Justin Pryzby
Subject: Re: Make deparsing of column defaults faster
Date:
Msg-id: 20180705165827.GK7025@telsasoft.com
In reply to: Re: Make deparsing of column defaults faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses: Re: Make deparsing of column defaults faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List: pgsql-hackers
On Mon, Jun 04, 2018 at 10:00:53PM -0400, Peter Eisentraut wrote:
> On 6/4/18 20:55, Jeff Janes wrote:
> > Since defaults can't contain Vars, this patch converts the second
> > parameter to zero in places where pg_get_expr is invoked on column
> > defaults.
> 
> My in-progress generated columns patch removes that assumption (since a
> generated column generally refers to another column of the same table).

On Thu, Jul 05, 2018 at 04:45:07PM +0200, Peter Eisentraut wrote:
> On 29.06.18 05:15, Jeff Janes wrote:
> > Since pg_dump calls pg_get_expr once over and over again on the same
> > table consecutively, perhaps we could cache the column alias assignments
> > in a single-entry cache, so if it is called on the same table as last
> > time it just re-uses the aliases from last time.  I am not planning on
> 
> I looked into that.  deparse_context_for() is actually not that
> expensive on its own, well below one second, but it gets somewhat
> expensive when you call it 1600 times for one table.  So to address that
> case, we can cache the deparse context between calls in the fn_extra
> field of pg_get_expr.  The attached patch does that.  This makes the
> pg_dump -s times pretty much constant even with 1600 columns with
> defaults.
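
For illustration, here is a minimal sketch of the general fn_extra caching
pattern in a SQL-callable C function, the same mechanism the quoted patch uses
inside pg_get_expr (this is not the patch itself; my_get_expr and
ExprDeparseCache are hypothetical names).  State allocated in fn_mcxt survives
across calls within one query, so per-relation work such as
deparse_context_for() is done only once per relation:

    #include "postgres.h"
    #include "fmgr.h"
    #include "nodes/pg_list.h"
    #include "utils/lsyscache.h"
    #include "utils/ruleutils.h"

    PG_MODULE_MAGIC;

    /* hypothetical cache kept in fn_extra, one per call site */
    typedef struct ExprDeparseCache
    {
        Oid         relid;      /* relation the cached context was built for */
        List       *context;    /* result of deparse_context_for() */
    } ExprDeparseCache;

    PG_FUNCTION_INFO_V1(my_get_expr);

    Datum
    my_get_expr(PG_FUNCTION_ARGS)
    {
        Oid         relid = PG_GETARG_OID(1);
        ExprDeparseCache *cache = (ExprDeparseCache *) fcinfo->flinfo->fn_extra;

        if (cache == NULL || cache->relid != relid)
        {
            /*
             * First call, or a different relation than last time: (re)build
             * the deparse context in fn_mcxt so it survives across calls.
             * Assumes relid refers to an existing relation.
             */
            MemoryContext oldcxt = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);

            if (cache == NULL)
                cache = (ExprDeparseCache *) palloc(sizeof(ExprDeparseCache));
            cache->relid = relid;
            cache->context = deparse_context_for(get_rel_name(relid), relid);

            MemoryContextSwitchTo(oldcxt);
            fcinfo->flinfo->fn_extra = cache;
        }

        /* ... deparse the expression argument using cache->context ... */
        PG_RETURN_NULL();
    }

Since pg_dump emits its per-column pg_get_expr() calls for one table
consecutively, a single-entry cache like this keeps the repeated calls cheap.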

I checked on one customer database running PG10.4, for which pg_dump -s takes 8
minutes.

I imported the existing schema into PG12dev (which itself took 25 min) and compared:
patched: 2m33.616s
unpatched: 7m19.578s

Note that I've recently reduced the number of child tables in this DB (by
repartitioning tables from daily to monthly granularity), thereby reducing the
number of columns of the largest tables by a factor of 30 and shrinking the
pg_dump -s output to 51MB from 120MB (6 months ago).  I expect this patch
would've saved even more before that cleanup.

> How realistic is this use case?  Is it worth it?

Note that this affects pg_upgrade, which is how this issue originally came up [0].
(But I believe pg_upgrade runs pg_dump against the old server version, so
pg_upgrade to v11 couldn't benefit unless this were included in PG10.5.)

[pryzbyj@database postgresql]$ grep -c 'SET DEFAULT' /srv/cdrperfbackup/ts/2018-07-04/pg_dump-section=pre-data
183915

Justin

[0] https://www.postgresql.org/message-id/CAMkU%3D1x-e%2BmaqefhM1yMeSiJ8J9Z%2BSJHgW7c9bqo3E3JMG4iJA%40mail.gmail.com

