Re: pg_dump versus hash partitioning

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: pg_dump versus hash partitioning
Дата
Msg-id 1602203.1678650412@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: pg_dump versus hash partitioning  (Julien Rouhaud <rjuju123@gmail.com>)
Ответы Re: pg_dump versus hash partitioning  (Justin Pryzby <pryzby@telsasoft.com>)
Re: pg_dump versus hash partitioning  (Julien Rouhaud <rjuju123@gmail.com>)
Список pgsql-hackers
Julien Rouhaud <rjuju123@gmail.com> writes:
> The BEGIN + TRUNCATE is only there to avoid generating WAL records just in case
> the wal_level is minimal.  I don't remember if that optimization still exists,
> but if yes we could avoid doing that if the server's wal_level is replica or
> higher?  That's not perfect but it would help in many cases.

After thinking about this, it seems like a better idea is to skip the
TRUNCATE if we're doing load-via-partition-root.  In that case it's
clearly a dangerous thing to do regardless of deadlock worries, since
it risks discarding previously-loaded data that came over from another
partition.  (IOW this seems like an independent, pre-existing bug in
load-via-partition-root mode.)

The trick is to detect in pg_restore whether pg_dump chose to do
load-via-partition-root.  If we have a COPY statement we can fairly
easily examine it to see if the target table is what we expect or
something else.  However, if the table was dumped as INSERT statements
it'd be far messier; the INSERTs are not readily accessible from the
code that needs to make the decision.

What I propose we do about that is further tweak things so that
load-via-partition-root forces dumping via COPY.  AFAIK the only
compelling use-case for dump-as-INSERTs is in transferring data
to a non-Postgres database, which is a context in which dumping
partitioned tables as such is pretty hopeless anyway.  (I wonder if
we should have some way to dump all the contents of a partitioned
table as if it were unpartitioned, to support such migration.)

An alternative could be to extend the archive TOC format to record
directly whether a given TABLE DATA object loads data via partition
root or normally.  Potentially we could do that without an archive
format break by defining te->defn for TABLE DATA to be empty for
normal dumps (as it is now) or something like "-- load via partition root"
for the load-via-partition-root case.  However, depending on examination
of the COPY command would already work for the vast majority of existing
archive files, so I feel like it might be the preferable choice.

Thoughts?

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Add LZ4 compression in pg_dump
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: pg_dump versus hash partitioning