Re: Warn when parallel restoring a custom dump without data offsets
От | David Gilman |
---|---|
Тема | Re: Warn when parallel restoring a custom dump without data offsets |
Дата | |
Msg-id | CALBH9DBYLxYxA7-=CZNa0K_Sg4wCv=RHFJDxOcSW8cZB5orezQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Warn when parallel restoring a custom dump without data offsets (David Gilman <davidgilman1@gmail.com>) |
Ответы |
Re: Warn when parallel restoring a custom dump without data offsets
|
Список | pgsql-hackers |
I've rounded this patch out with a test and I've set up the commitfest website for this thread. The latest patches are attached and I think they are ready for review. On Wed, May 20, 2020 at 11:05 PM David Gilman <davidgilman1@gmail.com> wrote: > > I did some more digging. To keep everyone on the same page there are > four different ways to order TOCs: > > 1. topological order, > 2. dataLength order, size of the table, is always zero when pg_dump can't seek, > 3. dumpId order, which should be thought as random but roughly > correlates to topological order to make things fun, > 4. file order, the order that tables are physically stored in the > custom dump file. > > Without being able to seek backwards a parallel restore of the custom > dump archive format has to be ordered by #1 and #4. The reference > counting that reduce_dependencies does inside of the parallel restore > logic upholds ordering #1. Unfortunately, 548e50976ce changed > TocEntrySizeCompare (which is used to break ties within #1) to order > by #2, then by #3. This most often breaks on dumps written by pg_dump > without seeks (everything has a dataLength of zero) as it then falls > back to #3 ordering every time. But, because nothing in pg_restore > does any ordering by #4 you could potentially run into this with any > custom dump so I think it's a regression. > > For some troubleshooting I changed ready_list_sort to never call > qsort. This fixes the problem by never ordering by #3, leaving things > in #4 order, but breaks the new algorithm introduced in 548e50976ce. > > I did what Justin suggested earlier and it works great. Parallel > restore requires seekable input (enforced elsewhere) so everyone's > parallel restores should work again. > > On Wed, May 20, 2020 at 10:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > David Gilman <davidgilman1@gmail.com> writes: > > >> I think the PG11 > > >> commit you mentioned (548e5097) happens to make some databases fail in > > >> parallel restore that previously worked (I didn't check). > > > > > Correct, if you do the bisect around that yourself you'll see > > > pg_restore start failing with the expected "possibly due to > > > out-of-order restore request" on offset-less dumps. > > > > Yeah. Now, the whole point of that patch was to decouple the restore > > order from the dump order ... but with an offset-less dump file, we > > can't do that, or at least the restore order is greatly constrained. > > I wonder if it'd be sensible for pg_restore to use a different parallel > > scheduling algorithm if it notices that the input lacks offsets. > > (There could still be some benefit from parallelism, just not as much.) > > No idea if this is going to be worth the trouble, but it probably > > is worth looking into. > > > > regards, tom lane > > > > -- > David Gilman > :DG< -- David Gilman :DG<
Вложения
В списке pgsql-hackers по дате отправления: