Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Peter Smith
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date:
Msg-id: CAHut+Psfn0p4NrOHYzrT+_7-mqhibHqvKjhUhh8O=79iYnrLdw@mail.gmail.com
In reply to: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication  (Melih Mutlu <m.melihmutlu@gmail.com>)
List: pgsql-hackers
On Thu, May 25, 2023 at 6:59 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
>
> Peter Smith <smithpb2250@gmail.com> wrote on Wed, 24 May 2023 at 05:59:
>>
>> Hi, and thanks for the patch! It is an interesting idea.
>>
>> I have not yet fully read this thread, so below are only my first
>> impressions after looking at patch 0001. Sorry if some of these were
>> already discussed earlier.
>>
>> TBH the patch "reuse-workers" logic seemed more complicated than I had
>> imagined it might be.
>
>
> If you mean patch 0001 by the patch "reuse-workers", most of the complexity comes with some refactoring to split
> apply worker and tablesync worker paths. [1]
> If you mean the whole patch set, then I believe it's because reusing replication slots also requires having a proper
> snapshot each time the worker moves to a new table. [2]
>

Yes, I was mostly referring to the same thing as point 1 below about
patch 0001. I guess I just found the concept of mixing A) launching a
TSW (via the apply worker) with B) reassigning a TSW to another
relation (by the TSW battling with its peers) a bit difficult to
understand. Most of the refactoring seemed to arise from choosing to
do it that way.

>>
>>
>> 1.
>> IIUC with patch 0001, each/every tablesync worker (a.k.a. TSW) when it
>> finishes dealing with one table then goes looking to find if there is
>> some relation that it can process next. So now every TSW has a loop
>> where it will fight with every other available TSW over who will get
>> to process the next relation.
>>
>> Somehow this seems all backwards to me. Isn't it strange for the TSW
>> to be the one deciding what relation it would deal with next?
>>
>> IMO it seems more natural to simply return the finished TSW to some
>> kind of "pool" of available workers and the main Apply process can
>> just grab a TSW from that pool instead of launching a brand new one in
>> the existing function process_syncing_tables_for_apply(). Or, maybe
>> those "available" workers can be returned to a pool maintained within
>> the launcher.c code, which logicalrep_worker_launch() can draw from
>> instead of launching a whole new process?
>>
>> (I need to read the earlier posts to see if these options were already
>> discussed and rejected)
>
>
> I think ([3]) relying on a single apply worker for the assignment of several tablesync workers might bring some
> overhead, it's possible that some tablesync workers wait in idle until the apply worker assigns them something. OTOH
> yes, the current approach makes tablesync workers race for a new table to sync.

Yes, it might be slower than the 'patched' code because "available"
workers might be momentarily idle while they wait to be re-assigned to
the next relation. We would need to try it to find out.

> TBF both ways might be worth discussing/investigating more, before deciding which way to go.

+1. I think it would be nice to see a POC of both ways for benchmark
comparison, because IMO performance is not the only consideration --
unless there is an obvious winner, they also need to be judged by the
complexity of the logic, the amount of code that needs to be
refactored, etc.
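
To make the comparison easier, here is a rough standalone sketch of the
kind of "pool"/free-list arrangement I was imagining. (Everything below is
hypothetical -- none of these names exist in the tree; a real version
would live in launcher.c/tablesync.c and use the LogicalRepWorker slots.)

#include <stdbool.h>
#include <stdio.h>

#define MAX_SYNC_WORKERS 4

typedef struct SyncWorkerSlot
{
	bool		in_use;		/* a process is attached to this slot */
	bool		available;	/* attached, but waiting for a new relation */
	unsigned int relid;		/* relation currently assigned, 0 = none */
} SyncWorkerSlot;

static SyncWorkerSlot pool[MAX_SYNC_WORKERS];

/*
 * Apply-worker side: prefer re-using an idle tablesync worker from the
 * pool; only "launch" a new one when no idle worker is available and a
 * slot is still free.
 */
static SyncWorkerSlot *
get_worker_for_relation(unsigned int relid)
{
	for (int i = 0; i < MAX_SYNC_WORKERS; i++)
	{
		if (pool[i].in_use && pool[i].available)
		{
			pool[i].available = false;
			pool[i].relid = relid;	/* re-assign the idle worker */
			return &pool[i];
		}
	}
	for (int i = 0; i < MAX_SYNC_WORKERS; i++)
	{
		if (!pool[i].in_use)
		{
			pool[i].in_use = true;	/* stands in for launching a process */
			pool[i].available = false;
			pool[i].relid = relid;
			return &pool[i];
		}
	}
	return NULL;				/* all workers busy; caller retries later */
}

/* Tablesync-worker side: done with a table, go back to the pool. */
static void
return_worker_to_pool(SyncWorkerSlot *worker)
{
	worker->relid = 0;			/* no stale relid while idle */
	worker->available = true;
}

int
main(void)
{
	SyncWorkerSlot *worker = get_worker_for_relation(16384);

	return_worker_to_pool(worker);
	worker = get_worker_for_relation(16385);	/* same slot gets re-used */
	printf("re-used slot is now assigned relid %u\n", worker->relid);
	return 0;
}

The point being that the assignment decision stays on the apply-worker
side, so a finished TSW only has to mark itself available instead of
racing its peers for the next relation.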

>
>>
>> 2.
>> AFAIK the thing that identifies a tablesync worker is the fact that
>> only TSW will have a 'relid'.
>>
>> But it feels very awkward to me to have a TSW marked as "available"
>> and yet that LogicalRepWorker must still have some OLD relid field
>> value lurking (otherwise it will forget that it is a "tablesync"
>> worker!).
>>
>> IMO perhaps it is time now to introduce some enum 'type' to the
>> LogicalRepWorker. Then an "in_use" type=TSW would have a valid 'relid'
>> whereas an "available" type=TSW would have relid == InvalidOid.
>
>
> Hmm, relid will be immediately updated when the worker moves to a new table. And the time between finishing sync of a
> table and finding a new table to sync should be minimal. I'm not sure how having an old relid for such a small amount of
> time can do any harm.

There is no "harm", but it just didn't feel right to make the
LogicalRepWorker transition through some meaningless state
("available" for re-use but still assigned some relid), just because
it was easy to do it that way. I think it is more natural for the
'relid' to be valid only when it is valid for the worker and to be
InvalidOid when it is not valid. --- Maybe this gripe would become
more apparent if the implementation used the "free-list" idea, because
then you would have a lot of bogus relids assigned to the workers of
that list for longer than just a moment.
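
To be concrete, something like the below is what I had in mind. (Only an
illustrative sketch for discussion -- the enum and the exact field names
are made up, and the other existing LogicalRepWorker fields are elided.)

typedef enum LogicalRepWorkerType
{
	WORKERTYPE_APPLY,
	WORKERTYPE_TABLESYNC
} LogicalRepWorkerType;

typedef struct LogicalRepWorker
{
	LogicalRepWorkerType type;	/* fixed for the life of the worker */
	bool		in_use;			/* slot is assigned to a process */
	Oid			relid;			/* for a WORKERTYPE_TABLESYNC worker, valid
								 * only while it is actually copying that
								 * relation; InvalidOid while "available" */

	/* ... all the other existing fields stay as they are ... */
} LogicalRepWorker;

Then checks like am_tablesync_worker() could test the 'type' rather than
OidIsValid(relid), and an "available" TSW would never be carrying a stale
relid.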

>
>>
>> 3.
>> Maybe I am mistaken, but it seems the benchmark results posted are
>> only using quite a small/default values for
>> "max_sync_workers_per_subscription", so I wondered how those results
>> are affected by increasing that GUC. I think having only very few
>> workers would cause more sequential processing, so conveniently the
>> effect of the patch avoiding re-launch might be seen in the best
>> possible light. OTOH, using more TSW in the first place might reduce
>> the overall tablesync time because the subscriber can do more work in
>> parallel.
>>
>> So I'm not quite sure what the goal is here. E.g. if the user doesn't
>> care much about how long tablesync phase takes then there is maybe no
>> need for this patch at all. OTOH, I thought if a user does care about
>> the subscription startup time, won't those users be opting for a much
>> larger "max_sync_workers_per_subscription" in the first place?
>> Therefore shouldn't the benchmarking be using a larger number too?
>
>
> Regardless of how many tablesync workers there are, reusing workers will speed things up if a worker has a chance to
> sync more than one table. Increasing the number of tablesync workers, of course, improves the tablesync performance. But
> if it doesn't make 100% parallel (meaning that # of sync workers != # of tables to sync), then reusing workers can
> bring an additional improvement.
>
> Here are some benchmarks similar to earlier, but with 100 tables and different number of workers:
>
> +--------+-------------+-------------+-------------+------------+
> |        | 2 workers   | 4 workers   | 6 workers   | 8 workers  |
> +--------+-------------+-------------+-------------+------------+
> | master | 2579.154 ms | 1383.153 ms | 1001.559 ms | 911.758 ms |
> +--------+-------------+-------------+-------------+------------+
> | patch  | 1724.230 ms | 853.894 ms  | 601.176 ms  | 496.395 ms |
> +--------+-------------+-------------+-------------+------------+
>
> So yes, increasing the number of workers makes it faster. But reusing workers can still improve more.
>

Thanks for the benchmark results! There is no denying those are pretty
good numbers.

But it is difficult to get an overall picture of the behaviour. Mostly,
when benchmarks were posted, you held one variable fixed and showed only
one other varying. It always leaves me wondering -- what about non-empty
tables, or what about different numbers of tables, etc.? Is it
possible to make some script to gather a bigger set of results so we
can see everything at once? Perhaps then it will become clear there is
some "sweet spot" where the patch is really good but beyond which it
degrades (actually, who knows what it might show).

For example:

=== empty tables

workers:2     workers:4     workers:8     workers:16
tables:10     tables:10     tables:10     tables:10
data:0        data:0        data:0        data:0
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:100    tables:100    tables:100    tables:100
data:0        data:0        data:0        data:0
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:1000   tables:1000   tables:1000   tables:1000
data:0        data:0        data:0        data:0
master/patch  master/patch  master/patch  master/patch

=== 1M data

workers:2     workers:4     workers:8     workers:16
tables:10     tables:10     tables:10     tables:10
data:1M       data:1M       data:1M       data:1M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:100    tables:100    tables:100    tables:100
data:1M       data:1M       data:1M       data:1M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:1000   tables:1000   tables:1000   tables:1000
data:1M       data:1M       data:1M       data:1M
master/patch  master/patch  master/patch  master/patch

=== 10M data

workers:2     workers:4     workers:8     workers:16
tables:10     tables:10     tables:10     tables:10
data:10M      data:10M      data:10M      data:10M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:100    tables:100    tables:100    tables:100
data:10M      data:10M      data:10M      data:10M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:1000   tables:1000   tables:1000   tables:1000
data:10M      data:10M      data:10M      data:10M
master/patch  master/patch  master/patch  master/patch

== 100M data

workers:2     workers:4     workers:8     workers:16
tables:10     tables:10     tables:10     tables:10
data:100M     data:100M     data:100M     data:100M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:100    tables:100    tables:100    tables:100
data:100M     data:100M     data:100M     data:100M
master/patch  master/patch  master/patch  master/patch

workers:2     workers:4     workers:8     workers:16
tables:1000   tables:1000   tables:1000   tables:1000
data:100M     data:100M     data:100M     data:100M
master/patch  master/patch  master/patch  master/patch

------
Kind Regards,
Peter Smith
Fujitsu Australia


