Re: logical decoding and replication of sequences, take 2
От | Tomas Vondra |
---|---|
Тема | Re: logical decoding and replication of sequences, take 2 |
Дата | |
Msg-id | c6a08e9d-7b08-76d4-197d-fab54039ed91@enterprisedb.com обсуждение исходный текст |
Ответ на | RE: logical decoding and replication of sequences, take 2 ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Ответы |
RE: logical decoding and replication of sequences, take 2
("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
|
Список | pgsql-hackers |
On 12/1/23 12:08, Hayato Kuroda (Fujitsu) wrote: > Dear Tomas, > >> I did some micro-benchmarking today, trying to identify cases where this >> would cause unexpected problems, either due to having to maintain all >> the relfilenodes, or due to having to do hash lookups for every sequence >> change. But I think it's fine, mostly ... >> > > I did also performance tests (especially case 3). First of all, there are some > variants from yours. > > 1. patch 0002 was reverted because it has an issue. So this test checks whether > refactoring around ReorderBufferSequenceIsTransactional seems really needed. FWIW I also did the benchmarks without the 0002 patch, for the same reason. I forgot to mention that. > 2. per comments from Amit, I also measured the abort case. In this case, the > alter_sequence() is called but the transaction is aborted. > 3. I measured with changing number of clients {8, 16, 32, 64, 128}. In any cases, > clients executed 1000 transactions. The performance machine has 128 core so that > result for 128 clients might be saturated. > 4. a short sleep (0.1s) was added in alter_sequence(), especially between > "alter sequence" and nextval(). Because while testing, I found that the > transaction is too short to execute in parallel. I think it is reasonable > because ReorderBufferSequenceIsTransactional() might be worse when the parallelism > is increased. > > I attached one backend process via perf and executed pg_slot_logical_get_changes(). > Attached txt file shows which function occupied CPU time, especially from > pg_logical_slot_get_changes_guts() and ReorderBufferSequenceIsTransactional(). > Here are my observations about them. > > * In case of commit, as you said, SnapBuildCommitTxn() seems dominant for 8-64 > clients case. > * For (commit, 128 clients) case, however, ReorderBufferRestoreChanges() waste > many times. I think this is because changes exceed logical_decoding_work_mem, > so we do not have to analyze anymore. > * In case of abort, CPU time used by ReorderBufferSequenceIsTransactional() is linearly > longer. This means that we need to think some solution to avoid the overhead by > ReorderBufferSequenceIsTransactional(). > > ``` > 8 clients 3.73% occupied time > 16 7.26% > 32 15.82% > 64 29.14% > 128 46.27% > ``` Interesting, so what exactly does the transaction do? Anyway, I don't think this is very surprising - I believe it behaves like this because of having to search in many hash tables (one in each toplevel xact). And I think the solution I explained before (maintaining a single toplevel hash, instead of many per-top-level hashes). FWIW I find this case interesting, but not very practical, because no practical workload has that many aborts. > > * In case of abort, I also checked CPU time used by ReorderBufferAddRelFileLocator(), but > it seems not so depends on the number of clients. > > ``` > 8 clients 3.66% occupied time > 16 6.94% > 32 4.65% > 64 5.39% > 128 3.06% > ``` > > As next step, I've planned to run the case which uses setval() function, because it > generates more WALs than normal nextval(); > How do you think? > Sure, although I don't think it's much different from the test selecting 40 values from the sequence (in each transaction). That generates about the same amount of WAL. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tomas VondraДата:
Сообщение: Re: logical decoding and replication of sequences, take 2