Re: Single transaction in the tablesync worker?
От | Peter Smith |
---|---|
Тема | Re: Single transaction in the tablesync worker? |
Дата | |
Msg-id | CAHut+Ptjk-Qgd3R1a1_tr62CmiswcYphuv0pLmVA-+2s8r0Bkw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Single transaction in the tablesync worker? (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: Single transaction in the tablesync worker?
|
Список | pgsql-hackers |
On Mon, Dec 7, 2020 at 7:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > As the slot of apply worker is created before all the tablesync > workers it should never miss any LSN which tablesync workers would > have processed. Also, the table sync workers should not process any > xact if the apply worker has not processed anything. I think tablesync > currently always processes one transaction (because we call > process_sync_tables at commit of a txn) even if that is not required > to be in sync with the apply worker. This should solve both the > problems (a) visibility of partial transactions (b) allow prepared > transactions because tablesync worker no longer needs to combine > multiple transactions data. > > I think the other advantages of this would be that it would reduce the > load (both CPU and I/O) on the publisher-side by allowing to decode > the data only once instead of for each table sync worker once and > separately for the apply worker. I think it will use fewer resources > to finish the work. Yes, I observed this same behavior. IIUC the only way for the tablesync worker to go from CATCHUP mode to SYNCDONE is via the call to process_sync_tables. But a side-effect of this is, when messages arrive during this CATCHUP phase one tx will be getting handled by the tablesync worker before the process_sync_tables() is ever encountered. I have created and attached a simple patch which allows the tablesync to detect if there is anything to do *before* it enters the apply main loop. Calling process_sync_tables() before the apply main loop offers a quick way out so the message handling will not be split unnecessarily between the workers. ~ The result of the patch is demonstrated by the following test/logs which are also attached. Note: I added more logging (not in this patch) to make it easier to see what is going on. LOGS1. Current code. Test: 10 x INSERTS done at CATCHUP time. Result: tablesync worker does 1 x INSERT, then apply worker skips 1 and does remaining 9 x INSERTs. LOGS2. Patched code. Test: Same 10 x INSERTS done at CATCHUP time. Result: tablesync can exit early. apply worker handles all 10 x INSERTs LOGS3. Patched code. Test: 2PC PREPARE then COMMIT PREPARED [1] done at CATCHUP time psql -d test_pub -c "BEGIN;INSERT INTO test_tab VALUES(1, 'foo');PREPARE TRANSACTION 'test_prepared_tab';" psql -d test_pub -c "COMMIT PREPARED 'test_prepared_tab';" Result: The PREPARE and COMMIT PREPARED are both handle by apply worker. This avoids complications which the split otherwise causes. [1] 2PC prepare test requires v29 patch from https://www.postgresql.org/message-id/flat/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3%2BQ%40mail.gmail.com --- Kind Regards, Peter Smith. Fujitsu Australia
Вложения
В списке pgsql-hackers по дате отправления: