Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
From: Alexey Kondratov
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date:
Msg-id: b25ce80e-f536-78c8-d5c8-a5df3e230785@postgrespro.ru
In reply to: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
List: pgsql-hackers
>> FWIW my understanding is that the speedup comes mostly from
>> elimination of the serialization to a file. That however requires
>> savepoints to handle aborts of subtransactions - I'm pretty sure it'd
>> be trivial to create a workload where this will be much slower (with
>> many aborts of large subtransactions).
>>

Yes, and it was my main motivation to eliminate that extra serialization
to file. I've experimented a bit with large transactions + savepoints +
aborts and ended up with the following query (the same schema as before,
with 600k rows):

BEGIN;
SAVEPOINT s1;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
SAVEPOINT s2;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
SAVEPOINT s3;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
ROLLBACK TO SAVEPOINT s3;
ROLLBACK TO SAVEPOINT s2;
ROLLBACK TO SAVEPOINT s1;
END;

It looks like the worst-case scenario: we do a lot of work and then abort
all subxacts one by one. As expected, it takes much longer (up to x30) to
process with the background worker than with spilling to file. Surely it
is much easier to truncate a file than to apply all changes and then
abort them. However, I guess that this kind of load pattern is not the
most typical for real-life applications.

Also, this test helped me to find a bug in my current savepoints routine,
so a new patch is attached.

On 30.08.2019 18:59, Konstantin Knizhnik wrote:
>
> I think that instead of defining savepoints it is simpler and more
> efficient to use
>
> BeginInternalSubTransaction +
> ReleaseCurrentSubTransaction/RollbackAndReleaseCurrentSubTransaction
>
> as it is done in PL/pgSQL (pl_exec.c).
> Not sure if it can pr

Both BeginInternalSubTransaction and DefineSavepoint use PushTransaction()
internally to start a normal subtransaction, so they seem to be identical
from the performance perspective, which is also stated in the comment in
xact.c:

/*
 * BeginInternalSubTransaction
 *		This is the same as DefineSavepoint except it allows TBLOCK_STARTED,
 *		TBLOCK_IMPLICIT_INPROGRESS, TBLOCK_END, and TBLOCK_PREPARE states,
 *		and therefore it can safely be used in functions that might be called
 *		when not inside a BEGIN block or when running deferred triggers at
 *		COMMIT/PREPARE time.  Also, it automatically does
 *		CommitTransactionCommand/StartTransactionCommand instead of expecting
 *		the caller to do it.
 */

Please correct me if I'm wrong.

Anyway, I've profiled my apply worker (flamegraph is attached) and it
spends the vast majority of its time (>90%) applying changes. So the
problem is not in the savepoints themselves, but in the fact that we
first apply all changes and then abort all that work. Not sure that it is
possible to do much about this case.


Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company
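
For readers less familiar with the internal subtransaction API discussed
above, below is a minimal sketch of how an apply worker could wrap a
streamed subtransaction with it, in the spirit of pl_exec.c. This is an
illustration under assumptions, not code from the attached patch:
apply_one_change() and the List of decoded changes are hypothetical
placeholders, and the memory context and resource owner bookkeeping that
pl_exec.c performs around these calls is omitted.

#include "postgres.h"

#include "access/xact.h"
#include "nodes/pg_list.h"

static void apply_one_change(void *change);	/* hypothetical helper */

static void
apply_streamed_subxact(List *changes, bool abort_subxact)
{
	ListCell   *lc;

	/* Starts a subtransaction via PushTransaction(), like DefineSavepoint */
	BeginInternalSubTransaction(NULL);

	PG_TRY();
	{
		foreach(lc, changes)
			apply_one_change(lfirst(lc));

		if (abort_subxact)
			/* Undo everything applied above and pop the subtransaction */
			RollbackAndReleaseCurrentSubTransaction();
		else
			/* Keep the applied changes and pop the subtransaction */
			ReleaseCurrentSubTransaction();
	}
	PG_CATCH();
	{
		/* Discard the subtransaction before propagating the error */
		RollbackAndReleaseCurrentSubTransaction();
		PG_RE_THROW();
	}
	PG_END_TRY();
}

Either way, as noted above, the savepoint machinery itself is cheap; the
x30 slowdown in the worst-case test comes from applying all the changes
before discarding them, which this API does not avoid.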
Attachments