Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Erik Rijkers
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Msg-id: 84b7076830fbedc155670b859926e99e@xs4all.nl
In reply to: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers

>>>> 
>>>> logical replication of 2 instances is OK but 3 and up fail with:
>>>> 
>>>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File: "reorderbuffer.c", Line: 1773)
>>>> 
>>>> I can cobble up a script but I hope you have enough from the assertion
>>>> to see what's going wrong...
>>> 
>>> The assertion says that the iterator produces changes in an order that
>>> does not correlate with LSN. But I have a hard time understanding how
>>> that could happen, particularly because according to the line number
>>> this happens in ReorderBufferCommit(), i.e. the current (non-streaming)
>>> case.
>>> 
>>> So instructions to reproduce the issue would be very helpful.
>> 
>> Using:
>> 
>> 0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
>> 0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
>> 0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
>> 0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
>> 0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
>> 0006-Add-support-for-streaming-to-built-in-replication-v2.patch
>> 
>> As you expected the problem is the same with these new patches.
>> 
>> I have now tested more, and seen that it does not always fail.  I guess
>> that here it fails 3 times out of 4.  But the laptop I'm using at the
>> moment is old and slow -- it may well be a factor, as we've seen
>> before [1].
>> 
>> Attached is the bash script that I put together.  I tested with
>> NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which
>> fails often.  This same program run with HEAD never seems to fail (I
>> tried a few dozen times).
>> 
> 
> Thanks. Unfortunately I still can't reproduce the issue. I even tried
> running it in valgrind, to see if there are some memory access issues
> (which should also slow it down significantly).

One wonders again if 2ndQuadrant shouldn't invest in some old hardware ;)

Another Good Thing would be if there was a provision in the buildfarm to 
test patches like these.

But I'm probably not the first one to suggest that; no doubt it'll be 
possible someday.  In the meantime I'll try to reproduce this crash on 
other machines (but that will be after the holidays).


Erik Rijkers

