Re: [HACKERS] Speedup twophase transactions

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: [HACKERS] Speedup twophase transactions
Дата
Msg-id CAB7nPqTUx8quw_J+T_jfxhruTEEZCOd__vMyAUkjd6eMT4oC9g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Speedup twophase transactions  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [HACKERS] Speedup twophase transactions  (Nikhil Sontakke <nikhils@2ndquadrant.com>)
Список pgsql-hackers
On Thu, Mar 16, 2017 at 9:25 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Mar 16, 2017 at 7:18 PM, Nikhil Sontakke
> <nikhils@2ndquadrant.com> wrote:
>>> + *      * RecoverPreparedTransactions(),
>>> StandbyRecoverPreparedTransactions()
>>> + *        and PrescanPreparedTransactions() have been modified to go
>>> throug
>>> + *        gxact->inredo entries that have not made to disk yet.
>>>
>>> It seems to me that there should be an initial scan of pg_twophase at
>>> the beginning of recovery, discarding on the way with a WARNING
>>> entries that are older than the checkpoint redo horizon. This should
>>> fill in shmem entries using something close to PrepareRedoAdd(), and
>>> mark those entries as inredo. Then, at the end of recovery,
>>> PrescanPreparedTransactions does not need to look at the entries in
>>> pg_twophase. And that's the case as well of
>>> RecoverPreparedTransaction(). I think that you could get the patch
>>> much simplified this way, as any 2PC data can be fetched directly from
>>> WAL segments and there is no need to rely on scans of pg_twophase,
>>> this is replaced by scans of entries in TwoPhaseState.
>>>
>>
>> I don't think this will work. We cannot replace pg_twophase with shmem
>> entries + WAL pointers. This is because we cannot expect to have WAL entries
>> around for long running prepared queries which survive across checkpoints.
>
> But at the beginning of recovery, we can mark such entries with ondisk
> and inredo, in which case the WAL pointers stored in the shmem entries
> do not matter because the data is already on disk.

Nikhil, do you mind if I try something like that? As we already know
what is the first XID when beginning redo via
ShmemVariableCache->nextXid it is possible to discard 2PC files that
should not be here. What makes me worry is the control of the maximum
number of entries in shared memory. If there are legit 2PC files that
are flushed on disk at checkpoint, you would finish with potentially
more 2PC transactions than what should be possible (even if updates of
max_prepared_xacts are WAL-logged).
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Tsunakawa, Takayuki"
Дата:
Сообщение: Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] Radix tree for character conversion