Re: Problem with txid_snapshot_in/out() functionality

From Heikki Linnakangas
Subject Re: Problem with txid_snapshot_in/out() functionality
Msg-id 5348EAD4.2010202@vmware.com
In reply to Problem with txid_snapshot_in/out() functionality  (Jan Wieck <jan@wi3ck.info>)
Responses Re: Problem with txid_snapshot_in/out() functionality  (Jan Wieck <jan@wi3ck.info>)
Re: Problem with txid_snapshot_in/out() functionality  (Andres Freund <andres@2ndquadrant.com>)
Re: Problem with txid_snapshot_in/out() functionality  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 04/12/2014 12:07 AM, Jan Wieck wrote:
> Hackers,
>
> the Slony team has been getting seldom reports of a problem with the
> txid_snapshot data type.
>
> The symptom is that a txid_snapshot on output lists the same txid
> multiple times in the xip list part of the external representation. This
> string is later rejected by txid_snapshot_in() when trying to determine
> if a particular txid is visible in that snapshot using the
> txid_visible_in_snapshot() function.
>
> I was not yet able to reproduce this problem in a lab environment. It
> might be related to subtransactions and/or two phase commit (at least
> one user is using both of them). The reported PostgreSQL version
> involved in that case was 9.1.

It's two-phase commit. When preparing a transaction, the state of the 
transaction is first transferred to a dummy PGXACT entry, and then the 
PGXACT entry of the backend is cleared. There is a transient state when 
both PGXACT entries have the same xid.

You can reproduce that by putting a sleep or breakpoint in 
PrepareTransaction(), just before the 
"ProcArrayClearTransaction(MyProc);" call. If you call 
txid_current_snapshot() from another session at that point, its 
output will list the same xid twice. (You will also have to commit 
one more unrelated transaction first, to bump up xmax, so that the 
prepared xid falls below xmax and is included in the xip list.)
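
To make the race concrete, here is a toy model in C. It is only a sketch 
under simplifying assumptions, not the actual procarray code: ToyXact, 
toy_prepare_copy(), toy_prepare_clear() and toy_collect_xids() are 
hypothetical names, each slot holds nothing but an xid, and the snapshot 
logic is reduced to "collect every nonzero xid below xmax".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PGXACT entry: only the xid is modeled. */
typedef struct
{
    uint64_t xid;               /* 0 means "no transaction in this slot" */
} ToyXact;

/* Step 1 of the toy prepare: copy the xid into the dummy entry. */
static void
toy_prepare_copy(const ToyXact *backend, ToyXact *dummy)
{
    dummy->xid = backend->xid;
}

/* Step 2 of the toy prepare: clear the backend's own entry. */
static void
toy_prepare_clear(ToyXact *backend)
{
    backend->xid = 0;
}

/*
 * Toy snapshot: copy every nonzero xid below xmax into xip and return
 * how many were copied. The caller must size xip for nslots entries.
 * Nothing here removes duplicates, which is the point of the example.
 */
static size_t
toy_collect_xids(const ToyXact *slots, size_t nslots,
                 uint64_t xmax, uint64_t *xip)
{
    size_t n = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].xid != 0 && slots[i].xid < xmax)
            xip[n++] = slots[i].xid;
    }
    return n;
}
```

A snapshot taken between toy_prepare_copy() and toy_prepare_clear() sees 
the xid in both slots. If xmax has not yet advanced past that xid, the 
filter skips both copies and nothing odd shows up, which is why the extra 
unrelated commit is needed in the real reproduction.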

> At this point I would find it extremely helpful to "sanitize" the
> external representation in txid_snapshot_out() while emitting some
> NOTICE level logging when this actually happens. I am aware that this
> does amount to a functional change for a back release, but considering
> that the _out() generated external representation of an existing binary
> datum won't pass the type's _in() function, I argue that such change is
> warranted. Especially since this problem could possibly corrupt a dump.

Hmm. Do we expect snapshots to be stored in tables, and included in a 
dump? I don't think we can guarantee that will work, at least not across 
versions, as the way we handle snapshots internally can change.

But yeah, we probably should do something about that. The most 
straightforward fix would be to scan the array in 
txid_current_snapshot() and remove any duplicates.
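
Roughly what I have in mind, as a standalone sketch rather than a patch: 
sort the xid array, then squeeze out adjacent duplicates in place. The 
names cmp_txid() and sort_and_dedup_xids() are made up for illustration, 
and plain uint64_t stands in for the txid type.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for 64-bit xids, ascending. */
static int
cmp_txid(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *) a;
    uint64_t y = *(const uint64_t *) b;

    return (x > y) - (x < y);
}

/*
 * Sort xip ascending and remove duplicate entries in place.
 * Returns the new number of valid entries at the front of xip.
 */
static size_t
sort_and_dedup_xids(uint64_t *xip, size_t nxip)
{
    size_t last = 0;            /* index of the last unique xid kept */

    if (nxip < 2)
        return nxip;

    qsort(xip, nxip, sizeof(uint64_t), cmp_txid);

    for (size_t i = 1; i < nxip; i++)
    {
        if (xip[i] != xip[last])
            xip[++last] = xip[i];
    }
    return last + 1;
}
```

After sorting, duplicates are always adjacent, so a single pass is 
enough, and the result is strictly ascending, which is the form the input 
function accepts.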

- Heikki


