Re: snapshot too old, configured by time
От | Steve Singer |
---|---|
Тема | Re: snapshot too old, configured by time |
Дата | |
Msg-id | BLU437-SMTP193E7367B788D2EC2B2B81DC160@phx.gbl обсуждение исходный текст |
Ответ на | Re: snapshot too old, configured by time (Kevin Grittner <kgrittn@ymail.com>) |
Ответы |
Re: snapshot too old, configured by time
(Michael Paquier <michael.paquier@gmail.com>)
|
Список | pgsql-hackers |
On 10/15/2015 05:47 PM, Kevin Grittner wrote: > All other issues raised by Álvaro and Steve have been addressed, > except for this one, which I will argue against: I've been looking through the updated patch In snapmgr.c + * XXX: If we can trust a read of an int64 value to be atomic, we can skip the + * spinlock here. + */ +int64 +GetOldSnapshotThresholdTimestamp(void) Was your intent with the XXX for it to be a TODO to only aquire the lock on platforms without the atomic 64bit operations? On a more general note: I've tried various manual tests of this feature and it sometimes works as expected and sometimes doesn't. I'm getting the feeling that how I expect it to work isn't quite in sync with how it does work. I'd expect the following to be sufficient to generate the test T1: Obtains a snapshot that can see some rows T2: Waits 60 seconds and performs an update on those rows T2: Performs a vacuum T1: Waits 60 seconds, tries to select from the table. The snapshot should be too old For example it seems that in test 002 the select_ok on conn1 following the vacuum but right before the final sleep is critical to the snapshot too old error showing up (ie if I remove that select_ok but leave in the sleep I don't get the error) Is this intended and if so is there a better way we can explain how things work? Also is 0 intended to be an acceptable value for old_snapshot_threshold and if so what should we expect to see then? >> So if I understand correctly, every call to BufferGetPage needs to have >> a TestForOldSnapshot immediately afterwards? It seems easy to later >> introduce places that fail to test for old snapshots. What happens if >> they do? Does vacuum remove tuples anyway and then the query returns >> wrong results? That seems pretty dangerous. Maybe the snapshot could >> be an argument of BufferGetPage? > There are 486 occurences of BufferGetPage in the source code, and > this patch follows 36 of them with TestForOldSnapshot. This only > needs to be done when a page is going to be used for a scan which > produces user-visible results. That is, it is *not* needed for > positioning within indexes to add or vacuum away entries, for heap > inserts or deletions (assuming the row to be deleted has already > been found). It seems wrong to modify about 450 BufferGetPage > references to add a NULL parameter; and if we do want to make that > noop change as insurance, it seems like it should be a separate > patch, since the substance of this patch would be buried under the > volume of that. > > I will add this to the November CF. > > -- > Kevin Grittner > EDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > >
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Andres FreundДата:
Сообщение: Re: PATCH: 9.5 replication origins fix for logical decoding