Re: ATTACH/DETACH PARTITION CONCURRENTLY
From | Andres Freund |
---|---|
Subject | Re: ATTACH/DETACH PARTITION CONCURRENTLY |
Date | |
Msg-id | 20180807132925.grxgp3mtg4i6mpib@alap3.anarazel.de |
In reply to | Re: ATTACH/DETACH PARTITION CONCURRENTLY (David Rowley <david.rowley@2ndquadrant.com>) |
Replies | Re: ATTACH/DETACH PARTITION CONCURRENTLY |
List | pgsql-hackers |
On 2018-08-08 01:23:51 +1200, David Rowley wrote:
> On 8 August 2018 at 00:47, Andres Freund <andres@anarazel.de> wrote:
> > On 2018-08-08 00:40:12 +1200, David Rowley wrote:
> >> 1. Obtain a ShareUpdateExclusiveLock on the partitioned table rather
> >> than an AccessExclusiveLock.
> >> 2. Do all the normal partition attach partition validation.
> >> 3. Insert pg_partition record with partvalid = true.
> >> 4. Invalidate relcache entry for the partitioned table
> >> 5. Any loops over a partitioned table's PartitionDesc must check
> >> PartitionIsValid(). This will return true if the current snapshot
> >> should see the partition or not. The partition is valid if partisvalid
> >> = true and the xmin precedes or is equal to the current snapshot.
> >
> > How does this protect against other sessions actively using the relcache
> > entry? Currently it is *NOT* safe to receive invalidations for
> > e.g. partitioning contents afaics.
>
> I'm not proposing that sessions running older snapshots can't see that
> there's a new partition. The code I have uses PartitionIsValid() to
> test if the partition should be visible to the snapshot. The
> PartitionDesc will always contain details for all partitions stored in
> pg_partition whether they're valid to the current snapshot or not. I
> did it this way as there's no way to invalidate the relcache based on
> a point in transaction, only a point in time.

I don't think that solves the problem that an arriving relcache
invalidation would trigger a rebuild of rd_partdesc while it is
still referenced by running code.

You'd need to build infrastructure to prevent that.

One approach would be to make sure that everything relying on
rd_partdesc staying the same stores its value in a local variable, and
then *not* free the old version of rd_partdesc (etc.) when the refcount
> 0, but delay that to the RelationClose() that makes the refcount reach
0. That'd be the start of a framework for more such concurrency
handling.

Regards,

Andres Freund
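
The deferred-free approach described in the message can be illustrated with a small standalone sketch. This is not PostgreSQL code: `ToyRelation`, `ToyPartDesc`, and all `toy_*` functions are hypothetical stand-ins invented here, and a real implementation would have to deal with memory contexts, error paths, and the rest of the relcache entry. The sketch only shows the core idea: an invalidation installs a new descriptor, but the superseded one is parked on a list while the reference count is above zero and is freed by the close call that drops the count to zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a partition descriptor (not PostgreSQL's struct). */
typedef struct ToyPartDesc
{
    int                 nparts;
    struct ToyPartDesc *next_retired;   /* link in the relation's retired list */
} ToyPartDesc;

/* Hypothetical stand-in for a relcache entry. */
typedef struct ToyRelation
{
    int          refcnt;     /* number of active users of this entry */
    ToyPartDesc *partdesc;   /* current descriptor */
    ToyPartDesc *retired;    /* superseded descriptors still possibly in use */
} ToyRelation;

static ToyPartDesc *
toy_build_partdesc(int nparts)
{
    ToyPartDesc *pd = malloc(sizeof(*pd));

    if (pd == NULL)
        abort();
    pd->nparts = nparts;
    pd->next_retired = NULL;
    return pd;
}

static void
toy_relation_open(ToyRelation *rel)
{
    rel->refcnt++;
}

/* The close that brings refcnt to 0 frees every retired descriptor. */
static void
toy_relation_close(ToyRelation *rel)
{
    assert(rel->refcnt > 0);
    if (--rel->refcnt == 0)
    {
        while (rel->retired != NULL)
        {
            ToyPartDesc *next = rel->retired->next_retired;

            free(rel->retired);
            rel->retired = next;
        }
    }
}

/*
 * Simulated invalidation: install a rebuilt descriptor. The old one is
 * freed immediately only if nobody holds a reference; otherwise it is
 * retired and freed later by toy_relation_close().
 */
static void
toy_invalidate(ToyRelation *rel, int new_nparts)
{
    ToyPartDesc *old = rel->partdesc;

    rel->partdesc = toy_build_partdesc(new_nparts);
    if (old == NULL)
        return;
    if (rel->refcnt > 0)
    {
        old->next_retired = rel->retired;
        rel->retired = old;
    }
    else
        free(old);
}

/* Helper for inspection: how many descriptors are awaiting a deferred free. */
static int
toy_retired_count(const ToyRelation *rel)
{
    int count = 0;

    for (const ToyPartDesc *pd = rel->retired; pd != NULL; pd = pd->next_retired)
        count++;
    return count;
}
```

A caller following this scheme would open the relation, copy `rel->partdesc` into a local variable, and use only that local copy until it closes the relation. An invalidation arriving in between changes `rel->partdesc` but leaves the local pointer valid.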