Re: Replication slot stats misgivings
From | Amit Kapila |
---|---|
Subject | Re: Replication slot stats misgivings |
Date | |
Msg-id | CAA4eK1KBV4JJYrgB7KZXW65h3uXawYO-vEqR=7hX-uXDY058MA@mail.gmail.com |
In response to | Re: Replication slot stats misgivings (Andres Freund <andres@anarazel.de>) |
Responses | Re: Replication slot stats misgivings (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |

On Fri, Mar 26, 2021 at 1:17 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-03-25 17:12:31 +0530, Amit Kapila wrote:
> > On Thu, Mar 25, 2021 at 11:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Mar 24, 2021 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Leaving aside restart case, without some sort of such sanity checking,
> > > > if both drop (of old slot) and create (of new slot) messages are lost
> > > > then we will start accumulating stats in old slots. However, if only
> > > > one of them is lost then there won't be any such problem.
> > > >
> > > > > Perhaps we could have RestoreSlotFromDisk() send something to the stats
> > > > > collector ensuring the mapping makes sense?
> > > > >
> > > >
> > > > Say if we send just the index location of each slot then probably we
> > > > can setup replSlotStats. Now say before the restart if one of the drop
> > > > messages was missed (by stats collector) and that happens to be at
> > > > some middle location, then we would end up restoring some already
> > > > dropped slot, leaving some of the still required ones. However, if
> > > > there is some sanity identifier like name along with the index, then I
> > > > think that would have worked for such a case.
> > >
> > > Even such messages could also be lost? Given that any message could be
> > > lost under a UDP connection, I think we cannot rely on a single
> > > message. Instead, I think we need to loosely synchronize the indexes
> > > while assuming the indexes in replSlotStats and
> > > ReplicationSlotCtl->replication_slots are not synchronized.
> > >
> > > >
> > > > I think it would have been easier if we would have some OID type of
> > > > identifier for each slot. But, without that may be index location of
> > > > ReplicationSlotCtl->replication_slots and slotname combination can
> > > > reduce the chances of slot stats go wrong quite less even if not zero.
> > > > If not name, do we have anything else in a slot that can be used for
> > > > some sort of sanity checking?
> > >
> > > I don't see any useful information in a slot for sanity checking.
> > >
> >
> > In that case, can we do a hard check for which slots exist if
> > replSlotStats runs out of space (that can probably happen only after
> > restart and when we lost some drop messages)?
>
> I suggest we wait doing anything about this until we know if the shared
> stats patch gets in or not (I'd give it 50% maybe). If it does get in
> things get a good bit easier, because we don't have to deal with the
> message loss issues anymore.
>

Okay, that makes sense.

--
With Regards,
Amit Kapila.
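
The "index plus name" sanity check discussed in the quoted text can be illustrated with a small stand-alone sketch. This is not PostgreSQL code: `SlotStatsEntry`, `report_slot_stats`, `MAX_SLOTS`, `NAME_LEN`, and the single `spill_count` counter are all hypothetical names chosen for illustration, and the array only loosely stands in for `replSlotStats`. The assumption is that every stats report carries both the slot's index and its name, so a collector that missed a drop and a create message can notice that the entry at that index belongs to a different slot and start over instead of accumulating into stale counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SLOTS 4   /* hypothetical; stands in for max_replication_slots */
#define NAME_LEN  64  /* hypothetical size of the slot name buffer */

/* Hypothetical collector-side entry; not the real PostgreSQL struct. */
typedef struct SlotStatsEntry
{
    bool in_use;
    char name[NAME_LEN];
    long spill_count;   /* one illustrative counter */
} SlotStatsEntry;

/*
 * Apply a stats report for the slot at position idx.
 *
 * If the entry at idx is unused, or holds a different name than the one
 * in the report, the entry is reinitialized for the reported name before
 * the delta is added. Returns true only when a stale entry (one that was
 * in use under another name) had to be discarded.
 */
static bool
report_slot_stats(SlotStatsEntry *stats, int idx, const char *name, long delta)
{
    SlotStatsEntry *e;
    bool            reset = false;

    assert(idx >= 0 && idx < MAX_SLOTS);
    e = &stats[idx];

    if (!e->in_use || strncmp(e->name, name, NAME_LEN) != 0)
    {
        reset = e->in_use;                    /* stale entry being replaced? */
        memset(e, 0, sizeof(*e));             /* also NUL-fills the name */
        strncpy(e->name, name, NAME_LEN - 1); /* keeps the final NUL intact */
        e->in_use = true;
    }

    e->spill_count += delta;
    return reset;
}
```

As Masahiko Sawada points out in the quoted text, the report that carries the name can itself be lost over UDP, so a check like this only narrows the window for wrong stats and does not close it.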