Re: free space map and visibility map

From: Kyotaro HORIGUCHI
Subject: Re: free space map and visibility map
Msg-id: 20170329.104007.89580821.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Fwd: free space map and visibility map  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
Hello,

At Tue, 28 Mar 2017 08:50:58 -0700, Jeff Janes <jeff.janes@gmail.com> wrote in
<CAMkU=1zKfqGePWG+qqKthmWERBn8UAA2_9Sb+qTUUREhFkqLCA@mail.gmail.com>
> > > I now think this is not the cause of the problem I am seeing.  I
> > > made the replay of FREEZE_PAGE update the FSM (both with and
> > > without FPI), but that did not fix it.  With frequent crashes, it
> > > still accumulated a lot of frozen and empty (but full according to
> > > FSM) pages.  I also set up replica streaming and turned off
> > > crashing on the master, and the FSM of the replica stays accurate,
> > > so the WAL stream and replay logic is doing the right thing on the
> > > replica.
> > >
> > > I now think the dirtied FSM pages are somehow not getting marked as
> > > dirty, or are getting marked as dirty but somehow the checkpoint is
> > > skipping them.  It looks like MarkBufferDirtyHint does do some
> > > operations unlocked which could explain lost update, but it seems
> > > unlikely that that would happen often enough to see the amount of
> > > lost updates I am seeing.
> >
> > Hmm.. clearing dirty hint seems already protected by exclusive
> > lock. And I think it can occur without lock failure.
> >
> > Other than by FPI, FSM update is omitted when record LSN is older
> > than page LSN. If heap page is evicted but FSM page is not after
> > vacuuming and before power cut, replaying HEAP2_CLEAN skips
> > update of FSM even though FPI is not attached. Of course this
> > cannot occur on standby. One FSM page covers as many heap pages
> > as about 4k, so FSM can stay far longer than heap pages.
> >
> 
> This corresponds to action == BLK_DONE case, right?

Yes. A WAL record whose LSN is not newer than the page LSN results
in BLK_DONE. That works as long as the heap page and the FSM are
consistent, but in this situation it leaves the FSM broken during
crash recovery.

> > ALL_FROZEN is set with other than HEAP2_FREEZE_PAGE. When a page
> > is already empty when entering lazy_scan_heap, or a page of
> > non-indexed heap is emptied in lazy_scan_heap, HEAP2_VISIBLE is
> > issued to set ALL_FROZEN.
> >
> > Perhaps the problem will be fixed by forcing heap_xlog_visible to
> > update FSM (in addition to FREEZE_PAGE), or the same in
> > heap_xlog_clean. (As mentioned in the previous mail, I prefer the
> > latter.)
> >
> 
> When I make heap_xlog_clean update FSM even on BLK_RESTORED (but not on
> BLK_DONE), it solves the problem I was seeing.  Which still leaves me
> wondering why the problem doesn't show up on the standby because, unlike
> BLK_DONE, BLK_RESTORED should have the same issue on standby as it does on
> a recovering master, shouldn't it? Maybe the difference is that the
> existence of a replication slot delays the clean up in a way that causes a
> different pattern of WAL records.

During standby recovery every WAL record is newer than its target
page, whereas in crash recovery the first several WAL records can
be older than the page they target.
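
The difference can be sketched in the same simplified terms. The function
count_skipped_records is a hypothetical helper, not anything in the tree; it
only assumes the LSN comparison described above, and it ignores full page
images.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;   /* stand-in for XLogRecPtr */

/*
 * Count the records that replay would skip (the BLK_DONE case) for one
 * page, given the LSN the page carries when recovery starts and the LSNs
 * of the records that touch it, in WAL order.
 */
static size_t
count_skipped_records(Lsn page_lsn, const Lsn *record_lsns, size_t n)
{
    size_t skipped = 0;

    assert(record_lsns != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (record_lsns[i] <= page_lsn)
            skipped++;                  /* old record: no redo, no FSM update */
        else
            page_lsn = record_lsns[i];  /* replayed: the page advances */
    }
    return skipped;
}
```

For records at LSNs 10, 20, 30 and 40, a page that was flushed at LSN 30
before the crash skips the first three records, while a page that starts
behind all of them (LSN 5 here, standing in for the standby case) skips
none.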

> > > > > /*
> > > > >  * Update the FSM as well.
> > > > >  *
> > > > >  * XXX: Don't do this if the page was restored from full page image. We
> > > > >  * don't bother to update the FSM in that case, it doesn't need to be
> > > > >  * totally accurate anyway.
> > > > >  */
> > > >
> > >
> > > What does that save us?  If we restored from FPI, we already have the
> > > block in memory (we don't need to see the old version, just the new
> > > one), so it doesn't save us a random read IO.
> >
> > Updates on random pages can cause visits to many unloaded FSM
> > pages. It may be intending to avoid that.
> 
> 
> But I think that that would be no worse for BLK_RESTORED than it is for
> BLK_NEEDS_REDO.  Why optimize only one of the cases, if it is worth
> optimizing either one?

I agree with you. An FPI increases and decreases free space just
the same as redoing a WAL record does. The original discussion
about that is here:

https://www.postgresql.org/message-id/49072021.7010801%40enterprisedb.com

https://www.postgresql.org/message-id/24334.1225205478%40sss.pgh.pa.us

Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > One issue with this patch is that it doesn't update the FSM at all when 
> > pages are restored from full page images. It would require fetching the 
> > page and checking the free space on it, or peeking into the size of the 
> > backup block data, and I'm not sure if it's worth the extra code to do that.
> 
> I'd vote not to bother, at least not in the first cut.  As you say, 100%
> accuracy isn't required, and I think that in typical scenarios an
> insert/update that causes a page to become full would be relatively less
> likely to have a full-page image.

So the reason appears to be simply that it was not considered
necessary.

Counting the other branch of this thread, the following options
have been proposed.

- Let FREEZE_PAGE and VISIBLE update FSM.
 This causes an extra fetch of a heap page, summing up of free space, and an FSM update for every frozen page.

- Let CLEAN always update FSM.
 This causes extra counting of free space and an FSM update for every vacuuming of a heap page, regardless of frozen-ness.

- Let FREEZE_PAGE/VISIBLE or CLEAN records have free space.
 This does not require fetching a heap page, but it breaks the policy (really?) that the FSM is not WAL-logged, or that
the FSM is updated only as a result of heap updates.
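
To compare the options side by side, the following C sketch encodes each
one as a rule for "does replay of this record touch the FSM?". All names in
it (FsmPolicy, model_fsm_updated and the enum values) are hypothetical, and
the rules are only one reading of the list above. In particular, the last
policy assumes that all three record types would carry the free space,
which the list leaves open.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { REC_CLEAN, REC_FREEZE_PAGE, REC_VISIBLE } RecordType;

/* Models BLK_NEEDS_REDO, BLK_DONE and BLK_RESTORED. */
typedef enum { ACT_NEEDS_REDO, ACT_DONE, ACT_RESTORED } RedoAction;

typedef enum
{
    POLICY_CURRENT,              /* CLEAN updates FSM only on NEEDS_REDO */
    POLICY_FREEZE_VISIBLE,       /* option 1: FREEZE_PAGE and VISIBLE also update */
    POLICY_CLEAN_ALWAYS,         /* option 2: CLEAN updates whatever the action */
    POLICY_FREE_SPACE_IN_RECORD  /* option 3: free space travels in the record */
} FsmPolicy;

/* Returns true if replaying the record would update the FSM under the policy. */
static bool
model_fsm_updated(FsmPolicy policy, RecordType rec, RedoAction action)
{
    bool baseline = (rec == REC_CLEAN && action == ACT_NEEDS_REDO);

    switch (policy)
    {
        case POLICY_CURRENT:
            return baseline;
        case POLICY_FREEZE_VISIBLE:
            return baseline || rec == REC_FREEZE_PAGE || rec == REC_VISIBLE;
        case POLICY_CLEAN_ALWAYS:
            return rec == REC_CLEAN;
        case POLICY_FREE_SPACE_IN_RECORD:
            /* Assumption: every modelled record type carries free space. */
            return true;
    }
    assert(false);   /* unreachable for valid policies */
    return false;
}
```

Under this reading, only the second and third options repair the case
discussed earlier in this mail, a CLEAN record that replays as BLK_DONE,
while the first option instead relies on a later FREEZE_PAGE or VISIBLE
record to correct the FSM.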
 

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



