Re: Eager page freeze criteria clarification

From Andres Freund
Subject Re: Eager page freeze criteria clarification
Date
Msg-id 20230927174633.hrnoia3vz5s7a5uv@alap3.anarazel.de
In response to Re: Eager page freeze criteria clarification  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Eager page freeze criteria clarification  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2023-09-27 10:25:00 -0700, Peter Geoghegan wrote:
> On Wed, Sep 27, 2023 at 10:01 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2023-09-26 09:07:13 -0700, Peter Geoghegan wrote:
> > I don't think doing this on a system wide basis with a metric like #unfrozen
> > pages is a good idea. It's quite common to have short lived data in some
> > tables while also having long-lived data in other tables. Making opportunistic
> > freezing more aggressive in that situation will just hurt, without a benefit
> > (potentially even slowing down the freezing of older data!). And even within a
> > single table, making freezing more aggressive because there's a decent sized
> > part of the table that is updated regularly and thus not frozen, doesn't make
> > sense.
>
> I never said that #unfrozen pages should be the sole criterion, for
> anything. Just that it would influence the overall strategy, making
> the system veer towards more aggressive freezing. It would complement
> a more sophisticated algorithm that decides whether or not to freeze a
> page based on its individual characteristics.
>
> For example, maybe the page-level algorithm would have a random
> component. That could potentially be where the global (or at least
> table level) view gets to influence things -- the random aspect is
> weighed using the global view of debt. That kind of thing seems like
> an interesting avenue of investigation.
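
To make the quoted idea concrete, here is a minimal sketch of a page-level decision with a random component that is weighted by a table-level (or global) view of freeze debt. Nothing here is from PostgreSQL: `page_score`, `debt_ratio`, and the blending formula are hypothetical names and choices made purely for illustration.

```python
import random


def freeze_probability(page_score: float, debt_ratio: float) -> float:
    """Blend a per-page score with a table-level (or global) debt ratio.

    Both inputs are assumed to lie in [0, 1]. The blend is hypothetical:
    with no debt the page's own score decides, and as debt approaches 1
    the probability of freezing approaches 1 regardless of the page.
    """
    p = page_score + (1.0 - page_score) * debt_ratio
    return min(1.0, max(0.0, p))


def should_freeze(page_score: float, debt_ratio: float, rng: random.Random) -> bool:
    """Random component: freeze when a uniform draw falls below the probability."""
    return rng.random() < freeze_probability(page_score, debt_ratio)


rng = random.Random(42)  # seeded only so the sketch is reproducible
decisions = [should_freeze(0.2, 0.5, rng) for _ in range(10)]
```

With this shape, the open question in the thread is what `debt_ratio` should be derived from, not how it is applied.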

I don't disagree that we should do something in that direction - I just don't
see the raw number of unfrozen pages being useful in that regard. If you have
a database where no pages live long, we don't need to freeze
opportunistically, yet the fraction of unfrozen pages will be huge.


> > If we want to take global freeze debt into account, which I think is a good
> > idea, we'll need a smarter way to represent the debt than just the number of
> > unfrozen pages.  I think we would need to track the age of unfrozen pages in
> > some way. If there are a lot of unfrozen pages with a recent xid, then it's
> > fine, but if they are older and getting older, it's a problem and we need to
> > be more aggressive.
>
> Tables like pgbench_history will have lots of unfrozen pages with a
> recent XID that get scanned during every VACUUM. We should be freezing
> such pages at the earliest opportunity.

I think we ought to be able to freeze tables with as simple a workload as
pgbench_history has aggressively without taking a global freeze debt into
account.


> > The problem I see is how to track the age of unfrozen data -
> > it'd be easy enough to track the mean(oldest-64bit-xid-on-page), but then we
> > again have the issue of rare outliers moving the mean too much...
>
> I think that XID age is mostly not very important compared to the
> absolute amount of unfrozen pages, and the cost profile of freezing
> now versus later. (XID age *is* important in emergencies, but that's
> mostly not what we're discussing right now.)

We definitely *also* should take the number of unfrozen pages into account. I
just don't think determining freeze debt primarily using the number of unfrozen
pages will be useful. The presence of unfrozen pages that are likely to be
updated again soon is not a problem and makes the simple metric pretty much
useless.


> To be clear, that doesn't mean that XID age shouldn't play an
> important role in helping VACUUM to differentiate between pages that
> should not be frozen and pages that should be frozen.

I think we need to take it into account to determine a useful freeze debt on a
table level (and potentially system wide too).

Assuming we could compute it cheaply enough, if we had an approximate median
oldest-64bit-xid-on-page and the number of unfrozen pages, we could
differentiate between tables that have lots of recent unfrozen pages (the
median will be low) and tables with lots of unfrozen pages that are unlikely to
be updated again (the median will be high and growing).  Something like the
median 64bit xid would be interesting because it'd not get "invalidated" if
relfrozenxid is increased.
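
As a rough illustration of the paragraph above, the sketch below summarizes two hypothetical tables that have the same number of unfrozen pages. It assumes we somehow have, for each unfrozen page, the oldest 64-bit xid on that page, and it expresses the statistic as an age (current xid minus page xid) so that "low" means recent. The function name, the input lists, and the xid values are all invented for the example; it says nothing about how such a statistic could be maintained cheaply.

```python
from statistics import mean, median


def summarize(oldest_xids, current_xid):
    """Summarize unfrozen pages by count and by age of their oldest xid.

    oldest_xids: one (hypothetical) 64-bit xid per unfrozen page; must be non-empty.
    """
    ages = [current_xid - xid for xid in oldest_xids]
    return {
        "unfrozen_pages": len(ages),
        "mean_age": mean(ages),
        "median_age": median(ages),
    }


current_xid = 10_000_000

# Frequently updated table: almost all pages are recent, plus one rare old outlier.
hot = [current_xid - 100] * 999 + [current_xid - 9_000_000]
# Table whose unfrozen pages are all old and unlikely to be updated again.
cold = [current_xid - 5_000_000] * 1000

hot_summary = summarize(hot, current_xid)
cold_summary = summarize(cold, current_xid)
```

Both tables report 1000 unfrozen pages, so the count alone cannot tell them apart. The single outlier pulls the hot table's mean age to roughly 9100 while its median stays at 100, whereas the cold table's median is 5,000,000.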

Greetings,

Andres Freund


