Re: O(n) tasks cause lengthy startups and checkpoints

From Nathan Bossart
Subject Re: O(n) tasks cause lengthy startups and checkpoints
Date
Msg-id 20220217210022.GA3248793@nathanxps13
In reply to Re: O(n) tasks cause lengthy startups and checkpoints  (Andres Freund <andres@anarazel.de>)
Responses Re: O(n) tasks cause lengthy startups and checkpoints  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Feb 17, 2022 at 11:27:09AM -0800, Andres Freund wrote:
> On 2022-02-17 10:23:37 -0800, Nathan Bossart wrote:
>> On Wed, Feb 16, 2022 at 10:59:38PM -0800, Andres Freund wrote:
>> > They're accessed by xid. The LSN is just for cleanup. Accessing files
>> > left over from a previous transaction with the same xid wouldn't be
>> > good - we'd read wrong catalog state for decoding...
>> 
>> Okay, that part makes sense to me.  However, I'm still confused about how
>> this is handled today and why moving cleanup to a separate auxiliary
>> process makes matters worse.
> 
> Right now cleanup happens every checkpoint. So cleanup can't be deferred all
> that far. We currently include a bunch of 32bit xids inside checkpoints, so
> if they're rarer than 2^31-1, we're in trouble independent of logical
> decoding.
> 
> But with this patch cleanup of logical decoding mapping files (and other
> pieces) can be *indefinitely* deferred, without being noticeable.

I see.  The custodian should ordinarily remove the files as quickly as
possible.  In fact, I bet it will typically line up with checkpoints for
most users, as the checkpointer will set the latch.  However, if there are
many temporary files to clean up, removing the logical decoding files could
be delayed for some time, as you said.

> One possible way to improve this would be to switch the on-disk filenames to
> be based on 64bit xids. But that might also present some problems (file name
> length, cost of converting 32bit xids to 64bit xids).

Okay.
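
For illustration only (this is not from the patch), here is a minimal sketch
of what "converting 32bit xids to 64bit xids" could look like.  It assumes we
have a 64-bit "next xid" to use as a reference, that the 32-bit xid is not
newer than that reference, and that it is less than 2^32 transactions older.
The names widen_xid and mapping_file_suffix are hypothetical, special xids are
ignored, and the suffix format is only meant to show the file name length
difference, not the real on-disk naming.

```python
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1


def widen_xid(xid32: int, next_full_xid: int) -> int:
    """Widen a 32-bit xid to 64 bits using a 64-bit reference xid.

    Assumption: xid32 is not newer than next_full_xid and is less than
    2^32 transactions older than it.
    """
    if not 0 <= xid32 <= XID_MASK:
        raise ValueError("xid32 does not fit in 32 bits")
    epoch = next_full_xid >> XID_BITS
    if xid32 > (next_full_xid & XID_MASK):
        # The low 32 bits are "ahead" of the reference, so the xid must
        # have been assigned in the previous epoch.
        epoch -= 1
    if epoch < 0:
        raise ValueError("xid32 would precede the first epoch")
    return (epoch << XID_BITS) | xid32


def mapping_file_suffix(full_xid: int) -> str:
    """Hypothetical file name component: 16 hex digits instead of 8."""
    return f"{full_xid:016X}"
```

With a reference of epoch 3 and low bits 10, xid 5 stays in epoch 3, while
xid 20 is placed in epoch 2.  The hex suffix doubles from 8 to 16 characters,
which is the file name length concern mentioned above.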

>> I've done quite a bit of reading, and I haven't found anything that seems
>> intended to prevent this problem.  Do you have any pointers?
> 
> I don't know if we have an iron-clad enforcement of checkpoints happening
> every 2^31-1 xids. It's very unlikely to happen - you'd run out of space
> etc. But it'd be good to have something better than that.

Okay.  So IIUC the problem might already exist today, but offloading these
tasks to a separate process could make it more likely.
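
To make the 2^31 limit concrete for myself, here is a small sketch of a
wraparound-aware comparison on 32-bit xids.  It is modeled on the signed
32-bit difference that, as far as I know, PostgreSQL uses for normal xids;
the function name xid_precedes is mine, and special xids are ignored.

```python
XID_MASK = 0xFFFFFFFF
HALF_RANGE = 1 << 31


def xid_precedes(a: int, b: int) -> bool:
    """Return True if 32-bit xid a is considered older than xid b.

    The difference is taken modulo 2^32 and interpreted as a signed
    32-bit value, so the answer is only meaningful when the two xids
    are less than 2^31 transactions apart.
    """
    diff = (a - b) & XID_MASK
    if diff >= HALF_RANGE:
        diff -= 1 << 32  # reinterpret as a negative signed 32-bit value
    return diff < 0


old_xid = 100
# A transaction started just over 2^31 xids later, truncated to 32 bits.
far_future_xid = (old_xid + HALF_RANGE + 1) & XID_MASK
# A transaction started exactly 2^32 xids later has the same 32-bit value.
reused_xid = (old_xid + (1 << 32)) & XID_MASK
```

Here xid_precedes(100, 200) is true as expected, but
xid_precedes(old_xid, far_future_xid) is false even though old_xid is older,
and reused_xid is simply equal to old_xid.  A file left behind under old_xid
would be indistinguishable from one written by the later transaction.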

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


