Re: index prefetching

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: index prefetching
Дата
Msg-id f0fda1a3-23ef-4cbe-b6d5-48f0bd9432cb@enterprisedb.com
обсуждение исходный текст
Ответ на Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers

On 2/15/24 17:42, Peter Geoghegan wrote:
> On Thu, Feb 15, 2024 at 9:36 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> On 2/15/24 00:06, Peter Geoghegan wrote:
>>> I suppose that it might be much more important than I imagine it is
>>> right now, but it'd be nice to have something a bit more concrete to
>>> go on.
>>>
>>
>> This probably depends on which corner cases are considered important.
>>
>> The page-at-a-time approach essentially means index items at the
>> beginning of the page won't get prefetched (or vice versa, prefetch
>> distance drops to 0 when we get to end of index page).
> 
> I don't think that's true. At least not for nbtree scans.
> 
> As I went into last year, you'd get the benefit of the work I've done
> on "boundary cases" (most recently in commit c9c0589f from just a
> couple of months back), which helps us get the most out of suffix
> truncation. This maximizes the chances of only having to scan a single
> index leaf page in many important cases. So I can see no reason why
> index items at the beginning of the page are at any particular
> disadvantage (compared to those from the middle or the end of the
> page).
> 

I may be missing something, but it seems fairly self-evident to me an
entry at the beginning of an index page won't get prefetched (assuming
the page-at-a-time thing).

If I understand your point about boundary cases / suffix truncation,
that helps us by (a) picking the split in a way to minimize a single key
spanning multiple pages, if possible and (b) increasing the number of
entries that fit onto a single index page.

That's certainly true / helpful, and it makes the "first entry" issue
much less common. But the issue is still there. Of course, this says
nothing about the importance of the issue - the impact may easily be so
small it's not worth worrying about.

> Where you might have a problem is cases where it's just inherently
> necessary to visit more than a single leaf page, despite the best
> efforts of the nbtsplitloc.c logic -- cases where the scan just
> inherently needs to return tuples that "straddle the boundary between
> two neighboring pages". That isn't a particularly natural restriction,
> but it's also not obvious that it's all that much of a disadvantage in
> practice.
> 

One case I've been thinking about is sorting using index, where we often
read large part of the index.

>> It certainly was a great improvement, no doubt about that. I dislike the
>> restriction, but that's partially for aesthetic reasons - it just seems
>> it'd be nice to not have this.
>>
>> That being said, I'd be OK with having this restriction if it makes v1
>> feasible. For me, the big question is whether it'd mean we're stuck with
>> this restriction forever, or whether there's a viable way to improve
>> this in v2.
> 
> I think that there is no question that this will need to not
> completely disable kill_prior_tuple -- I'd be surprised if one single
> person disagreed with me on this point. There is also a more nuanced
> way of describing this same restriction, but we don't necessarily need
> to agree on what exactly that is right now.
> 

Even for the page-at-a-time approach? Or are you talking about the v2?

>> And I don't have answer to that :-( I got completely lost in the ongoing
>> discussion about the locking implications (which I happily ignored while
>> working on the PoC patch), layering tensions and questions which part
>> should be "in control".
> 
> Honestly, I always thought that it made sense to do things on the
> index AM side. When you went the other way I was surprised. Perhaps I
> should have said more about that, sooner, but I'd already said quite a
> bit at that point, so...
> 
> Anyway, I think that it's pretty clear that "naive desynchronization"
> is just not acceptable, because that'll disable kill_prior_tuple
> altogether. So you're going to have to do this in a way that more or
> less preserves something like the current kill_prior_tuple behavior.
> It's going to have some downsides, but those can be managed. They can
> be managed from within the index AM itself, a bit like the
> _bt_killitems() no-pin stuff does things already.
> 
> Obviously this interpretation suggests that doing things at the index
> AM level is indeed the right way to go, layering-wise. Does it make
> sense to you, though?
> 

Yeah. The basic idea was that by moving this above index AM it will work
for all indexes automatically - but given the current discussion about
kill_prior_tuple, locking etc. I'm not sure that's really feasible.

The index AM clearly needs to have more control over this.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bertrand Drouvot
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: Jelte Fennema-Nio
Дата:
Сообщение: Re: Add trim_trailing_whitespace to editorconfig file