Обсуждение: doc: explain pgstatindex fragmentation

Поиск
Список
Период
Сортировка

doc: explain pgstatindex fragmentation

От
Benoit Lobréau
Дата:
Hi,

I like the clarification but I think that:

A higher <literal>leaf_fragmentation</literal> indicates that the 
physical order of the index leaf
pages increasingly deviates from their logical order.

Would be cleaner than:

The higher <literal>leaf_fragmentation</literal> is, the less the 
physical order
of the index leaf pages corresponds to the logical order we would have 
just after
a <command>REINDEX</command>.

-- 
Benoit Lobréau
Consultant
http://dalibo.com



Re: doc: explain pgstatindex fragmentation

От
Frédéric Yhuel
Дата:

On 1/22/25 10:24, Benoit Lobréau wrote:
> Here is an updated patch to save you some time.

Thanks ! it does seems better that way.




Re: doc: explain pgstatindex fragmentation

От
Bertrand Drouvot
Дата:
Hi,

On Tue, Nov 05, 2024 at 06:36:47PM +0100, Frédéric Yhuel wrote:
> Hi, I thought it would be nice to give the user a better idea of what
> avg_leaf_density and leaf_fragmentation mean.
> 
> Patch attached. What do you think?

Yeah, I think that can not hurt to give more details, thanks for the proposal!

A few comments:

=== 1

+     <literal>avg_leaf_density</literal> can be seen as the inverse of bloat,

I'm not sure it's good to describe something as the inverse of "something
else". See my proposal below.

=== 2

I’m not sure we need to add the extra details in a paragraph below the fields
description. What about changing the fields description?

Something concise enough like?

avg_leaf_density: shows how full leaf pages currently are (100 if full)
leaf_fragmentation: shows how much physical and logical ordering of leaf pages
differ (zero if they don't)

Also the comments made in [1], [2] and [3] are not linked to this main thread,
adding them for reference here (but better to keep the conversation going
by replying to this email).

[1]: https://www.postgresql.org/message-id/4c5dee3a-8381-4e0f-b882-d1bd950e8972%40dalibo.com
[2]: https://www.postgresql.org/message-id/c70fcc72-eed6-475b-81c8-508422299351%40dalibo.com
[3]: https://www.postgresql.org/message-id/e8a6db36-073e-4ca3-b38c-b42d7094cba8%40dalibo.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: doc: explain pgstatindex fragmentation

От
Frédéric Yhuel
Дата:

On 1/22/25 12:34, Bertrand Drouvot wrote:
> Hi,
> 
> On Tue, Nov 05, 2024 at 06:36:47PM +0100, Frédéric Yhuel wrote:
>> Hi, I thought it would be nice to give the user a better idea of what
>> avg_leaf_density and leaf_fragmentation mean.
>>
>> Patch attached. What do you think?
> 
> Yeah, I think that can not hurt to give more details, thanks for the proposal!
> 

Hi Bertrand, thanks for your review!

> A few comments:
> 
> === 1
> 
> +     <literal>avg_leaf_density</literal> can be seen as the inverse of bloat,
> 
> I'm not sure it's good to describe something as the inverse of "something
> else". See my proposal below.
> 

Yeah... bloat is a more familiar concept, so I wanted to link these two 
metrics... but "inverse" is confusing... or maybe something like that:

A small <literal>avg_leaf_density</literal> means that the index is bloated.

> === 2
> 
> I’m not sure we need to add the extra details in a paragraph below the fields
> description. What about changing the fields description?
> 
> Something concise enough like?
> 
> avg_leaf_density: shows how full leaf pages currently are (100 if full)

That should do :-)

> leaf_fragmentation: shows how much physical and logical ordering of leaf pages
> differ (zero if they don't)
> 

It looks good to me.

I've noticed that maximum leaf_fragmentation can have a huge impact on a 
range index-only scan, when reading all blocs from disks, even on my 
laptop machine with SSD, but I don't know if this is the right place to 
document this?

I used the following psql scripts to test the effect of 
leaf_fragmentation (the first one calls the second one):

https://github.com/dalibo/misc/blob/main/fyhuel/leaf_fragmentation.sql
https://github.com/dalibo/misc/blob/main/fyhuel/evict_from_both_caches.sql

> Also the comments made in [1], [2] and [3] are not linked to this main thread,
> adding them for reference here (but better to keep the conversation going
> by replying to this email).
> 
> [1]: https://www.postgresql.org/message-id/4c5dee3a-8381-4e0f-b882-d1bd950e8972%40dalibo.com
> [2]: https://www.postgresql.org/message-id/c70fcc72-eed6-475b-81c8-508422299351%40dalibo.com
> [3]: https://www.postgresql.org/message-id/e8a6db36-073e-4ca3-b38c-b42d7094cba8%40dalibo.com
> 

Indeed, I think Benoît mistakenly thought that thread aggregation was 
based on thread titles alone. He appended the second conversation to the 
commitfest entry.



Re: doc: explain pgstatindex fragmentation

От
Bertrand Drouvot
Дата:
Hi Frédéric,

On Thu, Jan 23, 2025 at 10:00:27AM +0100, Frédéric Yhuel wrote:
> On 1/22/25 12:34, Bertrand Drouvot wrote:
> > I'm not sure it's good to describe something as the inverse of "something
> > else". See my proposal below.
> > 
> 
> Yeah... bloat is a more familiar concept, so I wanted to link these two
> metrics

Yeah but in the (rare?) case "bloat" is not known then one would have to make
sense of it first.

> > I’m not sure we need to add the extra details in a paragraph below the fields
> > description. What about changing the fields description?
> > 
> > Something concise enough like?
> > 
> > avg_leaf_density: shows how full leaf pages currently are (100 if full)
> 
> That should do :-)

Thanks!

> > leaf_fragmentation: shows how much physical and logical ordering of leaf pages
> > differ (zero if they don't)
> > 
> 
> It looks good to me.

Thanks!

> I've noticed that maximum leaf_fragmentation can have a huge impact on a
> range index-only scan, when reading all blocs from disks, even on my laptop
> machine with SSD, but I don't know if this is the right place to document
> this?

Yeah, that might be worth to mention. Maybe below the descriptions then? (keeping
the changes above in the description).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: doc: explain pgstatindex fragmentation

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Jan 24, 2025 at 12:34:08PM +0100, Frédéric Yhuel wrote:
> I don't know if you noticed Laurenz's suggestion, because he forgot to CC
> you, but I like it very much. I think we should mention the default
> fillfactor (90 for indexes).

Thanks for mentioning Laurenz's suggestion!

=== 1

+ Since indexes have a default fillfactor of 90, this should be around 0.9 for
+ newly built indexes

I think 0.9 should be replaced by 90 (that's the actual kind of output we'd get).

But having said that, I'm not sure we should mention those 90 stuff because it
depends of the amount of data indexed (I mean if the index has a very few
leaf pages, say < 5, then it's easy to be << 90 since it's an average). That's
probably not the majority of indexes though so maybe just nuance the sentence a
bit.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: doc: explain pgstatindex fragmentation

От
Laurenz Albe
Дата:
On Fri, 2025-01-24 at 13:34 +0000, Bertrand Drouvot wrote:
> + Since indexes have a default fillfactor of 90, this should be around 0.9 for
> + newly built indexes
>
> I think 0.9 should be replaced by 90 (that's the actual kind of output we'd get).
>
> But having said that, I'm not sure we should mention those 90 stuff because it
> depends of the amount of data indexed (I mean if the index has a very few
> leaf pages, say < 5, then it's easy to be << 90 since it's an average). That's
> probably not the majority of indexes though so maybe just nuance the sentence a
> bit.

Sorry about the 0.9.

Perhaps the wording could be more careful: ... this should be around 90 for
most newly built indexes of non-neglectable size.

Yours,
Laurenz Albe



Re: doc: explain pgstatindex fragmentation

От
Peter Eisentraut
Дата:
On 24.01.25 15:41, Frédéric Yhuel wrote:
> 
> 
> On 1/24/25 14:58, Laurenz Albe wrote:
>> On Fri, 2025-01-24 at 13:34 +0000, Bertrand Drouvot wrote:
>>> + Since indexes have a default fillfactor of 90, this should be 
>>> around 0.9 for
>>> + newly built indexes
>>>
>>> I think 0.9 should be replaced by 90 (that's the actual kind of 
>>> output we'd get).
>>>
> 
> Damn! I missed that one too...
> 
>>> But having said that, I'm not sure we should mention those 90 stuff 
>>> because it
>>> depends of the amount of data indexed (I mean if the index has a very 
>>> few
>>> leaf pages, say < 5, then it's easy to be << 90 since it's an 
>>> average). That's
>>> probably not the majority of indexes though so maybe just nuance the 
>>> sentence a
>>> bit.
>>
>> Sorry about the 0.9.
>>
>> Perhaps the wording could be more careful: ... this should be around 
>> 90 for
>> most newly built indexes of non-neglectable size.
>>
> 
> It looks good to me (apart from the typo). v4 attached

committed