Обсуждение: doc: explain pgstatindex fragmentation
Hi, I like the clarification but I think that: A higher <literal>leaf_fragmentation</literal> indicates that the physical order of the index leaf pages increasingly deviates from their logical order. Would be cleaner than: The higher <literal>leaf_fragmentation</literal> is, the less the physical order of the index leaf pages corresponds to the logical order we would have just after a <command>REINDEX</command>. -- Benoit Lobréau Consultant http://dalibo.com
On 1/22/25 10:24, Benoit Lobréau wrote: > Here is an updated patch to save you some time. Thanks ! it does seems better that way.
Hi, On Tue, Nov 05, 2024 at 06:36:47PM +0100, Frédéric Yhuel wrote: > Hi, I thought it would be nice to give the user a better idea of what > avg_leaf_density and leaf_fragmentation mean. > > Patch attached. What do you think? Yeah, I think that can not hurt to give more details, thanks for the proposal! A few comments: === 1 + <literal>avg_leaf_density</literal> can be seen as the inverse of bloat, I'm not sure it's good to describe something as the inverse of "something else". See my proposal below. === 2 I’m not sure we need to add the extra details in a paragraph below the fields description. What about changing the fields description? Something concise enough like? avg_leaf_density: shows how full leaf pages currently are (100 if full) leaf_fragmentation: shows how much physical and logical ordering of leaf pages differ (zero if they don't) Also the comments made in [1], [2] and [3] are not linked to this main thread, adding them for reference here (but better to keep the conversation going by replying to this email). [1]: https://www.postgresql.org/message-id/4c5dee3a-8381-4e0f-b882-d1bd950e8972%40dalibo.com [2]: https://www.postgresql.org/message-id/c70fcc72-eed6-475b-81c8-508422299351%40dalibo.com [3]: https://www.postgresql.org/message-id/e8a6db36-073e-4ca3-b38c-b42d7094cba8%40dalibo.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 1/22/25 12:34, Bertrand Drouvot wrote: > Hi, > > On Tue, Nov 05, 2024 at 06:36:47PM +0100, Frédéric Yhuel wrote: >> Hi, I thought it would be nice to give the user a better idea of what >> avg_leaf_density and leaf_fragmentation mean. >> >> Patch attached. What do you think? > > Yeah, I think that can not hurt to give more details, thanks for the proposal! > Hi Bertrand, thanks for your review! > A few comments: > > === 1 > > + <literal>avg_leaf_density</literal> can be seen as the inverse of bloat, > > I'm not sure it's good to describe something as the inverse of "something > else". See my proposal below. > Yeah... bloat is a more familiar concept, so I wanted to link these two metrics... but "inverse" is confusing... or maybe something like that: A small <literal>avg_leaf_density</literal> means that the index is bloated. > === 2 > > I’m not sure we need to add the extra details in a paragraph below the fields > description. What about changing the fields description? > > Something concise enough like? > > avg_leaf_density: shows how full leaf pages currently are (100 if full) That should do :-) > leaf_fragmentation: shows how much physical and logical ordering of leaf pages > differ (zero if they don't) > It looks good to me. I've noticed that maximum leaf_fragmentation can have a huge impact on a range index-only scan, when reading all blocs from disks, even on my laptop machine with SSD, but I don't know if this is the right place to document this? I used the following psql scripts to test the effect of leaf_fragmentation (the first one calls the second one): https://github.com/dalibo/misc/blob/main/fyhuel/leaf_fragmentation.sql https://github.com/dalibo/misc/blob/main/fyhuel/evict_from_both_caches.sql > Also the comments made in [1], [2] and [3] are not linked to this main thread, > adding them for reference here (but better to keep the conversation going > by replying to this email). > > [1]: https://www.postgresql.org/message-id/4c5dee3a-8381-4e0f-b882-d1bd950e8972%40dalibo.com > [2]: https://www.postgresql.org/message-id/c70fcc72-eed6-475b-81c8-508422299351%40dalibo.com > [3]: https://www.postgresql.org/message-id/e8a6db36-073e-4ca3-b38c-b42d7094cba8%40dalibo.com > Indeed, I think Benoît mistakenly thought that thread aggregation was based on thread titles alone. He appended the second conversation to the commitfest entry.
Hi Frédéric, On Thu, Jan 23, 2025 at 10:00:27AM +0100, Frédéric Yhuel wrote: > On 1/22/25 12:34, Bertrand Drouvot wrote: > > I'm not sure it's good to describe something as the inverse of "something > > else". See my proposal below. > > > > Yeah... bloat is a more familiar concept, so I wanted to link these two > metrics Yeah but in the (rare?) case "bloat" is not known then one would have to make sense of it first. > > I’m not sure we need to add the extra details in a paragraph below the fields > > description. What about changing the fields description? > > > > Something concise enough like? > > > > avg_leaf_density: shows how full leaf pages currently are (100 if full) > > That should do :-) Thanks! > > leaf_fragmentation: shows how much physical and logical ordering of leaf pages > > differ (zero if they don't) > > > > It looks good to me. Thanks! > I've noticed that maximum leaf_fragmentation can have a huge impact on a > range index-only scan, when reading all blocs from disks, even on my laptop > machine with SSD, but I don't know if this is the right place to document > this? Yeah, that might be worth to mention. Maybe below the descriptions then? (keeping the changes above in the description). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Fri, Jan 24, 2025 at 12:34:08PM +0100, Frédéric Yhuel wrote: > I don't know if you noticed Laurenz's suggestion, because he forgot to CC > you, but I like it very much. I think we should mention the default > fillfactor (90 for indexes). Thanks for mentioning Laurenz's suggestion! === 1 + Since indexes have a default fillfactor of 90, this should be around 0.9 for + newly built indexes I think 0.9 should be replaced by 90 (that's the actual kind of output we'd get). But having said that, I'm not sure we should mention those 90 stuff because it depends of the amount of data indexed (I mean if the index has a very few leaf pages, say < 5, then it's easy to be << 90 since it's an average). That's probably not the majority of indexes though so maybe just nuance the sentence a bit. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Fri, 2025-01-24 at 13:34 +0000, Bertrand Drouvot wrote: > + Since indexes have a default fillfactor of 90, this should be around 0.9 for > + newly built indexes > > I think 0.9 should be replaced by 90 (that's the actual kind of output we'd get). > > But having said that, I'm not sure we should mention those 90 stuff because it > depends of the amount of data indexed (I mean if the index has a very few > leaf pages, say < 5, then it's easy to be << 90 since it's an average). That's > probably not the majority of indexes though so maybe just nuance the sentence a > bit. Sorry about the 0.9. Perhaps the wording could be more careful: ... this should be around 90 for most newly built indexes of non-neglectable size. Yours, Laurenz Albe
On 24.01.25 15:41, Frédéric Yhuel wrote: > > > On 1/24/25 14:58, Laurenz Albe wrote: >> On Fri, 2025-01-24 at 13:34 +0000, Bertrand Drouvot wrote: >>> + Since indexes have a default fillfactor of 90, this should be >>> around 0.9 for >>> + newly built indexes >>> >>> I think 0.9 should be replaced by 90 (that's the actual kind of >>> output we'd get). >>> > > Damn! I missed that one too... > >>> But having said that, I'm not sure we should mention those 90 stuff >>> because it >>> depends of the amount of data indexed (I mean if the index has a very >>> few >>> leaf pages, say < 5, then it's easy to be << 90 since it's an >>> average). That's >>> probably not the majority of indexes though so maybe just nuance the >>> sentence a >>> bit. >> >> Sorry about the 0.9. >> >> Perhaps the wording could be more careful: ... this should be around >> 90 for >> most newly built indexes of non-neglectable size. >> > > It looks good to me (apart from the typo). v4 attached committed