Re: Deleting older versions in unique indexes to avoid page splits
| From | Peter Geoghegan |
|---|---|
| Subject | Re: Deleting older versions in unique indexes to avoid page splits |
| Date | |
| Msg-id | CAH2-WzmEic9JJ_NJXWo9frRgTqg7q8YuOfew_et3UxJt6zUPfg@mail.gmail.com |
| In reply to | Re: Deleting older versions in unique indexes to avoid page splits (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>) |
| Responses | Re: Deleting older versions in unique indexes to avoid page splits |
| List | pgsql-hackers |
On Wed, Oct 14, 2020 at 7:07 AM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:

> The idea seems very promising, especially when extended to handle non-unique indexes too.

Thanks!

> That's exactly what I wanted to discuss after the first letter. If we could make (non)HOT-updates index specific, I think it could improve performance a lot.

Do you mean accomplishing the same goal in heapam, by making the optimization more intelligent about which indexes need new versions? We did have a patch that did that in 2016/2017, as you may recall -- this was called WARM:

https://www.postgresql.org/message-id/flat/CABOikdMNy6yowA%2BwTGK9RVd8iw%2BCzqHeQSGpW7Yka_4RSZ_LOQ%40mail.gmail.com

This didn't go anywhere. I think that this solution is more pragmatic. It's cheap enough to remove it if a better solution becomes available in the future. But this is a pretty good solution by all important measures.

> I think that this optimization can affect low cardinality indexes negatively, but it is hard to estimate impact without tests. Maybe it won't be a big deal, given that we attempt to eliminate old copies not very often and that low cardinality b-trees are already not very useful. Besides, we can always make this thing optional, so that users could tune it to their workload.

Right. The trick is to pay only a fixed low cost (maybe as low as one heap page access) when we start out, and ratchet it up only if the first heap page access looks promising. We also avoid posting list tuples. Regular deduplication takes place when this fails. It's useful for the usual reasons, but also because this new mechanism learns not to try the posting list TIDs.

> I wonder, how this new feature will interact with physical replication? Replica may have quite different performance profile.

I think of that as equivalent to having a long-running transaction on the primary. When I first started working on this patch I thought about having "long running transaction detection". But I quickly realized that that isn't a meaningful concept. A transaction is only truly long running relative to the writes that take place that have obsolete row versions that cannot be cleaned up. It has to be something we can deal with, but it cannot be meaningfully special-cased.

--
Peter Geoghegan
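To make the cost-ratcheting strategy described above concrete, here is a minimal, self-contained C simulation of the idea. This is a sketch only, not code from the patch: the budget numbers and all names (bottom_up_delete_pass, page_has_garbage) are invented for illustration, and it ignores details such as posting list tuples.

```c
#include <stdbool.h>
#include <stdio.h>

enum { NPAGES = 8 };

/*
 * Toy model of one bottom-up deletion pass.  page_has_garbage[i] says
 * whether simulated heap page i holds obsolete row versions.  We start
 * with a budget of a single heap page access and only raise it while
 * accesses keep paying off, so an unpromising leaf page is abandoned
 * almost for free.
 */
static int
bottom_up_delete_pass(const bool *page_has_garbage, int npages)
{
    int budget = 1;     /* fixed low starting cost: one heap page access */
    int freed = 0;

    for (int page = 0; page < npages && budget > 0; page++)
    {
        budget--;       /* every heap page access spends budget */
        if (page_has_garbage[page])
        {
            freed++;    /* pretend we deleted the old versions here */
            budget += 2;    /* promising access: ratchet the budget up */
        }
    }

    return freed;       /* 0 means caller falls back to deduplication */
}

int
main(void)
{
    /* Leaf page whose first heap page access pays off immediately */
    const bool promising[NPAGES] =
        {true, true, false, true, false, false, false, false};
    /* Leaf page whose first heap page access finds nothing deletable */
    const bool unpromising[NPAGES] =
        {false, true, true, true, true, true, true, true};

    printf("promising:   freed %d pages' worth of versions\n",
           bottom_up_delete_pass(promising, NPAGES));
    printf("unpromising: freed %d (fall back to deduplication)\n",
           bottom_up_delete_pass(unpromising, NPAGES));
    return 0;
}
```

Running this shows the asymmetry the message describes: the promising case keeps extending its budget and frees several pages' worth of old versions, while the unpromising case gives up after a single heap page access, which is the point of starting from a fixed low cost.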