Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE
From: David Rowley
Subject: Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE
Msg-id: CAApHDvoadk-GA_0x_bRYSe4HtRO3jyzFbP1aY4eNSPNnX_n-0g@mail.gmail.com
In reply to: RE: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE (Jan Kort <jan.kort@genetics.nl>)
List: pgsql-bugs
On Wed, 19 May 2021 at 22:55, Jan Kort <jan.kort@genetics.nl> wrote:
> Ok so the problem is that the MCV list is created, because there are groups of common values, but then this has no effect, because there are not enough single values to create a histogram?

The problem is mainly that the join costing code for the merge join looks at the statistics on the column to get the upper bound of the merge. Ideally this value would be 1000000 in my "three" table. However, the MCV list sees the value of 1 as a most common value due to it appearing twice. When we consider building a histogram with the remaining values, we see we don't have enough to build one. We need at least 2 values. This means that when the merge join costing code tries to get the upper bound of the merge (ideally 1000000), it only has the MCV list to go on, which just contains the value 1. The 1000000 value is not tracked at all. This makes it appear that the merge join will start and end with the same value, which is very cheap, and it costs that path accordingly.

> That is not such a rare case as I would have hoped, for example this would then also cause the same behavior:
>
> values(1,1),(2,1),(3,1000000),(4,2),(5,2),(6,3),(7,3);

Yeah, technically you can add enough duplicate values to fill the MCV list and still not have enough left over to build the histogram. However, to maintain the problem case, the MCVs that you do add will need to be very early in the range. Your last MCV here is 3, which would make the estimated merge range 1 to 3; that's not as cheap as 1 to 1.

> And it seems to do.
>
> Ranges can then be longer too:
>
> values(1,1),(2,1),(4,1),(3,1000000);
>
> for some reason I can't reverse the ranges:
>
> values(1,1000000),(2,1000000),(3,1);
>
> I don't understand why that doesn't create a bad plan, there seems to be order involved too.

That's because the 1000000 value will make it into the MCV list and extend the estimated range of the merge join.

> Using text as primary key seems to perform well too, even if it keeps the same "merge" plan:

That will be because of the change in sort order between text and int, e.g. '1000000' sorts before '2' as text. I don't think you should work around the problem this way.

> What the high value is and how close it is to the low value seems to matter:

Yeah, that's because the merge join can end after reading fewer values from the "million" table. When the highest value is 1000000 it needs to read the entire "million" table.

> values(1,1),(2,1),(3,100000);
> values(1,1),(2,1),(3,10000);
> values(1,1),(2,1),(3,1000);
>
> Gets progressively less bad, again the plan is not changed, but the execution of it becomes more efficient.
>
> It seems like it goes through the records from 1 till say 10000 one by one.

This is how Merge Join works. See https://en.wikipedia.org/wiki/Sort-merge_join . There's a C# example there that shows how you have to consume values from the side with the lowest value until you find an equal value, where you join, or a higher value, at which point you must consume the other side until you find another equal value. The join ends when you run out of values on one side.

David
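
A minimal sketch of the statistics situation described above, following the values() pattern from this thread; "three" is the table name mentioned in the discussion, but the column names ("id", "val") are assumed rather than taken from the original test case:

  CREATE TABLE three (id int PRIMARY KEY, val int);
  INSERT INTO three VALUES (1, 1), (2, 1), (3, 1000000);
  VACUUM ANALYZE three;

  -- Inspect the per-column statistics ANALYZE produced for "val".
  SELECT most_common_vals, histogram_bounds
  FROM pg_stats
  WHERE tablename = 'three' AND attname = 'val';

  -- Expected, per the explanation above: most_common_vals contains only
  -- {1}, and histogram_bounds is NULL because a single leftover distinct
  -- value (1000000) is not enough to build a histogram. The merge join
  -- costing therefore never sees 1000000 as the upper bound of the merge.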
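
And a follow-up sketch (same assumed names) of the "at least 2 values" condition mentioned above, i.e. what one would expect once two distinct values remain after the MCV list:

  -- Add a row so that two distinct non-MCV values (500000 and 1000000)
  -- are left over after the MCV list is built.
  INSERT INTO three VALUES (4, 500000);
  VACUUM ANALYZE three;

  SELECT most_common_vals, histogram_bounds
  FROM pg_stats
  WHERE tablename = 'three' AND attname = 'val';

  -- Expectation under that condition: histogram_bounds is now populated
  -- and includes 1000000, so the merge join costing has a realistic
  -- upper bound for the merge to work with.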