Re: BUG #16280: dead tuples (probably) effect plan and query performance
From | James Coleman
---|---
Subject | Re: BUG #16280: dead tuples (probably) effect plan and query performance
Date | |
Msg-id | CAAaqYe96jM14OuvjYh2bb2EyR=bCjWGwB4sVEFFM_sVMhHfB6g@mail.gmail.com
In reply to | Re: BUG #16280: dead tuples (probably) effect plan and query performance ("Serbin, Ilya" <iserbin@bostonsd.ru>)
Responses | Re: BUG #16280: dead tuples (probably) effect plan and query performance
List | pgsql-bugs
On Wed, Mar 11, 2020 at 6:51 AM Serbin, Ilya <iserbin@bostonsd.ru> wrote:
>
> Hello,
> Asking again just in case someone can help to find out why this is happening in my case and whether it is a bug.
>
> Fri, Feb 28, 2020 at 12:25, Serbin, Ilya <iserbin@bostonsd.ru>:
>>
>> Hello, Tom!
>> Thanks for your answer. My concern is that the plan changes after a relatively small number of dead tuples. The bad plan is generated when table1 contains 300-400 dead tuples. That is only 0.07%-0.08% of the whole table (409k+ entries).
>> In addition, the table is growing and currently holds 425k entries. However, even at that size the plan stays good until the number of dead tuples reaches 300-400; after that the plan changes to the bad one.
>> As I said, I tried analyzing the table with various default_statistics_target values (100-1000 in steps of 100) -- the plan stays bad. I tried setting random_page_cost=0.1 and seq_page_cost=1 (2, 3, 4, etc.). The plan changed to the good one only starting from random_page_cost=0.1 and seq_page_cost=8. However, once I ran VACUUM, the plan changed to the good one and stayed the same even when I set random_page_cost=30 and seq_page_cost=1.
>> I realize that I could set autovacuum thresholds for this table to trigger it once dead tuples reach 300, but that doesn't seem right to me (this number of changes happens in something like 5 minutes, and the table is ~2GB as of now).
>>
>> Why does such a small amount (0.07%) of dead tuples change cost estimations so dramatically? Or am I missing something, and dead tuples have nothing to do with it?

That looks like an almost impossible query to plan correctly -- the planner would need some kind of stats on fields internal to the JSON blob, but it can't have that, since the blob is necessarily schemaless (at least from PG's perspective). That's why Tom is wondering about making the representation map more closely to a SQL/relational table model with discrete columns.
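[Editor's note: the per-table autovacuum tuning Ilya mentions could be sketched as follows. This is a hypothetical example; `table1` stands in for the real table name, which is not shown in full in the thread.]

```sql
-- Hypothetical: fire autovacuum on table1 once roughly 300 tuples are dead,
-- independent of table size (scale_factor = 0 disables the percentage term).
ALTER TABLE table1 SET (
    autovacuum_vacuum_threshold = 300,
    autovacuum_vacuum_scale_factor = 0
);
```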
Alternatively, have you tried adding expression (function) indexes on the (it looks like) ~2 paths you're querying in the JSON field?

It's not impossible that there's a bug in the planner here, but the opaqueness of this query to the planner is why that's not anyone's first assumption. To show otherwise, someone would have to step through the execution and show that it's not just reaching some cutoff (one that makes somewhat reasonable sense for how hard this query is to plan), or that the cutoff is being calculated incorrectly.

James
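[Editor's note: a sketch of the expression indexes James suggests, assuming a jsonb column named `data` and hypothetical key names `a` and `b` -- the actual column and paths are not shown in the thread. A useful side effect is that ANALYZE then collects statistics on the indexed expressions, which is exactly what the planner otherwise lacks for fields inside the blob.]

```sql
-- Hypothetical column and key names; substitute the paths the query filters on.
CREATE INDEX idx_table1_data_a ON table1 ((data ->> 'a'));
CREATE INDEX idx_table1_data_b ON table1 ((data ->> 'b'));
-- Re-analyze so the planner picks up stats on the indexed expressions.
ANALYZE table1;
```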