Re: BUG #19332: Sudden 330x performance degradation of SELECT amid INSERTs
| From | David Rowley |
|---|---|
| Subject | Re: BUG #19332: Sudden 330x performance degradation of SELECT amid INSERTs |
| Date | |
| Msg-id | CAApHDvqe5Gk-9mzJoutVkwMzLV9HOPGg9ZCdYze48hsLUucSow@mail.gmail.com |
| In reply to | Re: BUG #19332: Sudden 330x performance degradation of SELECT amid INSERTs (Sergey Naumov <sknaumov@gmail.com>) |
| Responses | Re: BUG #19332: Sudden 330x performance degradation of SELECT amid INSERTs |
| List | pgsql-bugs |
On Thu, 4 Dec 2025 at 23:20, Sergey Naumov <sknaumov@gmail.com> wrote:

> I've collected slow and fast query plans and it looks like when data is cleaned up, PostgreSQL doesn't know what table is big and what is small, and when data generation is in one big transaction, data from this uncommitted transaction already affects SELECT queries, but VACUUM doesn't see this uncommitted data to adjust stats => the query planner could come up with a suboptimal query plan.
>
> But still, the query itself has a hint as to which table has to be filtered first - there is a WHERE clause to keep just one line from this table. But the query planner decides to join another (very big) table first => performance degrades by orders of magnitude.
>
> To me it looks like a flaw in query planner logic that, having no data about the tables' content, ignores the WHERE clause that hints which table has to be processed first => not sure whether it should be treated as a performance issue or a bug.

I've studied this for a bit and pretty much feel like I've got one hand tied behind my back due to the format you've submitted the EXPLAIN in. I don't really understand the visual format, as details I'm used to seeing are not there, and the JSON "raw" format isn't much better for humans to read, as it doesn't fit on a page.

From what I see, it looks like the slow version opts to build the hash table for the hash join from an Index Scan on an index on test_run, but that index is bloated. For the fast plan, the bloated index isn't scanned because that same index is on the inner side of a nested loop and the outer side didn't find any rows, resulting in the inner side never being executed.
From looking at get_relation_info(), I see we do fetch the actual size of the index with "info->pages = RelationGetNumberOfBlocks(indexRelation);", and from looking at genericcostestimate(), we should estimate the number of pages with "numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);", so I think if the index did contain the 74962 pages when the query was planned, then the costs should have known about it.

Are you using prepared statements here? (I suspect not, since the UUID is in the EXPLAIN output, but you could have hardcoded that.)

> Query plans are attached as PEV2 standalone HTML pages.

I imagine some people might like this format, but I can barely read it. You'll probably have more luck around here with EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT).

And for me, I've not really seen enough evidence that there's any bug here. If you think there is, then you might need to work a bit harder and provide a script that we can reproduce this with ourselves.

David