Re: BitmapHeapScan streaming read user and prelim refactoring
From | Tomas Vondra |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | 45bed4f3-a5bf-4a34-b544-7e751bd437e1@enterprisedb.com |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>) |
Responses | Re: BitmapHeapScan streaming read user and prelim refactoring |
List | pgsql-hackers |
On 2/28/24 21:06, Melanie Plageman wrote:
> On Wed, Feb 28, 2024 at 2:23 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 2/28/24 15:56, Tomas Vondra wrote:
>>>> ...
>>>
>>> Sure, I can do that. It'll take a couple hours to get the results, I'll
>>> share them when I have them.
>>>
>>
>> Here are the results with only patches 0001 - 0012 applied (i.e. without
>> the patch introducing the streaming read API, and the patch switching
>> the bitmap heap scan to use it).
>>
>> The changes in performance don't disappear entirely, but the scale is
>> certainly much smaller - both in the complete results for all runs, and
>> for the "optimal" runs that would actually pick bitmapscan.
>
> Hmm. I'm trying to think how my refactor could have had this impact.
> It seems like all the most notable regressions are with 4 parallel
> workers. What do the numeric column labels mean across the top
> (2,4,8,16...) -- are they related to "matches"? And if so, what does
> that mean?
>

That's the number of distinct values matched by the query, which should
be an approximation of the number of matching rows. The number of
distinct values in the data set differs by data set, but for 1M rows
it's roughly like this:

    uniform: 10k
    linear: 10k
    cyclic: 100

So for example matches=128 means ~1% of rows for uniform/linear, and
100% for cyclic data sets.

As for the possible cause, I think it's clear most of the difference
comes from the last patch that actually switches bitmap heap scan to the
streaming read API. That's mostly expected/understandable, although we
probably need to look into the regressions or cases with e_i_c=0.

To analyze the 0001-0012 patches, maybe it'd be helpful to run tests for
individual patches. I can try doing that tomorrow. It'll have to be a
limited set of tests, to reduce the time, but might tell us whether it's
due to a single patch or multiple patches.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
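
The "matches" arithmetic in the message above can be sketched roughly as follows. This is only an illustration, assuming each distinct value covers about the same number of rows; the names `DISTINCT_VALUES` and `matched_fraction` are hypothetical and do not come from the benchmark scripts.

```python
# Approximate distinct-value counts for a 1M-row table, as quoted in the message.
DISTINCT_VALUES = {"uniform": 10_000, "linear": 10_000, "cyclic": 100}


def matched_fraction(dataset: str, matches: int) -> float:
    """Estimate the fraction of rows matched by a query.

    Assumption (not from the benchmark itself): every distinct value
    covers roughly the same number of rows, so the fraction of rows is
    matches / distinct values, capped at 1.0 (all rows).
    """
    return min(1.0, matches / DISTINCT_VALUES[dataset])


if __name__ == "__main__":
    # Print the estimated fraction of rows for matches=128 per data set.
    for name in DISTINCT_VALUES:
        print(f"{name}: {matched_fraction(name, 128):.2%}")
```

Under that assumption, matches=128 gives 128/10000, i.e. about 1% of rows for the uniform and linear data sets, and the cap makes it 100% for cyclic, which agrees with the figures in the message.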