Re: BitmapHeapScan streaming read user and prelim refactoring

Поиск

Список

Период

Сортировка

От	Tomas Vondra
Тема	Re: BitmapHeapScan streaming read user and prelim refactoring
Дата	29 февраля 2024 г. 22:44:27
Msg-id	8de6f6d7-9b9d-47a4-98cc-84bf31a314db@enterprisedb.com обсуждение исходный текст
Ответ на	Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>)
Ответы	Re: BitmapHeapScan streaming read user and prelim refactoring Re: BitmapHeapScan streaming read user and prelim refactoring
Список	pgsql-hackers

Дерево обсуждения


On 2/29/24 22:19, Melanie Plageman wrote:
> On Thu, Feb 29, 2024 at 7:54 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 2/29/24 00:40, Melanie Plageman wrote:
>>> On Wed, Feb 28, 2024 at 6:17 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2/28/24 21:06, Melanie Plageman wrote:
>>>>> On Wed, Feb 28, 2024 at 2:23 PM Tomas Vondra
>>>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>>>
>>>>>> On 2/28/24 15:56, Tomas Vondra wrote:
>>>>>>>> ...
>>>>>>>
>>>>>>> Sure, I can do that. It'll take a couple hours to get the results, I'll
>>>>>>> share them when I have them.
>>>>>>>
>>>>>>
>>>>>> Here are the results with only patches 0001 - 0012 applied (i.e. without
>>>>>> the patch introducing the streaming read API, and the patch switching
>>>>>> the bitmap heap scan to use it).
>>>>>>
>>>>>> The changes in performance don't disappear entirely, but the scale is
>>>>>> certainly much smaller - both in the complete results for all runs, and
>>>>>> for the "optimal" runs that would actually pick bitmapscan.
>>>>>
>>>>> Hmm. I'm trying to think how my refactor could have had this impact.
>>>>> It seems like all the most notable regressions are with 4 parallel
>>>>> workers. What do the numeric column labels mean across the top
>>>>> (2,4,8,16...) -- are they related to "matches"? And if so, what does
>>>>> that mean?
>>>>>
>>>>
>>>> That's the number of distinct values matched by the query, which should
>>>> be an approximation of the number of matching rows. The number of
>>>> distinct values in the data set differs by data set, but for 1M rows
>>>> it's roughly like this:
>>>>
>>>> uniform: 10k
>>>> linear: 10k
>>>> cyclic: 100
>>>>
>>>> So for example matches=128 means ~1% of rows for uniform/linear, and
>>>> 100% for cyclic data sets.
>>>
>>> Ah, thank you for the explanation. I also looked at your script after
>>> having sent this email and saw that it is clear in your script what
>>> "matches" is.
>>>
>>>> As for the possible cause, I think it's clear most of the difference
>>>> comes from the last patch that actually switches bitmap heap scan to the
>>>> streaming read API. That's mostly expected/understandable, although we
>>>> probably need to look into the regressions or cases with e_i_c=0.
>>>
>>> Right, I'm mostly surprised about the regressions for patches 0001-0012.
>>>
>>> Re eic 0: Thomas Munro and I chatted off-list, and you bring up a
>>> great point about eic 0. In old bitmapheapscan code eic 0 basically
>>> disabled prefetching but with the streaming read API, it will still
>>> issue fadvises when eic is 0. That is an easy one line fix. Thomas
>>> prefers to fix it by always avoiding an fadvise for the last buffer in
>>> a range before issuing a read (since we are about to read it anyway,
>>> best not fadvise it too). This will fix eic 0 and also cut one system
>>> call from each invocation of the streaming read machinery.
>>>
>>>> To analyze the 0001-0012 patches, maybe it'd be helpful to run tests for
>>>> individual patches. I can try doing that tomorrow. It'll have to be a
>>>> limited set of tests, to reduce the time, but might tell us whether it's
>>>> due to a single patch or multiple patches.
>>>
>>> Yes, tomorrow I planned to start trying to repro some of the "red"
>>> cases myself. Any one of the commits could cause a slight regression
>>> but a 3.5x regression is quite surprising, so I might focus on trying
>>> to repro that locally and then narrow down which patch causes it.
>>>
>>> For the non-cached regressions, perhaps the commit to use the correct
>>> recheck flag (0004) when prefetching could be the culprit. And for the
>>> cached regressions, my money is on the commit which changes the whole
>>> control flow of BitmapHeapNext() and the next_block() and next_tuple()
>>> functions (0010).
>>>
>>
>> I do have some partial results, comparing the patches. I only ran one of
>> the more affected workloads (cyclic) on the xeon, attached is a PDF
>> comparing master and the 0001-0014 patches. The percentages are timing
>> vs. the preceding patch (green - faster, red - slower).
> 
> Just confirming: the results are for uncached?
> 

Yes, cyclic data set, uncached case. I picked this because it seemed
like one of the most affected cases. Do you want me to test some other
cases too?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BitmapHeapScan streaming read user and prelim refactoring