Обсуждение: Re: Some ExecSeqScan optimizations

Поиск
Список
Период
Сортировка

Re: Some ExecSeqScan optimizations

От
Vladlen Popolitov
Дата:
Amit Langote писал(а) 2025-01-10 16:22:
> On Fri, Jan 10, 2025 at 1:06 PM David Rowley <dgrowleyml@gmail.com> 
> wrote:
>> On Fri, 10 Jan 2025 at 02:46, Amit Langote <amitlangote09@gmail.com> 
>> wrote:
>> >
>> > On Mon, Jan 6, 2025 at 10:18 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> > > I've attached my workings of what I was messing around with. It seems
>> > > to perform about the same as your version. I think maybe we'd need
>> > > some sort of execScan.h instead of where I've stuffed the functions
>> > > in.
>> >
>> > I've done that in the attached v2.
>> 
>> I think 0001 looks ok, aside from what the attached fixes. (at least
>> one is my mistake)
> 
> Oops, thanks for the fixes.  Attaching an updated version.
> 
>> Did you test the performance of 0002? I didn't look at it hard enough
>> to understand what you've done.
> 
> I reran the test suite I used before and I don't see a consistent
> improvement due to 0002 or perhaps rather degradation.  I've saved the
> results in the sheet named 2025-01-10 in the spreadsheet at [1].
> 
> Comparing the latency for the query `select count(*) from test_table
> where <first_column> = <nonexistant_value>` (where test_table has 30
> integer columns and 1 million rows in it) between v17, master, and the
> patched (0001 or 0001+0002) shows an improvement of close to 10% with
> the patch.
> 
> -- v17
> select count(*) from test_table where a_1 = 1000000;
>  count
> -------
>      0
> (1 row)
> Time: 286.618 ms
> 
> -- master
> select count(*) from test_table where a_1 = 1000000;
>  count
> -------
>      0
> (1 row)
> Time: 283.564 ms
> 
> -- patched (0001+0002)
> select count(*) from test_table where a_1 = 1000000;
>  count
> -------
>      0
> (1 row)
> Time: 260.547 ms
> 
> Note that I turned off Gather for these tests, because then I find
> that the improvements to ExecScan() are better measurable.
> 
>> I can look if performance tests show
>> that it might be worthwhile considering.
> 
> Sure, that would be great.
Hi

Could you clarify, how do you get this improvements (283 ms to 260 ms) 
in this patch?
I see additional code ( if ... else if ... else if ...) and the same 
function declared
as inline, but it is called by pointer as before, and it does not 
matter, that it is
declared as inline.

  In case of query
select count(*) from test_table where a_1 = 1000000;
I would expect increase of query time due to additional if...else . It 
is not clear
what code was eliminated to decrease query time.

-- 
Best regards,

Vladlen Popolitov.



Re: Some ExecSeqScan optimizations

От
David Rowley
Дата:
On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
>   In case of query
> select count(*) from test_table where a_1 = 1000000;
> I would expect increase of query time due to additional if...else . It
> is not clear
> what code was eliminated to decrease query time.

Are you talking about the code added to ExecInitSeqScan() to determine
which node function to call? If so, that's only called during executor
startup.  The idea here is to reduce the branching during execution by
calling one of those special functions which has a more specialised
version of the ExecScan code for the particular purpose it's going to
be used for.

David



Re: Some ExecSeqScan optimizations

От
Vladlen Popolitov
Дата:
Amit Langote писал(а) 2025-01-10 18:22:
> On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com> 
> wrote:
>> On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
>> <v.popolitov@postgrespro.ru> wrote:
>> >   In case of query
>> > select count(*) from test_table where a_1 = 1000000;
>> > I would expect increase of query time due to additional if...else . It
>> > is not clear
>> > what code was eliminated to decrease query time.
>> 
>> Are you talking about the code added to ExecInitSeqScan() to determine
>> which node function to call? If so, that's only called during executor
>> startup.  The idea here is to reduce the branching during execution by
>> calling one of those special functions which has a more specialised
>> version of the ExecScan code for the particular purpose it's going to
>> be used for.
> 
> Looks like I hadn't mentioned this key aspect of the patch in the
> commit message, so did that in the attached.
> 
> Vladlen, does what David wrote and the new commit message answer your
> question(s)?

Hi Amit,

  Yes, David clarified the idea, but it is still hard to believe in 5% of
improvements.
The query
select count(*) from test_table where a_1 = 1000000;
has both qual and projection, and ExecScanExtended() will be generated
similar to ExecScan() (the same not NULL values to check in if()).
  Do you have some scripts to reproduce your benchmark?
-- 
Best regards,

Vladlen Popolitov.



Re: Some ExecSeqScan optimizations

От
Junwang Zhao
Дата:
On Fri, Jan 10, 2025 at 10:49 PM Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
>
> Amit Langote писал(а) 2025-01-10 18:22:
> > On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com>
> > wrote:
> >> On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
> >> <v.popolitov@postgrespro.ru> wrote:
> >> >   In case of query
> >> > select count(*) from test_table where a_1 = 1000000;
> >> > I would expect increase of query time due to additional if...else . It
> >> > is not clear
> >> > what code was eliminated to decrease query time.
> >>
> >> Are you talking about the code added to ExecInitSeqScan() to determine
> >> which node function to call? If so, that's only called during executor
> >> startup.  The idea here is to reduce the branching during execution by
> >> calling one of those special functions which has a more specialised
> >> version of the ExecScan code for the particular purpose it's going to
> >> be used for.
> >
> > Looks like I hadn't mentioned this key aspect of the patch in the
> > commit message, so did that in the attached.
> >
> > Vladlen, does what David wrote and the new commit message answer your
> > question(s)?
>
> Hi Amit,
>
>   Yes, David clarified the idea, but it is still hard to believe in 5% of
> improvements.
> The query
> select count(*) from test_table where a_1 = 1000000;
> has both qual and projection, and ExecScanExtended() will be generated
> similar to ExecScan() (the same not NULL values to check in if()).
>   Do you have some scripts to reproduce your benchmark?

The benchmark is provided [0].

Here is a rough comparison of compiled variants' assembly code.

<ExecSeqScan>: start 2b8590, end 2b868c => 252
<ExecSeqScanWithProject>: start 2b8034, end 2b8140 => 268
<ExecSeqScanWithQual>: start 2b8144, end 2b831c => 472
<ExecSeqScanWithQualProject>: start 2b8320, end 2b858c => 620

Before Amit's optimization, it was basically called the
ExecSeqScanWithQualProject, you
can see the other 3 variants all have some reduction in function size.

> --
> Best regards,
>
> Vladlen Popolitov.

[0] https://docs.google.com/spreadsheets/d/1AsJOUgIfSsYIJUJwbXk4aO9FVOFOrBCvrfmdQYkHIw4/edit?usp=sharing

--
Regards
Junwang Zhao



Re: Some ExecSeqScan optimizations

От
Junwang Zhao
Дата:
On Sat, Jan 11, 2025 at 4:57 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Fri, Jan 10, 2025 at 10:49 PM Vladlen Popolitov
> <v.popolitov@postgrespro.ru> wrote:
> >
> > Amit Langote писал(а) 2025-01-10 18:22:
> > > On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com>
> > > wrote:
> > >> On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
> > >> <v.popolitov@postgrespro.ru> wrote:
> > >> >   In case of query
> > >> > select count(*) from test_table where a_1 = 1000000;
> > >> > I would expect increase of query time due to additional if...else . It
> > >> > is not clear
> > >> > what code was eliminated to decrease query time.
> > >>
> > >> Are you talking about the code added to ExecInitSeqScan() to determine
> > >> which node function to call? If so, that's only called during executor
> > >> startup.  The idea here is to reduce the branching during execution by
> > >> calling one of those special functions which has a more specialised
> > >> version of the ExecScan code for the particular purpose it's going to
> > >> be used for.
> > >
> > > Looks like I hadn't mentioned this key aspect of the patch in the
> > > commit message, so did that in the attached.
> > >
> > > Vladlen, does what David wrote and the new commit message answer your
> > > question(s)?
> >
> > Hi Amit,
> >
> >   Yes, David clarified the idea, but it is still hard to believe in 5% of
> > improvements.
> > The query
> > select count(*) from test_table where a_1 = 1000000;
> > has both qual and projection, and ExecScanExtended() will be generated
> > similar to ExecScan() (the same not NULL values to check in if()).
> >   Do you have some scripts to reproduce your benchmark?
>
> The benchmark is provided [0].
>
> Here is a rough comparison of compiled variants' assembly code.
>
> <ExecSeqScan>: start 2b8590, end 2b868c => 252
> <ExecSeqScanWithProject>: start 2b8034, end 2b8140 => 268
> <ExecSeqScanWithQual>: start 2b8144, end 2b831c => 472
> <ExecSeqScanWithQualProject>: start 2b8320, end 2b858c => 620

Here is my compile options:

meson setup $HOME/build --prefix=$HOME/pgsql --buildtype=release

and I use `objdump -D postgres | less` to see the assembly code.

>
> Before Amit's optimization, it was basically called the
> ExecSeqScanWithQualProject, you
> can see the other 3 variants all have some reduction in function size.
>
> > --
> > Best regards,
> >
> > Vladlen Popolitov.
>
> [0] https://docs.google.com/spreadsheets/d/1AsJOUgIfSsYIJUJwbXk4aO9FVOFOrBCvrfmdQYkHIw4/edit?usp=sharing
>
> --
> Regards
> Junwang Zhao



--
Regards
Junwang Zhao



Re: Some ExecSeqScan optimizations

От
Junwang Zhao
Дата:
Hi Amit,

On Fri, Jan 10, 2025 at 7:22 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
> > <v.popolitov@postgrespro.ru> wrote:
> > >   In case of query
> > > select count(*) from test_table where a_1 = 1000000;
> > > I would expect increase of query time due to additional if...else . It
> > > is not clear
> > > what code was eliminated to decrease query time.
> >
> > Are you talking about the code added to ExecInitSeqScan() to determine
> > which node function to call? If so, that's only called during executor
> > startup.  The idea here is to reduce the branching during execution by
> > calling one of those special functions which has a more specialised
> > version of the ExecScan code for the particular purpose it's going to
> > be used for.
>
> Looks like I hadn't mentioned this key aspect of the patch in the
> commit message, so did that in the attached.

Thanks for updating the patch. While seeing the patch, the es_epq_active
confused me a little bit mostly because its name, a field name ending with
"active" typically suggests a boolean value, but here it is not, should we
change it to sth like es_epqstate? However this is not related to this patch,
I can start a new thread if you think this is worth a patch.

There is one tiny indent issue(my IDE does this automatically), which I
guess you will fix before committing.

-       EPQState *epqstate;
+       EPQState   *epqstate;

>
> Vladlen, does what David wrote and the new commit message answer your
> question(s)?
>
> --
> Thanks, Amit Langote



--
Regards
Junwang Zhao



Re: Some ExecSeqScan optimizations

От
Amit Langote
Дата:
Hi Vladlen,

On Fri, Jan 10, 2025 at 11:49 PM Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
> Amit Langote писал(а) 2025-01-10 18:22:
> > On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com>
> > wrote:
> >> On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
> >> <v.popolitov@postgrespro.ru> wrote:
> >> >   In case of query
> >> > select count(*) from test_table where a_1 = 1000000;
> >> > I would expect increase of query time due to additional if...else . It
> >> > is not clear
> >> > what code was eliminated to decrease query time.
> >>
> >> Are you talking about the code added to ExecInitSeqScan() to determine
> >> which node function to call? If so, that's only called during executor
> >> startup.  The idea here is to reduce the branching during execution by
> >> calling one of those special functions which has a more specialised
> >> version of the ExecScan code for the particular purpose it's going to
> >> be used for.
> >
> > Looks like I hadn't mentioned this key aspect of the patch in the
> > commit message, so did that in the attached.
> >
> > Vladlen, does what David wrote and the new commit message answer your
> > question(s)?
>
> Hi Amit,
>
>   Yes, David clarified the idea, but it is still hard to believe in 5% of
> improvements.
> The query
> select count(*) from test_table where a_1 = 1000000;
> has both qual and projection, and ExecScanExtended() will be generated
> similar to ExecScan() (the same not NULL values to check in if()).

Yes, I've noticed that if the plan for the above query contains a
projection, like when it contains a Gather node, the inlined version
of ExecScanExtended() will look more or less the same as the full
ExecScan().  There won't be noticeable speedup with the patch in that
case.

However, I ran the benchmark tests with Gather disabled such that I
get a plan without projection, which uses an inlined version that
doesn't have branches related to projection.  I illustrate my example
below.

>   Do you have some scripts to reproduce your benchmark?

Use these steps.  Set max_parallel_workers_per_gather to 0,
shared_buffers to 512MB.  Compile the patch using --buildtype=release.

create table foo (a int, b int, c int, d int, e int);
insert into foo select i, i, i, i, i from generate_series(1, 10000000) i;

-- pg_prewarm: to ensure that no buffers lead to I/O to reduce noise
select pg_size_pretty(pg_prewarm('foo'));

select count(*) from foo where a = 10000000;

Times I get on v17, master, and with the patch for the above query are
as follows:

v17: 173, 173, 174 ms

master: 173, 175, 169 ms

Patched: 160, 161, 158 ms

Please let me know if you're still unable to reproduce such numbers
with the steps I described.


--
Thanks, Amit Langote



Re: Some ExecSeqScan optimizations

От
Amit Langote
Дата:
Hi Junwang,

On Sat, Jan 11, 2025 at 7:39 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
> Hi Amit,
>
> On Fri, Jan 10, 2025 at 7:22 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Fri, Jan 10, 2025 at 7:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > > On Fri, 10 Jan 2025 at 22:53, Vladlen Popolitov
> > > <v.popolitov@postgrespro.ru> wrote:
> > > >   In case of query
> > > > select count(*) from test_table where a_1 = 1000000;
> > > > I would expect increase of query time due to additional if...else . It
> > > > is not clear
> > > > what code was eliminated to decrease query time.
> > >
> > > Are you talking about the code added to ExecInitSeqScan() to determine
> > > which node function to call? If so, that's only called during executor
> > > startup.  The idea here is to reduce the branching during execution by
> > > calling one of those special functions which has a more specialised
> > > version of the ExecScan code for the particular purpose it's going to
> > > be used for.
> >
> > Looks like I hadn't mentioned this key aspect of the patch in the
> > commit message, so did that in the attached.
>
> Thanks for updating the patch. While seeing the patch, the es_epq_active
> confused me a little bit mostly because its name, a field name ending with
> "active" typically suggests a boolean value, but here it is not, should we
> change it to sth like es_epqstate? However this is not related to this patch,
> I can start a new thread if you think this is worth a patch.

Yeah, the name has confused me as well from time to time.

Though it might be a good idea to dig the thread that led to the
introduction of this field to find out if the naming has some logic
we're missing.

You may start a new thread to get the attention of other folks who
might have some clue.

> There is one tiny indent issue(my IDE does this automatically), which I
> guess you will fix before committing.
>
> -       EPQState *epqstate;
> +       EPQState   *epqstate;

Thanks for the heads up.

--
Thanks, Amit Langote



Re: Some ExecSeqScan optimizations

От
Amit Langote
Дата:
On Fri, Jan 17, 2025 at 2:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> Here's v5 with a few commit message tweaks.
>
> Barring objections, I would like to push this early next week.

Pushed yesterday.  Thank you all.

--
Thanks, Amit Langote