Обсуждение: Re: Some ExecSeqScan optimizations

Поиск

Список

Период

Сортировка

Re: Some ExecSeqScan optimizations

От

Andres Freund

Дата:

10 июля 2025 г., 02:34:21

Hi,

On 2025-01-22 10:07:51 +0900, Amit Langote wrote:
> On Fri, Jan 17, 2025 at 2:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > Here's v5 with a few commit message tweaks.
> >
> > Barring objections, I would like to push this early next week.
> 
> Pushed yesterday.  Thank you all.

While looking at a profile I recently noticed that ExecSeqScanWithQual() had a
runtime branch to test whether qual is NULL, which seemed a bit silly. I think
we should use pg_assume(), which I just added to avoid a compiler warning, to
improve the code generation here.

The performance gain unsurprisingly isn't significant (but seems repeatably
measureable), but it does cut out a fair bit of unnecessary code.

andres@awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
   text       data        bss        dec        hex    filename
   3330          0          0       3330        d02    executor_nodeSeqscan.c.assume.o
   3834          0          0       3834        efa    executor_nodeSeqscan.c.o

A 13% reduction in actual code size isn't bad for such a small change, imo.

I have a separate question as well - do we need to call ResetExprContext() if
we neither qual, projection nor epq?  I see a small gain by avoiding that.

Greetings,

Andres Freund

Вложения

v1-0001-Optimize-seqscan-code-generation-using-pg_assume.patch

Re: Some ExecSeqScan optimizations

От

Amit Langote

Дата:

10 июля 2025 г., 11:28:50

Hi Andres,

On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres@anarazel.de> wrote:
> On 2025-01-22 10:07:51 +0900, Amit Langote wrote:
> > On Fri, Jan 17, 2025 at 2:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > Here's v5 with a few commit message tweaks.
> > >
> > > Barring objections, I would like to push this early next week.
> >
> > Pushed yesterday.  Thank you all.
>
> While looking at a profile I recently noticed that ExecSeqScanWithQual() had a
> runtime branch to test whether qual is NULL, which seemed a bit silly. I think
> we should use pg_assume(), which I just added to avoid a compiler warning, to
> improve the code generation here.

+1. I think this might be what David was getting at in his first
message in this thread.

> The performance gain unsurprisingly isn't significant (but seems repeatably
> measureable), but it does cut out a fair bit of unnecessary code.
>
> andres@awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
>    text    data     bss     dec     hex filename
>    3330       0       0    3330     d02 executor_nodeSeqscan.c.assume.o
>    3834       0       0    3834     efa executor_nodeSeqscan.c.o
>
> A 13% reduction in actual code size isn't bad for such a small change, imo.

Yeah, that seems worthwhile. I had been a bit concerned about code
size growth from having four variant functions with at least some
duplication, so this is a nice offset.

Thanks for the patch.

+    /*
+     * Use pg_assume() for != NULL tests to make the compiler realize no
+     * runtime check for the field is needed in ExecScanExtended().
+     */

I propose changing "to make the compiler realize no runtime check" to
"so the compiler can optimize away the runtime check", assuming that
is what it means.

Also, I assume you intentionally avoided repeating the comment in all
the variant functions.

> I have a separate question as well - do we need to call ResetExprContext() if
> we neither qual, projection nor epq?  I see a small gain by avoiding that.

You're referring to this block, I assume:

    /*
     * If we have neither a qual to check nor a projection to do, just skip
     * all the overhead and return the raw scan tuple.
     */
    if (!qual && !projInfo)
    {
        ResetExprContext(econtext);
        return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
    }

Yeah, I think it's fine to remove ResetExprContext() here. When I
looked at it before, I left it in because I was unsure whether
accessMtd() might leak memory into the per-tuple context. But on
second thought, that seems unlikely?  Would you like me to do it or do
you have a patch in your tree already?

--
Thanks, Amit Langote

Re: Some ExecSeqScan optimizations

От

Andres Freund

Дата:

10 июля 2025 г., 23:55:33

Hi,

On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2025-01-22 10:07:51 +0900, Amit Langote wrote:
> > > On Fri, Jan 17, 2025 at 2:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > > Here's v5 with a few commit message tweaks.
> > > >
> > > > Barring objections, I would like to push this early next week.
> > >
> > > Pushed yesterday.  Thank you all.
> >
> > While looking at a profile I recently noticed that ExecSeqScanWithQual() had a
> > runtime branch to test whether qual is NULL, which seemed a bit silly. I think
> > we should use pg_assume(), which I just added to avoid a compiler warning, to
> > improve the code generation here.
> 
> +1. I think this might be what David was getting at in his first
> message in this thread.

Indeed.


> > The performance gain unsurprisingly isn't significant (but seems repeatably
> > measureable), but it does cut out a fair bit of unnecessary code.
> >
> > andres@awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
> >    text    data     bss     dec     hex filename
> >    3330       0       0    3330     d02 executor_nodeSeqscan.c.assume.o
> >    3834       0       0    3834     efa executor_nodeSeqscan.c.o
> >
> > A 13% reduction in actual code size isn't bad for such a small change, imo.
> 
> Yeah, that seems worthwhile. I had been a bit concerned about code
> size growth from having four variant functions with at least some
> duplication, so this is a nice offset.

I'm rather surprised by just how much the size reduces...

I built nodeSeqscan.c with -ffunction-sections and looked at the size with
size --format=sysv:

Before:
.text.SeqRecheck                               6      0
.rodata.str1.8                               135      0
.text.unlikely.SeqNext                        53      0
.text.SeqNext                                178      0
.text.ExecSeqScanEPQ                          20      0
.text.ExecSeqScanWithProject                 289      0
.text.unlikely.ExecSeqScanWithQual            53      0
.text.ExecSeqScanWithQual                    441      0
.text.unlikely.ExecSeqScanWithQualProject     53      0
.text.ExecSeqScanWithQualProject             811      0
.text.unlikely.ExecSeqScan                    53      0
.text.ExecSeqScan                            245      0
.text.ExecInitSeqScan                        287      0
.text.ExecEndSeqScan                          33      0
.text.ExecReScanSeqScan                       63      0
.text.ExecSeqScanEstimate                     88      0
.text.ExecSeqScanInitializeDSM               114      0
.text.ExecSeqScanReInitializeDSM              34      0
.text.ExecSeqScanInitializeWorker             64      0

After:
.text.SeqRecheck                               6      0
.rodata.str1.8                               135      0
.text.unlikely.SeqNext                        53      0
.text.SeqNext                                178      0
.text.ExecSeqScanEPQ                          20      0
.text.ExecSeqScanWithProject                 209      0
.text.unlikely.ExecSeqScanWithQual            53      0
.text.ExecSeqScanWithQual                    373      0
.text.unlikely.ExecSeqScanWithQualProject     53      0
.text.ExecSeqScanWithQualProject             474      0
.text.unlikely.ExecSeqScan                    53      0
.text.ExecSeqScan                            245      0
.text.ExecInitSeqScan                        287      0
.text.ExecEndSeqScan                          33      0
.text.ExecReScanSeqScan                       63      0
.text.ExecSeqScanEstimate                     88      0
.text.ExecSeqScanInitializeDSM               114      0
.text.ExecSeqScanReInitializeDSM              34      0
.text.ExecSeqScanInitializeWorker             64      0


I'm rather baffled that the size of ExecSeqScanWithQualProject goes from 811
to 474, just due to those null checks being removed...  But I'll take it.



> Thanks for the patch.
> 
> +    /*
> +     * Use pg_assume() for != NULL tests to make the compiler realize no
> +     * runtime check for the field is needed in ExecScanExtended().
> +     */
> 
> I propose changing "to make the compiler realize no runtime check" to
> "so the compiler can optimize away the runtime check", assuming that
> is what it means.

It does.  I don't really see a meaningful difference between the comments?


> Also, I assume you intentionally avoided repeating the comment in all
> the variant functions.

Yea, it didn't seem helpful to do so.


> > I have a separate question as well - do we need to call ResetExprContext() if
> > we neither qual, projection nor epq?  I see a small gain by avoiding that.
> 
> You're referring to this block, I assume:
> 
>     /*
>      * If we have neither a qual to check nor a projection to do, just skip
>      * all the overhead and return the raw scan tuple.
>      */
>     if (!qual && !projInfo)
>     {
>         ResetExprContext(econtext);
>         return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
>     }

Yep.


> Yeah, I think it's fine to remove ResetExprContext() here. When I
> looked at it before, I left it in because I was unsure whether
> accessMtd() might leak memory into the per-tuple context.

It's a good question.  I think I unfortunately found a problematic case,
ForeignNext().

I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
call for the common case that no reset is needed. Right now we can't just
check ->isReset in an inline function, because we also delete children.  I
wonder if we could define isReset so that creating a child context unsets
isReset?

Greetings,

Andres Freund

Re: Some ExecSeqScan optimizations

От

Amit Langote

Дата:

11 июля 2025 г., 05:22:36

On Fri, Jul 11, 2025 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:
> On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres@anarazel.de> wrote:
> > > The performance gain unsurprisingly isn't significant (but seems repeatably
> > > measureable), but it does cut out a fair bit of unnecessary code.
> > >
> > > andres@awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
> > >    text    data     bss     dec     hex filename
> > >    3330       0       0    3330     d02 executor_nodeSeqscan.c.assume.o
> > >    3834       0       0    3834     efa executor_nodeSeqscan.c.o
> > >
> > > A 13% reduction in actual code size isn't bad for such a small change, imo.
> >
> > Yeah, that seems worthwhile. I had been a bit concerned about code
> > size growth from having four variant functions with at least some
> > duplication, so this is a nice offset.
>
> I'm rather surprised by just how much the size reduces...
>
> I built nodeSeqscan.c with -ffunction-sections and looked at the size with
> size --format=sysv:
>
> Before:
> .text.SeqRecheck                               6      0
> .rodata.str1.8                               135      0
> .text.unlikely.SeqNext                        53      0
> .text.SeqNext                                178      0
> .text.ExecSeqScanEPQ                          20      0
> .text.ExecSeqScanWithProject                 289      0
> .text.unlikely.ExecSeqScanWithQual            53      0
> .text.ExecSeqScanWithQual                    441      0
> .text.unlikely.ExecSeqScanWithQualProject     53      0
> .text.ExecSeqScanWithQualProject             811      0
> .text.unlikely.ExecSeqScan                    53      0
> .text.ExecSeqScan                            245      0
> .text.ExecInitSeqScan                        287      0
> .text.ExecEndSeqScan                          33      0
> .text.ExecReScanSeqScan                       63      0
> .text.ExecSeqScanEstimate                     88      0
> .text.ExecSeqScanInitializeDSM               114      0
> .text.ExecSeqScanReInitializeDSM              34      0
> .text.ExecSeqScanInitializeWorker             64      0
>
> After:
> .text.SeqRecheck                               6      0
> .rodata.str1.8                               135      0
> .text.unlikely.SeqNext                        53      0
> .text.SeqNext                                178      0
> .text.ExecSeqScanEPQ                          20      0
> .text.ExecSeqScanWithProject                 209      0
> .text.unlikely.ExecSeqScanWithQual            53      0
> .text.ExecSeqScanWithQual                    373      0
> .text.unlikely.ExecSeqScanWithQualProject     53      0
> .text.ExecSeqScanWithQualProject             474      0
> .text.unlikely.ExecSeqScan                    53      0
> .text.ExecSeqScan                            245      0
> .text.ExecInitSeqScan                        287      0
> .text.ExecEndSeqScan                          33      0
> .text.ExecReScanSeqScan                       63      0
> .text.ExecSeqScanEstimate                     88      0
> .text.ExecSeqScanInitializeDSM               114      0
> .text.ExecSeqScanReInitializeDSM              34      0
> .text.ExecSeqScanInitializeWorker             64      0
>
>
> I'm rather baffled that the size of ExecSeqScanWithQualProject goes from 811
> to 474, just due to those null checks being removed...  But I'll take it.

Wow, indeed.

> > Thanks for the patch.
> >
> > +    /*
> > +     * Use pg_assume() for != NULL tests to make the compiler realize no
> > +     * runtime check for the field is needed in ExecScanExtended().
> > +     */
> >
> > I propose changing "to make the compiler realize no runtime check" to
> > "so the compiler can optimize away the runtime check", assuming that
> > is what it means.
>
> It does.  I don't really see a meaningful difference between the comments?

Maybe not. I just had to pause for a moment to be sure that was what
it actually meant when I first read it. I'm fine leaving it as is if
you prefer.

> > > I have a separate question as well - do we need to call ResetExprContext() if
> > > we neither qual, projection nor epq?  I see a small gain by avoiding that.
> >
> > You're referring to this block, I assume:
> >
> >     /*
> >      * If we have neither a qual to check nor a projection to do, just skip
> >      * all the overhead and return the raw scan tuple.
> >      */
> >     if (!qual && !projInfo)
> >     {
> >         ResetExprContext(econtext);
> >         return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
> >     }
>
> Yep.
>
>
> > Yeah, I think it's fine to remove ResetExprContext() here. When I
> > looked at it before, I left it in because I was unsure whether
> > accessMtd() might leak memory into the per-tuple context.
>
> It's a good question.  I think I unfortunately found a problematic case,
> ForeignNext().

Ah, so we do have a culprit in the tree.

> I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
> call for the common case that no reset is needed. Right now we can't just
> check ->isReset in an inline function, because we also delete children.  I
> wonder if we could define isReset so that creating a child context unsets
> isReset?

Were you thinking ResetExprContext() could become something like:

#define ResetExprContext(econtext) \
    do { \
        if (!((econtext)->ecxt_per_tuple_memory)->isReset) \
            MemoryContextReset((econtext)->ecxt_per_tuple_memory); \
    } while (0)

that is, once isReset also accounts for whether any child context exists?

--
Thanks, Amit Langote

Re: Some ExecSeqScan optimizations

От

Nikita Malakhov

Дата:

11 июля 2025 г., 14:03:42

Hi Amit!

It's a pity I missed this thread when you developed the patch.

I've developed a feature recently and discovered that SeqScan

does not make use of scan keys, and there is a Tom Lane's

comment regarding this:

* Note that unlike IndexScan, SeqScan never use keys in heap_beginscan
* (and this is very bad) - so, here we do not check are keys ok or not.

Have you considered passing scan keys like it is done in IndexScan?

Thanks!

Regards,

Nikita Malakhov

Postgres Professional

The Russian Postgres Company

https://postgrespro.ru/

Re: Some ExecSeqScan optimizations

От

Andres Freund

Дата:

11 июля 2025 г., 17:34:33

Hi,

On 2025-07-11 11:22:36 +0900, Amit Langote wrote:
> On Fri, Jul 11, 2025 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > > Thanks for the patch.
> > >
> > > +    /*
> > > +     * Use pg_assume() for != NULL tests to make the compiler realize no
> > > +     * runtime check for the field is needed in ExecScanExtended().
> > > +     */
> > >
> > > I propose changing "to make the compiler realize no runtime check" to
> > > "so the compiler can optimize away the runtime check", assuming that
> > > is what it means.
> >
> > It does.  I don't really see a meaningful difference between the comments?
> 
> Maybe not. I just had to pause for a moment to be sure that was what
> it actually meant when I first read it. I'm fine leaving it as is if
> you prefer.

To me my version makes a bit more sense, by explaining that we tell the
compiler information that it otherwise doesn't have, which results in the
optimization...


> > > > I have a separate question as well - do we need to call ResetExprContext() if
> > > > we neither qual, projection nor epq?  I see a small gain by avoiding that.
> > >
> > > You're referring to this block, I assume:
> > >
> > >     /*
> > >      * If we have neither a qual to check nor a projection to do, just skip
> > >      * all the overhead and return the raw scan tuple.
> > >      */
> > >     if (!qual && !projInfo)
> > >     {
> > >         ResetExprContext(econtext);
> > >         return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
> > >     }
> >
> > Yep.
> >
> >
> > > Yeah, I think it's fine to remove ResetExprContext() here. When I
> > > looked at it before, I left it in because I was unsure whether
> > > accessMtd() might leak memory into the per-tuple context.
> >
> > It's a good question.  I think I unfortunately found a problematic case,
> > ForeignNext().
> 
> Ah, so we do have a culprit in the tree.
> 
> > I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
> > call for the common case that no reset is needed. Right now we can't just
> > check ->isReset in an inline function, because we also delete children.  I
> > wonder if we could define isReset so that creating a child context unsets
> > isReset?
> 
> Were you thinking ResetExprContext() could become something like:
> 
> #define ResetExprContext(econtext) \
>     do { \
>         if (!((econtext)->ecxt_per_tuple_memory)->isReset) \
>             MemoryContextReset((econtext)->ecxt_per_tuple_memory); \
>     } while (0)
> 
> that is, once isReset also accounts for whether any child context exists?

Nearly - I was thinking we'd do that in MemoryContextReset(), rather than
ResetExprContext().

Greetings,

Andres Freund

Re: Some ExecSeqScan optimizations

От

Andres Freund

Дата:

11 июля 2025 г., 17:37:24

Hi,

On 2025-07-11 14:03:42 +0300, Nikita Malakhov wrote:
> It's a pity I missed this thread when you developed the patch.
> I've developed a feature recently and discovered that SeqScan
> does not make use of scan keys, and there is a Tom Lane's
> comment regarding this:
>   * Note that unlike IndexScan, SeqScan never use keys in heap_beginscan
>   * (and this is very bad) - so, here we do not check are keys ok or not.
> 
> Have you considered passing scan keys like it is done in IndexScan?

You can't easily do that without causing issues:

1) ScanKeys are evaluated while holding a buffer lock, we shouldn't do that
   with arbitrary functions (since they could recurse and acquire other locks
   in a non-correct order)

2) ScanKeys are rather restrictive in what they can express, but not
   restrictive enough to make 1) not a problem. That means that you can't just
   evaluate the whole predicate using ScanKeys.

3) ScanKey evaluation is actually sometimes *more* expensive than expression
   evaluation, because the columns are deformed one-by-one.

Greetings,

Andres Freund

Re: Some ExecSeqScan optimizations

От

Nikita Malakhov

Дата:

13 июля 2025 г., 15:45:41

Hi!

Andres, thank you for the explanation about the locks.

I've already tried to pass scan keys and saw that it is

quite expensive.

Regards,

Nikita Malakhov

Postgres Professional

The Russian Postgres Company

https://postgrespro.ru/

Re: Some ExecSeqScan optimizations

От

Amit Langote

Дата:

15 июля 2025 г., 06:06:29

Hi,

On Fri, Jul 11, 2025 at 11:34 PM Andres Freund <andres@anarazel.de> wrote:
> On 2025-07-11 11:22:36 +0900, Amit Langote wrote:
> > On Fri, Jul 11, 2025 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:
> > > On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > > > Thanks for the patch.
> > > >
> > > > +    /*
> > > > +     * Use pg_assume() for != NULL tests to make the compiler realize no
> > > > +     * runtime check for the field is needed in ExecScanExtended().
> > > > +     */
> > > >
> > > > I propose changing "to make the compiler realize no runtime check" to
> > > > "so the compiler can optimize away the runtime check", assuming that
> > > > is what it means.
> > >
> > > It does.  I don't really see a meaningful difference between the comments?
> >
> > Maybe not. I just had to pause for a moment to be sure that was what
> > it actually meant when I first read it. I'm fine leaving it as is if
> > you prefer.
>
> To me my version makes a bit more sense, by explaining that we tell the
> compiler information that it otherwise doesn't have, which results in the
> optimization...

Ok, that does make sense.

> > > > > I have a separate question as well - do we need to call ResetExprContext() if
> > > > > we neither qual, projection nor epq?  I see a small gain by avoiding that.
> > > >
> > > > You're referring to this block, I assume:
> > > >
> > > >     /*
> > > >      * If we have neither a qual to check nor a projection to do, just skip
> > > >      * all the overhead and return the raw scan tuple.
> > > >      */
> > > >     if (!qual && !projInfo)
> > > >     {
> > > >         ResetExprContext(econtext);
> > > >         return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
> > > >     }
> > >
> > > Yep.
> > >
> > >
> > > > Yeah, I think it's fine to remove ResetExprContext() here. When I
> > > > looked at it before, I left it in because I was unsure whether
> > > > accessMtd() might leak memory into the per-tuple context.
> > >
> > > I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
> > > call for the common case that no reset is needed. Right now we can't just
> > > check ->isReset in an inline function, because we also delete children.  I
> > > wonder if we could define isReset so that creating a child context unsets
> > > isReset?
> >
> > Were you thinking ResetExprContext() could become something like:
> >
> > #define ResetExprContext(econtext) \
> >     do { \
> >         if (!((econtext)->ecxt_per_tuple_memory)->isReset) \
> >             MemoryContextReset((econtext)->ecxt_per_tuple_memory); \
> >     } while (0)
> >
> > that is, once isReset also accounts for whether any child context exists?
>
> Nearly - I was thinking we'd do that in MemoryContextReset(), rather than
> ResetExprContext().

Ah, ok -- I was confused about which function you meant ("can't just
check ->isReset in an inline function" should have been a clue). I
thought you were referring to avoiding the call to
MemoryContextReset() itself from ExecScanExtended() by checking
isReset.

But it sounds like you meant optimizing within MemoryContextReset() --
specifically, skipping MemoryContextDeleteChildren() when isReset is
already true, so it becomes:

if (context->isReset)
    return;
MemoryContextDeleteChildren(context);
MemoryContextResetOnly(context);

Just out of curiosity, I tried making that change locally, and meson
test (check-world) passed. I assume that's just because nothing
notices leaked child contexts -- there's no mechanism asserting that
everything under a context gets reset if we skip
MemoryContextDeleteChildren().

That’s not to say we don't need MemoryContextCreate() to clear isReset
in the parent when adding a child. :-)


--
Thanks, Amit Langote

Re: Some ExecSeqScan optimizations

От

Andres Freund

Дата:

11 августа 2025 г., 23:19:45

Hi,

On 2025-07-11 11:22:36 +0900, Amit Langote wrote:
> On Fri, Jul 11, 2025 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > > On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres@anarazel.de> wrote:
> > > > The performance gain unsurprisingly isn't significant (but seems repeatably
> > > > measureable), but it does cut out a fair bit of unnecessary code.
> > > >
> > > > andres@awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
> > > >    text    data     bss     dec     hex filename
> > > >    3330       0       0    3330     d02 executor_nodeSeqscan.c.assume.o
> > > >    3834       0       0    3834     efa executor_nodeSeqscan.c.o
> > > >
> > > > A 13% reduction in actual code size isn't bad for such a small change, imo.
> > >
> > > Yeah, that seems worthwhile. I had been a bit concerned about code
> > > size growth from having four variant functions with at least some
> > > duplication, so this is a nice offset.
> >
> > I'm rather surprised by just how much the size reduces...
> >
> > I built nodeSeqscan.c with -ffunction-sections and looked at the size with
> > size --format=sysv:
> >
> > Before:
> > .text.SeqRecheck                               6      0
> > .rodata.str1.8                               135      0
> > .text.unlikely.SeqNext                        53      0
> > .text.SeqNext                                178      0
> > .text.ExecSeqScanEPQ                          20      0
> > .text.ExecSeqScanWithProject                 289      0
> > .text.unlikely.ExecSeqScanWithQual            53      0
> > .text.ExecSeqScanWithQual                    441      0
> > .text.unlikely.ExecSeqScanWithQualProject     53      0
> > .text.ExecSeqScanWithQualProject             811      0
> > .text.unlikely.ExecSeqScan                    53      0
> > .text.ExecSeqScan                            245      0
> > .text.ExecInitSeqScan                        287      0
> > .text.ExecEndSeqScan                          33      0
> > .text.ExecReScanSeqScan                       63      0
> > .text.ExecSeqScanEstimate                     88      0
> > .text.ExecSeqScanInitializeDSM               114      0
> > .text.ExecSeqScanReInitializeDSM              34      0
> > .text.ExecSeqScanInitializeWorker             64      0
> >
> > After:
> > .text.SeqRecheck                               6      0
> > .rodata.str1.8                               135      0
> > .text.unlikely.SeqNext                        53      0
> > .text.SeqNext                                178      0
> > .text.ExecSeqScanEPQ                          20      0
> > .text.ExecSeqScanWithProject                 209      0
> > .text.unlikely.ExecSeqScanWithQual            53      0
> > .text.ExecSeqScanWithQual                    373      0
> > .text.unlikely.ExecSeqScanWithQualProject     53      0
> > .text.ExecSeqScanWithQualProject             474      0
> > .text.unlikely.ExecSeqScan                    53      0
> > .text.ExecSeqScan                            245      0
> > .text.ExecInitSeqScan                        287      0
> > .text.ExecEndSeqScan                          33      0
> > .text.ExecReScanSeqScan                       63      0
> > .text.ExecSeqScanEstimate                     88      0
> > .text.ExecSeqScanInitializeDSM               114      0
> > .text.ExecSeqScanReInitializeDSM              34      0
> > .text.ExecSeqScanInitializeWorker             64      0
> >
> >
> > I'm rather baffled that the size of ExecSeqScanWithQualProject goes from 811
> > to 474, just due to those null checks being removed...  But I'll take it.
> 
> Wow, indeed.

Thanks for reviewing. Pushed!

Greetings,

Andres Freund

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Re: Some ExecSeqScan optimizations

Вложения