Обсуждение: Efficient output for integer types

Поиск

Список

Период

Сортировка

Efficient output for integer types

От

David Fetter

Дата:

15 сентября 2019 г., 07:18:49

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Thanks to Andrew Gierth for lots of patient help.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

Re: Efficient output for integer types

От

Andrey Borodin

Дата:

15 сентября 2019 г., 09:06:29


> 15 сент. 2019 г., в 12:18, David Fetter <david@fetter.org> написал(а):
>
> Please find attached a couple of patches intended to $subject.
>
> This patch set cut the time to copy ten million rows of randomly sized
> int8s (10 of them) by about a third, so at least for that case, it's
> pretty decent.

Hi! Looks cool.

Just curious if for any fixed base and square here

+        while(uvalue >= base)
         {
+            const int i = (uvalue % square) * 2;
+            uvalue /= square;
+            vallen += 2;
+            memcpy(convert + sizeof(convert) - vallen, digits + i, 2);
+        }

compiler will have a chance to avoid idiv instruction?
Maybe few specialized functions could work better than generic algorithm?

Best regards, Andrey Borodin.

Re: Efficient output for integer types

От

David Fetter

Дата:

15 сентября 2019 г., 16:12:03

On Sun, Sep 15, 2019 at 02:06:29PM +0500, Andrey Borodin wrote:
> > 15 сент. 2019 г., в 12:18, David Fetter <david@fetter.org> написал(а):
> > 
> > Please find attached a couple of patches intended to $subject.
> > 
> > This patch set cut the time to copy ten million rows of randomly sized
> > int8s (10 of them) by about a third, so at least for that case, it's
> > pretty decent.
> 
> Hi! Looks cool.
> 
> Just curious if for any fixed base and square here
> 
> +        while(uvalue >= base)
>          {
> +            const int i = (uvalue % square) * 2;
> +            uvalue /= square;
> +            vallen += 2;
> +            memcpy(convert + sizeof(convert) - vallen, digits + i, 2);
> +        }
> 
> compiler will have a chance to avoid idiv instruction?

That could very well be.  I took the idea (and most of the code) from
the Ryū implementation Andrew Gierth committed for 12.

> Maybe few specialized functions could work better than generic
> algorithm?

Could be.  What do you have in mind?  I'm guessing that the ones for
decimals, that being both the most common case and the least obvious
as to how to optimize, would give the most benefit.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Efficient output for integer types

От

David Fetter

Дата:

17 сентября 2019 г., 06:55:05

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> Folks,
> 
> Please find attached a couple of patches intended to $subject.
> 
> This patch set cut the time to copy ten million rows of randomly sized
> int8s (10 of them) by about a third, so at least for that case, it's
> pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v2-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

17 сентября 2019 г., 07:01:57

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > Folks,
> > 
> > Please find attached a couple of patches intended to $subject.
> > 
> > This patch set cut the time to copy ten million rows of randomly sized
> > int8s (10 of them) by about a third, so at least for that case, it's
> > pretty decent.
> 
> Added int4 output, removed the sprintf stuff, as it didn't seem to
> help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v3-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

18 сентября 2019 г., 03:42:01

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > Folks,
> > > 
> > > Please find attached a couple of patches intended to $subject.
> > > 
> > > This patch set cut the time to copy ten million rows of randomly sized
> > > int8s (10 of them) by about a third, so at least for that case, it's
> > > pretty decent.
> > 
> > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > help in any cases I was testing.
> 
> Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v4-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

18 сентября 2019 г., 05:51:42

On Wed, Sep 18, 2019 at 05:42:01AM +0200, David Fetter wrote:
> On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > Folks,
> > > > 
> > > > Please find attached a couple of patches intended to $subject.
> > > > 
> > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > pretty decent.
> > > 
> > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > help in any cases I was testing.
> > 
> > Found a couple of "whiles" that should have been "ifs."
> 
> Factored out some inefficient functions and made the guts use the more
> efficient function.

Fix copy-paste-o that introduced some unneeded 64-bit math.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v5-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

18 сентября 2019 г., 06:26:35

On Wed, Sep 18, 2019 at 07:51:42AM +0200, David Fetter wrote:
> On Wed, Sep 18, 2019 at 05:42:01AM +0200, David Fetter wrote:
> > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > > Folks,
> > > > > 
> > > > > Please find attached a couple of patches intended to $subject.
> > > > > 
> > > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > > pretty decent.
> > > > 
> > > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > > help in any cases I was testing.
> > > 
> > > Found a couple of "whiles" that should have been "ifs."
> > 
> > Factored out some inefficient functions and made the guts use the more
> > efficient function.
> 
> Fix copy-paste-o that introduced some unneeded 64-bit math.

Removed static annotation that shouldn't have been present.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v6-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

Kyotaro Horiguchi

Дата:

18 сентября 2019 г., 07:27:46

Hello.

At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>
> On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > Folks,
> > > > 
> > > > Please find attached a couple of patches intended to $subject.
> > > > 
> > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > pretty decent.
> > > 
> > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > help in any cases I was testing.
> > 
> > Found a couple of "whiles" that should have been "ifs."
> 
> Factored out some inefficient functions and made the guts use the more
> efficient function.

I'm not sure this is on the KISS principle, but looked it and
have several random comments.

+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

I don't think that we are allowing that as project coding
policy. It seems to have been introduced only to accept external
code as-is.

-                    char        str[23];    /* sign, 21 digits and '\0' */
+                    char        str[MAXINT8LEN];

It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN
can be so. I think MAXINT8LEN should be 20 and the definition
should be str[MAXINT8LEN + 1].

+static const char DIGIT_TABLE[200] = {
+    '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',

Wouldn't it be simpler if it were defined as a constant string?

static const char DIGIT_TABLE[201] =
        "000102030405....19"
        "202122232425....39"
..

+pg_ltoa_n(int32 value, char *a)
...
+    /* Compute the result string. */
+    while (value >= 100000000)

We have only two degits above the value. Isn't the stuff inside
the while a waste of cycles?

+        /* Expensive 64-bit division. Optimize? */

I believe compiler treats such trivial optimizations. (concretely
converts into shifts and subtractons if faster.)

+        memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);

Maybe it'd be easy to read if 'a + olength - i' is a single variable.

+    i += adjust;
+    return i;

If 'a + olength - i' is replaced with a variable, the return
statement is replacable with "return olength + adjust;".

+    return t + (v >= PowersOfTen[t]);

I think it's better that if it were 't - (v < POT[t]) + 1; /*
log10(v) + 1 */'. At least we need an explanation of the
difference.  (I'didn't checked the algorithm is truely right,
though.)

> void
> pg_lltoa(int64 value, char *a)
> {
..
>         memcpy(a, "-9223372036854775808", 21);
..
>        memcpy(a, "0", 2);

The lines need a comment like "/* length contains trailing '\0'
*/"

+    if (value >= 0)
...
+    else
+ {
+        if (value == PG_INT32_MIN)
+            memcpy(str, "-2147483648", 11);
+            return str + 11;
>         }
+        *str++ = '-';
+        return pg_ltostr_zeropad(str, -value, minwidth - 1);

If then block of the if statement were (values < 0), we won't
need to reenter the functaion.

+        len = pg_ltoa_n(value, str);
+        if (minwidth <= len)
+            return str + len;
+
+        memmove(str + minwidth - len, str, len);

If the function had the parameters str with the room only for two
digits and a NUL, 2 as minwidth but 1000 as value, the function
would overrun the buffer. The original function just ignores
overflowing digits.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Efficient output for integer types

От

David Fetter

Дата:

20 сентября 2019 г., 19:14:51

On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:
> Hello.
> 
> At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>
> > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > > Folks,
> > > > > 
> > > > > Please find attached a couple of patches intended to $subject.
> > > > > 
> > > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > > pretty decent.
> > > > 
> > > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > > help in any cases I was testing.
> > > 
> > > Found a couple of "whiles" that should have been "ifs."
> > 
> > Factored out some inefficient functions and made the guts use the more
> > efficient function.
> 
> I'm not sure this is on the KISS principle, but looked it and
> have several random comments.
> 
> +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
> 
> I don't think that we are allowing that as project coding
> policy. It seems to have been introduced only to accept external
> code as-is.

Changed to fit current policy.

> -                    char        str[23];    /* sign, 21 digits and '\0' */
> +                    char        str[MAXINT8LEN];
> 
> It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN
> can be so. I think MAXINT8LEN should be 20 and the definition
> should be str[MAXINT8LEN + 1].

Done.

> +static const char DIGIT_TABLE[200] = {
> +    '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
> 
> Wouldn't it be simpler if it were defined as a constant string?
> 
> static const char DIGIT_TABLE[201] =
>         "000102030405....19"
>         "202122232425....39"
> ..

I thought this might be even clearer:

"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";

> +pg_ltoa_n(int32 value, char *a)
> ...
> +    /* Compute the result string. */
> +    while (value >= 100000000)
> 
> We have only two degits above the value. Isn't the stuff inside
> the while a waste of cycles?

Changed the while to an if.

> +        /* Expensive 64-bit division. Optimize? */
> 
> I believe compiler treats such trivial optimizations. (concretely
> converts into shifts and subtractons if faster.)

Comments removed.

> +        memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
> 
> Maybe it'd be easy to read if 'a + olength - i' is a single variable.

Done.

> +    i += adjust;
> +    return i;
> 
> If 'a + olength - i' is replaced with a variable, the return
> statement is replacable with "return olength + adjust;".

I'm not sure I understand this.

> +    return t + (v >= PowersOfTen[t]);
> 
> I think it's better that if it were 't - (v < POT[t]) + 1; /*
> log10(v) + 1 */'. At least we need an explanation of the
> difference.  (I'didn't checked the algorithm is truely right,
> though.)

Comments added.

> > void
> > pg_lltoa(int64 value, char *a)
> > {
> ..
> >         memcpy(a, "-9223372036854775808", 21);
> ..
> >        memcpy(a, "0", 2);
> 
> The lines need a comment like "/* length contains trailing '\0'
> */"

Comments added.

> +    if (value >= 0)
> ...
> +    else
> + {
> +        if (value == PG_INT32_MIN)
> +            memcpy(str, "-2147483648", 11);
> +            return str + 11;
> >         }
> +        *str++ = '-';
> +        return pg_ltostr_zeropad(str, -value, minwidth - 1);
> 
> If then block of the if statement were (values < 0), we won't
> need to reenter the functaion.

This is a tail-call recursion, so it's probably optimized already.

> +        len = pg_ltoa_n(value, str);
> +        if (minwidth <= len)
> +            return str + len;
> +
> +        memmove(str + minwidth - len, str, len);
> 
> If the function had the parameters str with the room only for two
> digits and a NUL, 2 as minwidth but 1000 as value, the function
> would overrun the buffer. The original function just ignores
> overflowing digits.

I believe the original was incorrect.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v7-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

20 сентября 2019 г., 21:09:16

On Fri, Sep 20, 2019 at 09:14:51PM +0200, David Fetter wrote:
> On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:
> > Hello.
> > 
> > At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>
> > > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > > > Folks,
> > > > > > 
> > > > > > Please find attached a couple of patches intended to $subject.
> > > > > > 
> > > > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > > > pretty decent.
> > > > > 
> > > > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > > > help in any cases I was testing.
> > > > 
> > > > Found a couple of "whiles" that should have been "ifs."
> > > 
> > > Factored out some inefficient functions and made the guts use the more
> > > efficient function.
> > 
> > I'm not sure this is on the KISS principle, but looked it and
> > have several random comments.
> > 
> > +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

Oops.  Missed a few.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v8-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

20 сентября 2019 г., 21:18:13

On Fri, Sep 20, 2019 at 11:09:16PM +0200, David Fetter wrote:
> On Fri, Sep 20, 2019 at 09:14:51PM +0200, David Fetter wrote:
> > On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:
> > > Hello.
> > > 
> > > At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>
> > > > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:
> > > > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:
> > > > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:
> > > > > > > Folks,
> > > > > > > 
> > > > > > > Please find attached a couple of patches intended to $subject.
> > > > > > > 
> > > > > > > This patch set cut the time to copy ten million rows of randomly sized
> > > > > > > int8s (10 of them) by about a third, so at least for that case, it's
> > > > > > > pretty decent.
> > > > > > 
> > > > > > Added int4 output, removed the sprintf stuff, as it didn't seem to
> > > > > > help in any cases I was testing.
> > > > > 
> > > > > Found a couple of "whiles" that should have been "ifs."
> > > > 
> > > > Factored out some inefficient functions and made the guts use the more
> > > > efficient function.
> > > 
> > > I'm not sure this is on the KISS principle, but looked it and
> > > have several random comments.
> > > 
> > > +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
> 
> Oops.  Missed a few.

D'oh!  Wrong patch.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v9-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

Andrew Gierth

Дата:

21 сентября 2019 г., 02:36:21

>>>>> "David" == David Fetter <david@fetter.org> writes:

 David> +    /* Compute the result string. */
 David> +    if (value >= 100000000)
 David> +    {
 David> +        const    uint32 value2 = value % 100000000;
 David> +
 David> +        const uint32 c = value2 % 10000;
 David> +        const uint32 d = value2 / 10000;
 David> +        const uint32 c0 = (c % 100) << 1;
 David> +        const uint32 c1 = (c / 100) << 1;
 David> +        const uint32 d0 = (d % 100) << 1;
 David> +        const uint32 d1 = (d / 100) << 1;
 David> +
 David> +        char *pos = a + olength - i;
 David> +
 David> +        value /= 100000000;
 David> +
 David> +        memcpy(pos - 2, DIGIT_TABLE + c0, 2);
 David> +        memcpy(pos - 4, DIGIT_TABLE + c1, 2);
 David> +        memcpy(pos - 6, DIGIT_TABLE + d0, 2);
 David> +        memcpy(pos - 8, DIGIT_TABLE + d1, 2);
 David> +        i += 8;
 David> +    }

For the 32-bit case, there's no point in doing an 8-digit divide
specially, it doesn't save any time. It's sufficient to just change

 David> +    if (value >= 10000)

to              while(value >= 10000)

in order to process 4 digits at a time.

 David> +        for(int i = 0; i < minwidth - len; i++)
 David> +        {
 David> +            memcpy(str + i, DIGIT_TABLE, 1);
 David> +        }

Should be:
                        memset(str, '0', minwidth-len);

-- 
Andrew (irc:RhodiumToad)

Re: Efficient output for integer types

От

David Fetter

Дата:

21 сентября 2019 г., 06:08:35

On Sat, Sep 21, 2019 at 03:36:21AM +0100, Andrew Gierth wrote:
> >>>>> "David" == David Fetter <david@fetter.org> writes:
> 
>  David> +    /* Compute the result string. */
>  David> +    if (value >= 100000000)
>  David> +    {
>  David> +        const    uint32 value2 = value % 100000000;
>  David> +
>  David> +        const uint32 c = value2 % 10000;
>  David> +        const uint32 d = value2 / 10000;
>  David> +        const uint32 c0 = (c % 100) << 1;
>  David> +        const uint32 c1 = (c / 100) << 1;
>  David> +        const uint32 d0 = (d % 100) << 1;
>  David> +        const uint32 d1 = (d / 100) << 1;
>  David> +
>  David> +        char *pos = a + olength - i;
>  David> +
>  David> +        value /= 100000000;
>  David> +
>  David> +        memcpy(pos - 2, DIGIT_TABLE + c0, 2);
>  David> +        memcpy(pos - 4, DIGIT_TABLE + c1, 2);
>  David> +        memcpy(pos - 6, DIGIT_TABLE + d0, 2);
>  David> +        memcpy(pos - 8, DIGIT_TABLE + d1, 2);
>  David> +        i += 8;
>  David> +    }
> 
> For the 32-bit case, there's no point in doing an 8-digit divide
> specially, it doesn't save any time. It's sufficient to just change
> 
>  David> +    if (value >= 10000)
> 
> to              while(value >= 10000)

Done.

> in order to process 4 digits at a time.
> 
>  David> +        for(int i = 0; i < minwidth - len; i++)
>  David> +        {
>  David> +            memcpy(str + i, DIGIT_TABLE, 1);
>  David> +        }
> 
> Should be:
>                         memset(str, '0', minwidth-len);

Done.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v10-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

Andrew Gierth

Дата:

21 сентября 2019 г., 06:29:25

>>>>> "David" == David Fetter <david@fetter.org> writes:

 David> +static inline uint32
 David> +decimalLength64(const uint64_t v)

Should be uint64, not uint64_t.

Also return an int, not a uint32.

For int vs. int32, my own inclination is to use "int" where the value is
just a (smallish) number, especially one that will be used as an index
or loop count, and use "int32" when it actually matters that it's 32
bits rather than some other size. Other opinions may differ.

 David> +{
 David> +    uint32            t;
 David> +    static uint64_t    PowersOfTen[] = {

uint64 not uint64_t here too.

 David> +int32
 David> +pg_ltoa_n(uint32 value, char *a)

If this is going to handle only unsigned values, it should probably be
named pg_ultoa_n.

 David> +    uint32    i = 0, adjust = 0;

"adjust" is not assigned anywhere else. Presumably that's from previous
handling of negative numbers?

 David> +        memcpy(a, "0", 1);

 *a = '0';  would suffice.

 David> +    i += adjust;

Superfluous?

 David> +    uint32_t    uvalue = (uint32_t)value;

uint32 not uint32_t.

 David> +    int32        len;

See above re. int vs. int32.

 David> +        uvalue = (uint32_t)0 - (uint32_t)value;

Should be uint32 not uint32_t again.

For anyone wondering, I suggested this to David in place of the ugly
special casing of INT32_MIN. This method avoids the UB of doing (-value)
where value==INT32_MIN, and is nevertheless required to produce the
correct result:

1. If value < 0, then ((uint32)value) is (value + UINT32_MAX + 1)
2. (uint32)0 - (uint32)value
      becomes  (UINT32_MAX+1)-(value+UINT32_MAX+1)
      which is (-value) as required

 David> +int32
 David> +pg_lltoa_n(uint64_t value, char *a)

Again, if this is doing unsigned, then it should be named pg_ulltoa_n

 David> +        if (value == PG_INT32_MIN)

This being inconsistent with the others is not nice.

-- 
Andrew (irc:RhodiumToad)

Re: Efficient output for integer types

От

David Fetter

Дата:

22 сентября 2019 г., 21:58:04

On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:
> >>>>> "David" == David Fetter <david@fetter.org> writes:
> 
>  David> +static inline uint32
>  David> +decimalLength64(const uint64_t v)
> 
> Should be uint64, not uint64_t.

Fixed.

> Also return an int, not a uint32.

Fixed.

> For int vs. int32, my own inclination is to use "int" where the value is
> just a (smallish) number, especially one that will be used as an index
> or loop count, and use "int32" when it actually matters that it's 32
> bits rather than some other size. Other opinions may differ.

Done with int.

>  David> +{
>  David> +    uint32            t;
>  David> +    static uint64_t    PowersOfTen[] = {
> 
> uint64 not uint64_t here too.

Fixed.

>  David> +int32
>  David> +pg_ltoa_n(uint32 value, char *a)
> 
> If this is going to handle only unsigned values, it should probably be
> named pg_ultoa_n.

It does signed values now.

>  David> +    uint32    i = 0, adjust = 0;
> 
> "adjust" is not assigned anywhere else. Presumably that's from previous
> handling of negative numbers?

It was, and now it's gone.

>  David> +        memcpy(a, "0", 1);
> 
>  *a = '0';  would suffice.

Fixed.

>  David> +    i += adjust;
> 
> Superfluous?

Yep. Gone.

>  David> +    uint32_t    uvalue = (uint32_t)value;
> 
> uint32 not uint32_t.

Fixed.

>  David> +    int32        len;
> 
> See above re. int vs. int32.

Done that way.

>  David> +        uvalue = (uint32_t)0 - (uint32_t)value;
> 
> Should be uint32 not uint32_t again.

Done.

> For anyone wondering, I suggested this to David in place of the ugly
> special casing of INT32_MIN. This method avoids the UB of doing (-value)
> where value==INT32_MIN, and is nevertheless required to produce the
> correct result:
> 
> 1. If value < 0, then ((uint32)value) is (value + UINT32_MAX + 1)
> 2. (uint32)0 - (uint32)value
>       becomes  (UINT32_MAX+1)-(value+UINT32_MAX+1)
>       which is (-value) as required
> 
>  David> +int32
>  David> +pg_lltoa_n(uint64_t value, char *a)
> 
> Again, if this is doing unsigned, then it should be named pg_ulltoa_n

Renamed to allow the uint64s that de-special-casing INT32_MIN/INT64_MIN requires.

>  David> +        if (value == PG_INT32_MIN)
> 
> This being inconsistent with the others is not nice.

Fixed.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v11-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

Tels

Дата:

23 сентября 2019 г., 08:28:09

Moin,

On 2019-09-22 23:58, David Fetter wrote:
> On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:
>> >>>>> "David" == David Fetter <david@fetter.org> writes:

> Fixed.

Good work, more performance is sure nice :)

Noticed one more thing in the patch:

> -        *start++ = *a;
> -        *a-- = swap;
> +        memcpy(pos - 2, DIGIT_TABLE + c, 2);
> +        i += 2;
>     }
> +    else
> +        *a = (char) ('0' + value2);
> +
> +    return olength;
> }

The line "i += 2;" modifies i, but i is never used again nor returned.

Best regards,

Tels

Re: Efficient output for integer types

От

Andrew Gierth

Дата:

23 сентября 2019 г., 12:16:36

>>>>> "David" == David Fetter <david@fetter.org> writes:

 David> + return pg_ltostr_zeropad(str, (uint32)0 - (uint32)value, minwidth - 1);

No, this is just reintroducing the undefined behavior again. Once the
value has been converted to unsigned you can't cast it back to signed or
pass it to a function expecting a signed value, since it will overflow
in the INT_MIN case. (and in this example would probably output '-'
signs until it ran off the end of memory).

Here's how I would do it:

char *
pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
{
    int32        len;
    uint32        uvalue = value;

    Assert(minwidth > 0);

    if (value >= 0)
    {
        if (value < 100 && minwidth == 2) /* Short cut for common case */
        {
            memcpy(str, DIGIT_TABLE + value*2, 2);
            return str + 2;
        }
    }
    else
    {
        *str++ = '-';
        minwidth -= 1;
        uvalue = (uint32)0 - uvalue;
    }
            
    len = pg_ultoa_n(uvalue, str);
    if (len >= minwidth)
        return str + len;

    memmove(str + minwidth - len, str, len);
    memset(str, '0', minwidth - len);
    return str + minwidth;
}

 David>  pg_ltostr(char *str, int32 value)
 David> +    int32    len = pg_ultoa_n(value, str);
 David> +    return str + len;

This seems to have lost its handling of negative numbers entirely (which
doesn't say much for the regression test coverage)

-- 
Andrew (irc:RhodiumToad)

Re: Efficient output for integer types

От

David Fetter

Дата:

23 сентября 2019 г., 20:25:54

On Mon, Sep 23, 2019 at 10:28:09AM +0200, Tels wrote:
> Moin,
> 
> On 2019-09-22 23:58, David Fetter wrote:
> > On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:
> > > >>>>> "David" == David Fetter <david@fetter.org> writes:
> 
> > Fixed.
> 
> Good work, more performance is sure nice :)
> 
> Noticed one more thing in the patch:
> 
> > -        *start++ = *a;
> > -        *a-- = swap;
> > +        memcpy(pos - 2, DIGIT_TABLE + c, 2);
> > +        i += 2;
> >     }
> > +    else
> > +        *a = (char) ('0' + value2);
> > +
> > +    return olength;
> > }
> 
> The line "i += 2;" modifies i, but i is never used again nor returned.

I found a similar one in a similar function, and removed it, too.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Efficient output for integer types

От

David Fetter

Дата:

23 сентября 2019 г., 21:35:07

On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:
> >>>>> "David" == David Fetter <david@fetter.org> writes:
> 
>  David> + return pg_ltostr_zeropad(str, (uint32)0 - (uint32)value, minwidth - 1);
> 
> No, this is just reintroducing the undefined behavior again. Once the
> value has been converted to unsigned you can't cast it back to signed or
> pass it to a function expecting a signed value, since it will overflow
> in the INT_MIN case. (and in this example would probably output '-'
> signs until it ran off the end of memory).
> 
> Here's how I would do it:
> 
> char *
> pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
> {
>     int32        len;
>     uint32        uvalue = value;
> 
>     Assert(minwidth > 0);
> 
>     if (value >= 0)
>     {
>         if (value < 100 && minwidth == 2) /* Short cut for common case */
>         {
>             memcpy(str, DIGIT_TABLE + value*2, 2);
>             return str + 2;
>         }
>     }
>     else
>     {
>         *str++ = '-';
>         minwidth -= 1;
>         uvalue = (uint32)0 - uvalue;
>     }
>             
>     len = pg_ultoa_n(uvalue, str);
>     if (len >= minwidth)
>         return str + len;
> 
>     memmove(str + minwidth - len, str, len);
>     memset(str, '0', minwidth - len);
>     return str + minwidth;
> }

Done pretty much that way.

>  David>  pg_ltostr(char *str, int32 value)
>  David> +    int32    len = pg_ultoa_n(value, str);
>  David> +    return str + len;
> 
> This seems to have lost its handling of negative numbers entirely

Given the comment that precedes it and all the use cases in the code,
I changed the signature to take an unsigned integer instead. It's
pretty clear that the intent was to add digits and only digits to the
passed-in string.

> (which doesn't say much for the regression test coverage)

I didn't see any obvious way to test functions not surfaced to SQL.
Should we have one?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v12-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

24 сентября 2019 г., 04:30:18

On Mon, Sep 23, 2019 at 11:35:07PM +0200, David Fetter wrote:
> On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:

Per discussion on IRC, change some functions to take only unsigned
integer types so as not to branch for the case of negative numbers
they're never actually called with.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v13-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

David Fetter

Дата:

24 сентября 2019 г., 05:26:21

On Tue, Sep 24, 2019 at 06:30:18AM +0200, David Fetter wrote:
> On Mon, Sep 23, 2019 at 11:35:07PM +0200, David Fetter wrote:
> > On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:
> 
> Per discussion on IRC, change some functions to take only unsigned
> integer types so as not to branch for the case of negative numbers
> they're never actually called with.
> 
> Best,
> David.

...and part of a pgindent run

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

v14-0001-Make-int4-and-int8-operations-more-efficent.patch

Re: Efficient output for integer types

От

Tomas Vondra

Дата:

11 января 2020 г., 13:31:59

Hi,

Any plans regarding committing this patch? I see the thread is silent
since September 24, when the last patch version was posted. The patch is
already marked as RFC since December, when David changed the status. I
don't have any opinion whether the patch is RFC or not (it might well
be), but IMHO it should have been mentioned in this thread.

I did a quick test to see how much more efficient this is, and for a
table with 10 bigint columns and 5M random rows the COPY to /dev/null
went from 3000 ms to ~2700 ms. That's not the 30% speedup mentioned by
David in the first message, but 10% is still pretty nice.

Of course, for real-world use cases the speedup will be lower because of
using other data types too, I/O etc. But it's still nice.

So, is anyone opposed to pushing this? If not, who'll to do that?  I see
Andrew Gierth was involved in the discussions on IRC and it's related to
the Ryu patch, so maybe he want's to take care of this. Andrew?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Efficient output for integer types

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения