Обсуждение: encode/decode support for base64url

Поиск
Список
Период
Сортировка

encode/decode support for base64url

От
Przemysław Sztoch
Дата:

Hello,

Sometimes support for base64url from RFC 4648 would be useful.

Does anyone else need a patch like this?

--
Przemysław Sztoch | Mobile +48 509 99 00 66


Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 4 Mar 2025, at 09:54, Przemysław Sztoch <przemyslaw@sztoch.pl> wrote:

> Sometimes support for base64url from RFC 4648 would be useful.
> Does anyone else need a patch like this?

While not a frequent ask, it has been mentioned in the past.  I think it would
make sense to add so please do submit a patch for it for consideration.

--
Daniel Gustafsson




Re: encode/decode support for base64url

От
Aleksander Alekseev
Дата:
Hi,

> > Sometimes support for base64url from RFC 4648 would be useful.
> > Does anyone else need a patch like this?
>
> While not a frequent ask, it has been mentioned in the past.  I think it would
> make sense to add so please do submit a patch for it for consideration.

IMO it would be nice to have.

Would you like to submit such a patch or are you merely suggesting an
idea for others to implement?

-- 
Best regards,
Aleksander Alekseev



Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On 7 Mar 2025, at 4:40 PM, Aleksander Alekseev <aleksander@timescale.com> wrote:

Hi,

Sometimes support for base64url from RFC 4648 would be useful.
Does anyone else need a patch like this?

While not a frequent ask, it has been mentioned in the past.  I think it would
make sense to add so please do submit a patch for it for consideration.

IMO it would be nice to have.

Would you like to submit such a patch or are you merely suggesting an
idea for others to implement?

--
Best regards,
Aleksander Alekseev



Just to confirm: 

In a plan SQL flavor, we’re talking about something like this, correct?

CREATE FUNCTION base64url_encode(input bytea) RETURNS text AS $$
SELECT regexp_replace(
    replace(replace(encode(input, 'base64'), '+', '-'), '/', '_'),
    '=+$', '', 'g'
);
$$ LANGUAGE sql IMMUTABLE;

CREATE FUNCTION base64url_decode(input text) RETURNS bytea AS $$
SELECT decode(
    rpad(replace(replace(input, '-', '+'), '_', '/'), (length(input) + 3) & ~3, '='),
    'base64'
);
$$ LANGUAGE sql IMMUTABLE;

With minimal testing, this yields the same results with https://base64.guru/standards/base64url/encode

select base64url_encode('post+gres')
 base64url_encode 
------------------
 cG9zdCtncmVz
(1 row)

Re: encode/decode support for base64url

От
Florents Tselai
Дата:





On Sun, Mar 9, 2025 at 12:28 AM Florents Tselai <florents.tselai@gmail.com> wrote:


On 7 Mar 2025, at 4:40 PM, Aleksander Alekseev <aleksander@timescale.com> wrote:

Hi,

Sometimes support for base64url from RFC 4648 would be useful.
Does anyone else need a patch like this?

While not a frequent ask, it has been mentioned in the past.  I think it would
make sense to add so please do submit a patch for it for consideration.

IMO it would be nice to have.

Would you like to submit such a patch or are you merely suggesting an
idea for others to implement?

--
Best regards,
Aleksander Alekseev



Just to confirm: 

In a plan SQL flavor, we’re talking about something like this, correct?

CREATE FUNCTION base64url_encode(input bytea) RETURNS text AS $$
SELECT regexp_replace(
    replace(replace(encode(input, 'base64'), '+', '-'), '/', '_'),
    '=+$', '', 'g'
);
$$ LANGUAGE sql IMMUTABLE;

CREATE FUNCTION base64url_decode(input text) RETURNS bytea AS $$
SELECT decode(
    rpad(replace(replace(input, '-', '+'), '_', '/'), (length(input) + 3) & ~3, '='),
    'base64'
);
$$ LANGUAGE sql IMMUTABLE;

With minimal testing, this yields the same results with https://base64.guru/standards/base64url/encode

select base64url_encode('post+gres')
 base64url_encode 
------------------
 cG9zdCtncmVz
(1 row)


Here's a C implementation for this, along with some tests and documentation.
Tests are copied from cpython's implementation of urlsafe_b64encode and urlsafe_b64decode.

The signatures look like this:

SELECT base64url_encode('www.postgresql.org'::bytea) → d3d3LnBvc3RncmVzcWwub3Jn 
SELECT convert_from(base64url_decode('d3d3LnBvc3RncmVzcWwub3Jn'), 'UTF8') → http://www.postgresql.org

 
Вложения

Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 10 Mar 2025, at 12:28, Florents Tselai <florents.tselai@gmail.com> wrote:

> Here's a C implementation for this, along with some tests and documentation.
> Tests are copied from cpython's implementation of urlsafe_b64encode and urlsafe_b64decode.

+         <function>base64url_encode</function> ( <parameter>input</parameter> <type>bytea</type> )

Shouldn't this be modelled around how base64 works with the encode() and
decode() functions, ie encode('123\001', 'base64')?

    https://www.postgresql.org/docs/devel/functions-binarystring.html

--
Daniel Gustafsson




Re: encode/decode support for base64url

От
Florents Tselai
Дата:



On Mon, Mar 10, 2025, 14:32 Daniel Gustafsson <daniel@yesql.se> wrote:
> On 10 Mar 2025, at 12:28, Florents Tselai <florents.tselai@gmail.com> wrote:

> Here's a C implementation for this, along with some tests and documentation.
> Tests are copied from cpython's implementation of urlsafe_b64encode and urlsafe_b64decode.

+         <function>base64url_encode</function> ( <parameter>input</parameter> <type>bytea</type> )

Shouldn't this be modelled around how base64 works with the encode() and
decode() functions, ie encode('123\001', 'base64')?

        https://www.postgresql.org/docs/devel/functions-binarystring.html

--
Daniel Gustafsson

Oh well - you're probably right.
I guess I was blinded by my convenience.
Adding a 'base64url' option there is more appropriate.


Re: encode/decode support for base64url

От
Cary Huang
Дата:
> Oh well - you're probably right.
> I guess I was blinded by my convenience.
> Adding a 'base64url' option there is more appropriate.

I agree with it too. It is neater to add "base64url" as a new option for
encode() and decode() SQL functions in encode.c.

In addition, you may also want to add the C versions of base64rul encode
and decode functions to "src/common/base64.c" as new API calls so that
the frontend, backend applications and extensions can also have access
to these base64url conversions. 


Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca






Re: encode/decode support for base64url

От
Przemysław Sztoch
Дата:
On 07.03.2025 15:40, Aleksander Alekseev wrote:
Hi,

Sometimes support for base64url from RFC 4648 would be useful.
Does anyone else need a patch like this?
While not a frequent ask, it has been mentioned in the past.  I think it would
make sense to add so please do submit a patch for it for consideration.
IMO it would be nice to have.

Would you like to submit such a patch or are you merely suggesting an
idea for others to implement?

1. It is my current workaround:

SELECT convert_from(decode(rpad(translate(jwt_data, E'-_\n', '+/'), (ceil(length(translate(jwt_data, E'-_\n', '+/')) / 4::float) * 4)::integer, '='::text), 'base64'), 'UTF-8')::jsonb AS jwt_json

But it's not very elegant. I won't propose my own patch, but if someone does it, I'll be very grateful for it. :-)

2. My colleagues also have a proposal to add hex_space, dec and dec_space.

hex_space and dec_space for obvious readability in some conditions.

dec and dec_space are also sometimes much more convenient for debugging and interpreting binary data by humans. 3. In addition to base64, sometimes base32 would be useful (both from rfc4648), which doesn't have such problems:

The resulting character set is all one case, which can often be beneficial when using a case-insensitive filesystem, DNS names, spoken language, or human memory. The result can be used as a file name because it cannot possibly contain the '/' symbol, which is the Unix path separator.

--
Przemysław Sztoch | Mobile +48 509 99 00 66


Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On Tue, Mar 11, 2025 at 12:51 AM Cary Huang <cary.huang@highgo.ca> wrote:
> Oh well - you're probably right.
> I guess I was blinded by my convenience.
> Adding a 'base64url' option there is more appropriate.

I agree with it too. It is neater to add "base64url" as a new option for
encode() and decode() SQL functions in encode.c.

Attaching a v2 with that.  

In addition, you may also want to add the C versions of base64rul encode
and decode functions to "src/common/base64.c" as new API calls so that
the frontend, backend applications and extensions can also have access
to these base64url conversions.


We could expose this in base64.c - it'll need some more checking 
A few more test cases, especially around padding, are necessary.
I'll come back to this. 
Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On Tue, Mar 11, 2025 at 10:08 AM Florents Tselai <florents.tselai@gmail.com> wrote:


On Tue, Mar 11, 2025 at 12:51 AM Cary Huang <cary.huang@highgo.ca> wrote:
> Oh well - you're probably right.
> I guess I was blinded by my convenience.
> Adding a 'base64url' option there is more appropriate.

I agree with it too. It is neater to add "base64url" as a new option for
encode() and decode() SQL functions in encode.c.

Attaching a v2 with that.  

In addition, you may also want to add the C versions of base64rul encode
and decode functions to "src/common/base64.c" as new API calls so that
the frontend, backend applications and extensions can also have access
to these base64url conversions.


We could expose this in base64.c - it'll need some more checking 
A few more test cases, especially around padding, are necessary.
I'll come back to this. 

Here's a v3 with some (hopefully) better test cases. 
 
Вложения

Re: encode/decode support for base64url

От
Aleksander Alekseev
Дата:
Hi Florents,

> Here's a v3 with some (hopefully) better test cases.

Thanks for the new version of the patch.

```
+    encoded_len = pg_base64_encode(src, len, dst);
+
+    /* Convert Base64 to Base64URL */
+    for (uint64 i = 0; i < encoded_len; i++) {
+        if (dst[i] == '+')
+            dst[i] = '-';
+        else if (dst[i] == '/')
+            dst[i] = '_';
+    }
```

Although it is a possible implementation, wouldn't it be better to
parametrize pg_base64_encode instead of traversing the string twice?
Same for pg_base64_decode. You can refactor pg_base64_encode and make
it a wrapper for pg_base64_encode_impl if needed.

```
+-- Flaghsip Test case against base64.
+-- Notice the = padding removed at the end and special chars.
+SELECT encode('\x69b73eff', 'base64');  -- Expected: abc+/w==
+  encode
+----------
+ abc+/w==
+(1 row)
+
+SELECT encode('\x69b73eff', 'base64url');  -- Expected: abc-_w
+ encode
+--------
+ abc-_w
+(1 row)
```

I get the idea, but calling base64 is redundant IMO. It only takes
several CPU cycles during every test run without much value. I suggest
removing it and testing corner cases for base64url instead, which is
missing at the moment. Particularly there should be tests for
encoding/decoding strings of 0/1/2/3/4 characters and making sure that
decode(encode(x)) = x, always. On top of that you should cover with
tests the cases of invalid output for decode().

-- 
Best regards,
Aleksander Alekseev



Re: encode/decode support for base64url

От
Pavel Seleznev
Дата:
Hi,

In the strings.sql file there is such code
SELECT encode('\x69b73eff', 'base64'); -- Expected: abc+/w==

In the strings.out file
+SELECT encode('\x69b73eff', 'base64'); -- Expected: abc+/w==
+ encode
+----------
+ abc+/w==
+(1 row)
+

maybe you should remove the additional description of the expected value in this way?

strings.sql
SELECT encode('\x69b73eff', 'base64') = "abc+/w=="

strings.out
SELECT encode('\x69b73eff', 'base64') = "abc+/w=="
----------
t
(1 row)

Regards,
Pavel

Re: encode/decode support for base64url

От
Florents Tselai
Дата:
Thanks for the review Aleksander;

On Mon, Mar 31, 2025 at 5:37 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
Hi Florents,

> Here's a v3 with some (hopefully) better test cases.

Thanks for the new version of the patch.

```
+    encoded_len = pg_base64_encode(src, len, dst);
+
+    /* Convert Base64 to Base64URL */
+    for (uint64 i = 0; i < encoded_len; i++) {
+        if (dst[i] == '+')
+            dst[i] = '-';
+        else if (dst[i] == '/')
+            dst[i] = '_';
+    }
```

Although it is a possible implementation, wouldn't it be better to
parametrize pg_base64_encode instead of traversing the string twice?
Same for pg_base64_decode. You can refactor pg_base64_encode and make
it a wrapper for pg_base64_encode_impl if needed.

```
+-- Flaghsip Test case against base64.
+-- Notice the = padding removed at the end and special chars.
+SELECT encode('\x69b73eff', 'base64');  -- Expected: abc+/w==
+  encode
+----------
+ abc+/w==
+(1 row)
+
+SELECT encode('\x69b73eff', 'base64url');  -- Expected: abc-_w
+ encode
+--------
+ abc-_w
+(1 row)
```

I get the idea, but calling base64 is redundant IMO. It only takes
several CPU cycles during every test run without much value. I suggest
removing it and testing corner cases for base64url instead, which is
missing at the moment. Particularly there should be tests for
encoding/decoding strings of 0/1/2/3/4 characters and making sure that
decode(encode(x)) = x, always. On top of that you should cover with
tests the cases of invalid output for decode().

--
Best regards,
Aleksander Alekseev

here's a v4 patch set 

- Extracted pg_base64_{en,de}_internal with an  additional bool url param, to be used by other functions 
- Added a few more test cases

Cary mentioned above 

> In addition, you may also want to add the C versions of base64rul encode
and decode functions to "src/common/base64.c" as new API calls

Haven't done that, but I could;
Although I think it'd probably be best to do it in a separate patch.

Вложения

Re: encode/decode support for base64url

От
Aleksander Alekseev
Дата:
Hi Florents,

Thanks for the update!

> here's a v4 patch set
>
> - Extracted pg_base64_{en,de}_internal with an  additional bool url param, to be used by other functions
> - Added a few more test cases
>
> Cary mentioned above
>
> > In addition, you may also want to add the C versions of base64rul encode
> and decode functions to "src/common/base64.c" as new API calls
>
> Haven't done that, but I could;
> Although I think it'd probably be best to do it in a separate patch.

I reviewed and tested v4. To me it looks as good as it will get.
Personally I would change a few minor things here and there and
probably merge all three patches into a single commit. This however is
up to the committer to decide.

Changing the CF entry status to "RfC".



Re: encode/decode support for base64url

От
Florents Tselai
Дата:
Thanks for the review Aleksander,

> On 9 Jul 2025, at 10:45 PM, Aleksander Alekseev <aleksander@tigerdata.com> wrote:
>
> Hi Florents,
>
> Thanks for the update!
>
>> here's a v4 patch set
>>
>> - Extracted pg_base64_{en,de}_internal with an  additional bool url param, to be used by other functions
>> - Added a few more test cases
>>
>> Cary mentioned above
>>
>>> In addition, you may also want to add the C versions of base64rul encode
>> and decode functions to "src/common/base64.c" as new API calls
>>
>> Haven't done that, but I could;
>> Although I think it'd probably be best to do it in a separate patch.
>
> I reviewed and tested v4. To me it looks as good as it will get.
> Personally I would change a few minor things here and there and
> probably merge all three patches into a single commit. This however is
> up to the committer to decide.

Attaching a single-file patch


> 
> Changing the CF entry status to "RfC".


Вложения

Re: encode/decode support for base64url

От
"David E. Wheeler"
Дата:
Hi Florents,

On Jul 9, 2025, at 23:25, Florents Tselai <florents.tselai@gmail.com> wrote:

>> I reviewed and tested v4. To me it looks as good as it will get.
>> Personally I would change a few minor things here and there and
>> probably merge all three patches into a single commit. This however is
>> up to the committer to decide.
>
> Attaching a single-file patch

Somehow missed this thread previously. Had a quick look and had the same question Aleksander asked up-thread:

> Although it is a possible implementation, wouldn't it be better to
> parametrize pg_base64_encode instead of traversing the string twice?
> Same for pg_base64_decode. You can refactor pg_base64_encode and make
> it a wrapper for pg_base64_encode_impl if needed.

It looks as though there could be complements to _base64 and b64urllookup:

```patch
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
@@ -273,6 +273,9 @@ hex_dec_len(const char *src, size_t srclen)
 static const char _base64[] =
 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

+static const char _base64url[] =
+"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
+
 static const int8 b64lookup[128] = {
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
@@ -284,6 +287,18 @@ static const int8 b64lookup[128] = {
     41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
 };

+static const int8 b64urllookup[128] = {
+    -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+    -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+    -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1,
+    52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1, -1,
+    -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
+    15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, 62,
+    -1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
+    41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
+};
+
+
 static uint64
 pg_base64_encode(const char *src, size_t len, char *dst)
 {
```

And then add the implementation functions that take argument with the proper lookup tables.

Best,

David


Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On 10 Jul 2025, at 10:07 PM, David E. Wheeler <david@justatheory.com> wrote:

Hi Florents,

On Jul 9, 2025, at 23:25, Florents Tselai <florents.tselai@gmail.com> wrote:

I reviewed and tested v4. To me it looks as good as it will get.
Personally I would change a few minor things here and there and
probably merge all three patches into a single commit. This however is
up to the committer to decide.

Attaching a single-file patch

Somehow missed this thread previously. Had a quick look and had the same question Aleksander asked up-thread:

Although it is a possible implementation, wouldn't it be better to
parametrize pg_base64_encode instead of traversing the string twice?
Same for pg_base64_decode. You can refactor pg_base64_encode and make
it a wrapper for pg_base64_encode_impl if needed.

It looks as though there could be complements to _base64 and b64urllookup:

```patch
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
@@ -273,6 +273,9 @@ hex_dec_len(const char *src, size_t srclen)
static const char _base64[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

+static const char _base64url[] =
+"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
+
static const int8 b64lookup[128] = {
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
@@ -284,6 +287,18 @@ static const int8 b64lookup[128] = {
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
};

+static const int8 b64urllookup[128] = {
+ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1,
+ 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1, -1,
+ -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
+ 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, 62,
+ -1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
+ 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, -1, -1, -1, -1, -1,
+};
+
+
static uint64
pg_base64_encode(const char *src, size_t len, char *dst)
{
```

And then add the implementation functions that take argument with the proper lookup tables.

Best,

David


Why isn’t this sufficient? 

static uint64
pg_base64_encode_internal(const char *src, size_t len, char *dst, bool url)
{
const char *alphabet = url ? _base64url : _base64;
There’s already a a bool url param and the alphabet is toggled based on that 

Re: encode/decode support for base64url

От
"David E. Wheeler"
Дата:
On Jul 10, 2025, at 16:38, Florents Tselai <florents.tselai@gmail.com> wrote:

> Why isn’t this sufficient?
>
> static uint64
> pg_base64_encode_internal(const char *src, size_t len, char *dst, bool url)
> {
> const char *alphabet = url ? _base64url : _base64;

Ah, it is. I hadn’t got that far. I was tripped up to see this in your patch:

```patch
+static uint64
+pg_base64url_encode(const char *src, size_t len, char *dst)
+{
+    uint64 encoded_len;
+    if (len == 0)
+        return 0;
+
+    encoded_len = pg_base64_encode(src, len, dst);
+
+    /* Convert Base64 to Base64URL */
+    for (uint64 i = 0; i < encoded_len; i++) {
+        if (dst[i] == '+')
+            dst[i] = '-';
+        else if (dst[i] == '/')
+            dst[i] = '_';
+    }
+
+    /* Trim '=' padding */
+    while (encoded_len > 0 && dst[encoded_len - 1] == '=')
+        encoded_len--;
+
+    return encoded_len;
+}
```

I didn’t realize it was a set of patches for stuff you did and then later undid. Could you flatten the patch into just
what’schanged at the end? 

Best,

David



Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On Thu, Jul 10, 2025 at 11:55 PM David E. Wheeler <david@justatheory.com> wrote:
On Jul 10, 2025, at 16:38, Florents Tselai <florents.tselai@gmail.com> wrote:

> Why isn’t this sufficient?
>
> static uint64
> pg_base64_encode_internal(const char *src, size_t len, char *dst, bool url)
> {
> const char *alphabet = url ? _base64url : _base64;

Ah, it is. I hadn’t got that far. I was tripped up to see this in your patch:

```patch
+static uint64
+pg_base64url_encode(const char *src, size_t len, char *dst)
+{
+       uint64 encoded_len;
+       if (len == 0)
+               return 0;
+
+       encoded_len = pg_base64_encode(src, len, dst);
+
+       /* Convert Base64 to Base64URL */
+       for (uint64 i = 0; i < encoded_len; i++) {
+               if (dst[i] == '+')
+                       dst[i] = '-';
+               else if (dst[i] == '/')
+                       dst[i] = '_';
+       }
+
+       /* Trim '=' padding */
+       while (encoded_len > 0 && dst[encoded_len - 1] == '=')
+               encoded_len--;
+
+       return encoded_len;
+}
```

I didn’t realize it was a set of patches for stuff you did and then later undid. Could you flatten the patch into just what’s changed at the end?

 Attached 
Вложения

Re: encode/decode support for base64url

От
"David E. Wheeler"
Дата:
On Jul 11, 2025, at 04:26, Florents Tselai <florents.tselai@gmail.com> wrote:

>  Attached

Thank you! This looks great. The attached revision makes a a couple of minor changes:

* Change the line wrap of the docs to be more like the rest of func.sgml
* Remove an unnecessary nested if statement
* Removed `==` from one of the test comments
* Ran pgindent to create the attached patch

A few other brief comments, entirely stylistic:

* I kind of expected pg_base64url_encode to appear immediate after pg_base64_encode. In other words, to see the two
usesof pg_base64_encode_internal adjacent to each other. I think that’s more typical of the project standard. Same for
thefunctions that call pg_base64_decode_internal. 

* There are a few places where variable definition has been changed without changing the meaning, for example:

- const char *srcend = src + len,
- *s = src;
+ const char *srcend = src + len;
+ const char *s = src;

  Even if this is desirable, it might make sense to defer pure formatting changes to a separate patch.

* You define return variables in functions like pg_base64url_enc_len rather than just returning the outcome of an
expression.The latter is what I see in pg_base64_enc_len, so I think would be more consistent. Io other words: 

```patch
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -470,8 +470,6 @@ pg_base64_dec_len(const char *src, size_t srclen)
 static uint64
 pg_base64url_enc_len(const char *src, size_t srclen)
 {
-    uint64        result;
-
     /*
      * Base64 encoding converts 3 bytes into 4 characters Formula: ceil(srclen
      * / 3) * 4
@@ -479,10 +477,8 @@ pg_base64url_enc_len(const char *src, size_t srclen)
      * Unlike standard base64, base64url doesn't use padding characters when
      * the input length is not divisible by 3
      */
-    result = (srclen + 2) / 3 * 4;    /* ceiling division by 3, then multiply by
+    return (srclen + 2) / 3 * 4;    /* ceiling division by 3, then multiply by
                                      * 4 */
-
-    return result;
 }

 static uint64
```

I suspect these are the sorts of things a committer would tweak/adjust before committing, just thinking about getting
aheadof that. I think it’s ready. 

Best,

David


Вложения

Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:

> Thank you! This looks great. The attached revision makes a a couple of minor changes:

I also had a look at this today and agree that it looks pretty close to being
done, and a feature we IMHO would like to have.

> * I kind of expected pg_base64url_encode to appear immediate after pg_base64_encode. In other words, to see the two
usesof pg_base64_encode_internal adjacent to each other. I think that’s more typical of the project standard. Same for
thefunctions that call pg_base64_decode_internal. 

+1, done in the attached.

> * There are a few places where variable definition has been changed without changing the meaning, for example:
> ...
>  Even if this is desirable, it might make sense to defer pure formatting changes to a separate patch.

Agreed, the attached reverts stylistic changes.

> * You define return variables in functions like pg_base64url_enc_len rather than just returning the outcome of an
expression.The latter is what I see in pg_base64_enc_len, so I think would be more consistent. 

+1, done in the attached.

The attached version also adds a commit message, tweaks the documentation along
with a few small changes to error message handling etc.

The base64 code this extends is the RFC 2045 variant while base64url is based
on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
here but has anyone else verified this?

--
Daniel Gustafsson


Вложения

Re: encode/decode support for base64url

От
"David E. Wheeler"
Дата:
On Jul 29, 2025, at 08:25, Daniel Gustafsson <daniel@yesql.se> wrote:

> The attached version also adds a commit message, tweaks the documentation along
> with a few small changes to error message handling etc.

This looks great. One nit my editor noticed: This line:

+-- Round-trip test for all lengths from 0–4

Uses  U+2013 "–" but maybe we want ASCII character U+002d "-".

Best,

David



Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:

On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:

> Thank you! This looks great. The attached revision makes a a couple of minor changes:

I also had a look at this today and agree that it looks pretty close to being
done, and a feature we IMHO would like to have.

Thanks for having a look Daniel!
 
 
The attached version also adds a commit message, tweaks the documentation along
with a few small changes to error message handling etc.

In the doc snippet  

> The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.

Should be 

> The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.

I'd also add a comma before "and also" 
 
The base64 code this extends is the RFC 2045 variant while base64url is based
on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
here but has anyone else verified this?

I don't see how this can be a problem in practice.
The conversions are straightforward, 
and the codepath used with url=true is a new one and doesn't change past behavior.

Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:


On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:

> Thank you! This looks great. The attached revision makes a a couple of minor changes:

I also had a look at this today and agree that it looks pretty close to being
done, and a feature we IMHO would like to have.

Thanks for having a look Daniel!
 
 
The attached version also adds a commit message, tweaks the documentation along
with a few small changes to error message handling etc.

In the doc snippet  

> The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.

Should be 

> The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.

I'd also add a comma before "and also" 
 
The base64 code this extends is the RFC 2045 variant while base64url is based
on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
here but has anyone else verified this?

I don't see how this can be a problem in practice.
The conversions are straightforward, 
and the codepath used with url=true is a new one and doesn't change past behavior.

Here’s a v6; necessary because func.sgml was split .
No other changes compared to v5.

 
Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:
Attaching v6 again because it wasn't picked up the last time. 
Trying from Gmail's web page this time. 



On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <florents.tselai@gmail.com> wrote:


On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:


On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:

> Thank you! This looks great. The attached revision makes a a couple of minor changes:

I also had a look at this today and agree that it looks pretty close to being
done, and a feature we IMHO would like to have.

Thanks for having a look Daniel!
 
 
The attached version also adds a commit message, tweaks the documentation along
with a few small changes to error message handling etc.

In the doc snippet  

> The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.

Should be 

> The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.

I'd also add a comma before "and also" 
 
The base64 code this extends is the RFC 2045 variant while base64url is based
on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
here but has anyone else verified this?

I don't see how this can be a problem in practice.
The conversions are straightforward, 
and the codepath used with url=true is a new one and doesn't change past behavior.

Here’s a v6; necessary because func.sgml was split .
No other changes compared to v5.

 
Вложения

Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On Wed, Aug 6, 2025 at 4:34 PM Florents Tselai <florents.tselai@gmail.com> wrote:
Attaching v6 again because it wasn't picked up the last time. 
Trying from Gmail's web page this time. 



On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <florents.tselai@gmail.com> wrote:


On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:


On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:

> Thank you! This looks great. The attached revision makes a a couple of minor changes:

I also had a look at this today and agree that it looks pretty close to being
done, and a feature we IMHO would like to have.

Thanks for having a look Daniel!
 
 
The attached version also adds a commit message, tweaks the documentation along
with a few small changes to error message handling etc.

In the doc snippet  

> The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.

Should be 

> The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.

I'd also add a comma before "and also" 
 
The base64 code this extends is the RFC 2045 variant while base64url is based
on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
here but has anyone else verified this?

I don't see how this can be a problem in practice.
The conversions are straightforward, 
and the codepath used with url=true is a new one and doesn't change past behavior.

Here’s a v6; necessary because func.sgml was split .
No other changes compared to v5.

v6 introduced some whitespace errors in the regression files.

Here's a v7 that fixes that 



 
Вложения

Re: encode/decode support for base64url

От
Masahiko Sawada
Дата:
On Wed, Aug 6, 2025 at 12:43 PM Florents Tselai
<florents.tselai@gmail.com> wrote:
>
>
>
> On Wed, Aug 6, 2025 at 4:34 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>>
>> Attaching v6 again because it wasn't picked up the last time.
>> Trying from Gmail's web page this time.
>>
>>
>>
>> On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>>>
>>>
>>>
>>> On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:
>>>
>>>
>>> On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>>>
>>>> > On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:
>>>>
>>>> > Thank you! This looks great. The attached revision makes a a couple of minor changes:
>>>>
>>>> I also had a look at this today and agree that it looks pretty close to being
>>>> done, and a feature we IMHO would like to have.
>>>
>>>
>>> Thanks for having a look Daniel!
>>>
>>>>
>>>>
>>>>
>>>> The attached version also adds a commit message, tweaks the documentation along
>>>> with a few small changes to error message handling etc.
>>>
>>>
>>> In the doc snippet
>>>
>>> > The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.
>>>
>>> Should be
>>>
>>> > The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.
>>>
>>> I'd also add a comma before "and also"
>>>
>>>>
>>>> The base64 code this extends is the RFC 2045 variant while base64url is based
>>>> on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
>>>> here but has anyone else verified this?
>>>
>>>
>>> I don't see how this can be a problem in practice.
>>> The conversions are straightforward,
>>> and the codepath used with url=true is a new one and doesn't change past behavior.
>>>
>>>
>>> Here’s a v6; necessary because func.sgml was split .
>>> No other changes compared to v5.
>
>
> v6 introduced some whitespace errors in the regression files.
>
> Here's a v7 that fixes that

While the patch looks good to me I have one question:

-                        errmsg("invalid symbol \"%.*s\" found while
decoding base64 sequence",
-                               pg_mblen(s - 1), s - 1)));
+                        errmsg("invalid symbol \"%.*s\" found while
decoding %s sequence",
+                               pg_mblen(s - 1), s - 1,
+                               url ? "base64url" : "base64")));

The above change makes the error message mention the encoding name
properly. On the other hand, in pg_base64_decode_internal() there are
two places where we report invalid data and always mention 'based64'
in the error message:

ereport(ERROR,
        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
         errmsg("unexpected \"=\" while decoding base64 sequence")));

and

ereport(ERROR,
        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
         errmsg("invalid base64 end sequence"),
         errhint("Input data is missing padding, is truncated, or is
otherwise corrupted.")));

Do we need to have a similar change for these messages?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On Wed, Sep 17, 2025 at 12:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Aug 6, 2025 at 12:43 PM Florents Tselai
<florents.tselai@gmail.com> wrote:
>
>
>
> On Wed, Aug 6, 2025 at 4:34 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>>
>> Attaching v6 again because it wasn't picked up the last time.
>> Trying from Gmail's web page this time.
>>
>>
>>
>> On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>>>
>>>
>>>
>>> On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:
>>>
>>>
>>> On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>>>
>>>> > On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:
>>>>
>>>> > Thank you! This looks great. The attached revision makes a a couple of minor changes:
>>>>
>>>> I also had a look at this today and agree that it looks pretty close to being
>>>> done, and a feature we IMHO would like to have.
>>>
>>>
>>> Thanks for having a look Daniel!
>>>
>>>>
>>>>
>>>>
>>>> The attached version also adds a commit message, tweaks the documentation along
>>>> with a few small changes to error message handling etc.
>>>
>>>
>>> In the doc snippet
>>>
>>> > The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.
>>>
>>> Should be
>>>
>>> > The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding character.
>>>
>>> I'd also add a comma before "and also"
>>>
>>>>
>>>> The base64 code this extends is the RFC 2045 variant while base64url is based
>>>> on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
>>>> here but has anyone else verified this?
>>>
>>>
>>> I don't see how this can be a problem in practice.
>>> The conversions are straightforward,
>>> and the codepath used with url=true is a new one and doesn't change past behavior.
>>>
>>>
>>> Here’s a v6; necessary because func.sgml was split .
>>> No other changes compared to v5.
>
>
> v6 introduced some whitespace errors in the regression files.
>
> Here's a v7 that fixes that

While the patch looks good to me I have one question:

-                        errmsg("invalid symbol \"%.*s\" found while
decoding base64 sequence",
-                               pg_mblen(s - 1), s - 1)));
+                        errmsg("invalid symbol \"%.*s\" found while
decoding %s sequence",
+                               pg_mblen(s - 1), s - 1,
+                               url ? "base64url" : "base64")));

The above change makes the error message mention the encoding name
properly. On the other hand, in pg_base64_decode_internal() there are
two places where we report invalid data and always mention 'based64'
in the error message:

ereport(ERROR,
        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
         errmsg("unexpected \"=\" while decoding base64 sequence")));

and

ereport(ERROR,
        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
         errmsg("invalid base64 end sequence"),
         errhint("Input data is missing padding, is truncated, or is
otherwise corrupted.")));

Do we need to have a similar change for these messages?

Good catch, Masahiko-san. They shouldn't be hardcoded either.

I've updated that and also the wording in the regression tests, too. 
Вложения

Re: encode/decode support for base64url

От
Masahiko Sawada
Дата:
On Wed, Sep 17, 2025 at 5:57 AM Florents Tselai
<florents.tselai@gmail.com> wrote:
>
>
>
> On Wed, Sep 17, 2025 at 12:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Wed, Aug 6, 2025 at 12:43 PM Florents Tselai
>> <florents.tselai@gmail.com> wrote:
>> >
>> >
>> >
>> > On Wed, Aug 6, 2025 at 4:34 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>> >>
>> >> Attaching v6 again because it wasn't picked up the last time.
>> >> Trying from Gmail's web page this time.
>> >>
>> >>
>> >>
>> >> On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <florents.tselai@gmail.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com> wrote:
>> >>>
>> >>>
>> >>> On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>> >>>>
>> >>>> > On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com> wrote:
>> >>>>
>> >>>> > Thank you! This looks great. The attached revision makes a a couple of minor changes:
>> >>>>
>> >>>> I also had a look at this today and agree that it looks pretty close to being
>> >>>> done, and a feature we IMHO would like to have.
>> >>>
>> >>>
>> >>> Thanks for having a look Daniel!
>> >>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> The attached version also adds a commit message, tweaks the documentation along
>> >>>> with a few small changes to error message handling etc.
>> >>>
>> >>>
>> >>> In the doc snippet
>> >>>
>> >>> > The base64url alphabet use '-' instead of '+' and '_' instead of '/' and also omits the '=' padding character.
>> >>>
>> >>> Should be
>> >>>
>> >>> > The base64url alphabet uses '-' instead of '+' and '_' instead of '/', and also omits the '=' padding
character.
>> >>>
>> >>> I'd also add a comma before "and also"
>> >>>
>> >>>>
>> >>>> The base64 code this extends is the RFC 2045 variant while base64url is based
>> >>>> on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not a problem
>> >>>> here but has anyone else verified this?
>> >>>
>> >>>
>> >>> I don't see how this can be a problem in practice.
>> >>> The conversions are straightforward,
>> >>> and the codepath used with url=true is a new one and doesn't change past behavior.
>> >>>
>> >>>
>> >>> Here’s a v6; necessary because func.sgml was split .
>> >>> No other changes compared to v5.
>> >
>> >
>> > v6 introduced some whitespace errors in the regression files.
>> >
>> > Here's a v7 that fixes that
>>
>> While the patch looks good to me I have one question:
>>
>> -                        errmsg("invalid symbol \"%.*s\" found while
>> decoding base64 sequence",
>> -                               pg_mblen(s - 1), s - 1)));
>> +                        errmsg("invalid symbol \"%.*s\" found while
>> decoding %s sequence",
>> +                               pg_mblen(s - 1), s - 1,
>> +                               url ? "base64url" : "base64")));
>>
>> The above change makes the error message mention the encoding name
>> properly. On the other hand, in pg_base64_decode_internal() there are
>> two places where we report invalid data and always mention 'based64'
>> in the error message:
>>
>> ereport(ERROR,
>>         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>          errmsg("unexpected \"=\" while decoding base64 sequence")));
>>
>> and
>>
>> ereport(ERROR,
>>         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>          errmsg("invalid base64 end sequence"),
>>          errhint("Input data is missing padding, is truncated, or is
>> otherwise corrupted.")));
>>
>> Do we need to have a similar change for these messages?
>
>
> Good catch, Masahiko-san. They shouldn't be hardcoded either.
>
> I've updated that and also the wording in the regression tests, too.

Thank you for updating the patch! I've done additional tests in my
environment and all test cases passed. One very minor comment is that
we might want to add 'BASE64URL' to:

/*
 * BASE64
 */

Overall, the patch looks good to me. I'll wait for Daniel as he has
polished this patch and might have some comments or want to take it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 17 Sep 2025, at 19:51, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Thank you for updating the patch! I've done additional tests in my
> environment and all test cases passed. One very minor comment is that
> we might want to add 'BASE64URL' to:
>
> /*
> * BASE64
> */

I did that, and polished a few comments which had various version of case on
"base64url".  The RFC only mentions it in all lowercase so I went with that
apart from the in the comment mentioned above.

The attached v9 has this and

> Overall, the patch looks good to me. I'll wait for Daniel as he has
> polished this patch and might have some comments or want to take it.

Agreed, I think this is ready to go in.  If you want to push it then feel free,
else I'll take care of it tomorrow.

--
Daniel Gustafsson


Вложения

Re: encode/decode support for base64url

От
Chao Li
Дата:
I reviewed and tested this patch. Overall looks good to me. Actually, I think this patched fixed a bug of current implementation of base64 encoding by moving the logic of handling newline into “if (pos<0)”.

Just a few small comments:

On Sep 19, 2025, at 03:19, Daniel Gustafsson <daniel@yesql.se> wrote:

<v9-0001-Add-support-for-base64url-encoding-and-decoding.patch>

1.
```
+ * Helper for decoding base64 or base64url.  When url is passed as true the
+ * input will be encoded using base64url.  len bytes in src is encoded into
+ * dst.
+ */
```

It’s not common to use two white-spaces after “.”, usually we need only one.

2.
```
+ /* handle remainder */
  if (pos != 2)
```

The comment is understandable, but slightly vague: remainder of what?

Maybe rephrase to “handle remaining bytes in buf”.

3.
```
  ereport(ERROR,
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unexpected \"=\" while decoding base64 sequence")));
+ errmsg("unexpected \"=\" while decoding %s sequence", url ? "base64url" : "base64")));
```

This is a normal usage that injects sub-strings based on condition. However, PG doesn’t like that, see here: https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Re: encode/decode support for base64url

От
Florents Tselai
Дата:


On 19 Sep 2025, at 6:50 AM, Chao Li <li.evan.chao@gmail.com> wrote:

Great to see you around Evan!


I reviewed and tested this patch. Overall looks good to me. Actually, I think this patched fixed a bug of current implementation of base64 encoding by moving the logic of handling newline into “if (pos<0)”.

IIUC what you mean, I can’t confirm that.

Both existing and new implementation handle new lines the same 

SELECT decode(E'QUFB\nQUFB', 'base64url');
     decode     
----------------
 \x414141414141
(1 row)



Just a few small comments:

On Sep 19, 2025, at 03:19, Daniel Gustafsson <daniel@yesql.se> wrote:

<v9-0001-Add-support-for-base64url-encoding-and-decoding.patch>

1.
```
+ * Helper for decoding base64 or base64url.  When url is passed as true the
+ * input will be encoded using base64url.  len bytes in src is encoded into
+ * dst.
+ */
```

It’s not common to use two white-spaces after “.”, usually we need only one.

I agree with this


2.
```
+ /* handle remainder */
  if (pos != 2)
```

The comment is understandable, but slightly vague: remainder of what?

Maybe rephrase to “handle remaining bytes in buf”.

Agree too.


3.
```
  ereport(ERROR,
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unexpected \"=\" while decoding base64 sequence")));
+ errmsg("unexpected \"=\" while decoding %s sequence", url ? "base64url" : "base64")));
```

This is a normal usage that injects sub-strings based on condition. However, PG doesn’t like that, see here: https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES 

Well, that’s a very interesting catch. 
I’ll let a comitter confirm & advise. 



Re: encode/decode support for base64url

От
Chao Li
Дата:


On Sep 19, 2025, at 14:45, Florents Tselai <florents.tselai@gmail.com> wrote:

I reviewed and tested this patch. Overall looks good to me. Actually, I think this patched fixed a bug of current implementation of base64 encoding by moving the logic of handling newline into “if (pos<0)”.

IIUC what you mean, I can’t confirm that.

Both existing and new implementation handle new lines the same 

SELECT decode(E'QUFB\nQUFB', 'base64url');
     decode     
----------------
 \x414141414141
(1 row)


The current implementation isn’t actually wrong, but at least not optimized as your version. Because we don’t need to check “if (p >= lend)” after p is bumped, and only when “if (pos <0)”, p is bumped.



3.
```
  ereport(ERROR,
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("unexpected \"=\" while decoding base64 sequence")));
+  errmsg("unexpected \"=\" while decoding %s sequence", url ? "base64url" : "base64")));
```

This is a normal usage that injects sub-strings based on condition. However, PG doesn’t like that, see here: https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES 

Well, that’s a very interesting catch. 
I’ll let a comitter confirm & advise. 

I got to know this because once I reviewed a Tom Lane’s patch, it had the similarly situation, but Tom wrote code like:

```
If (something)
    Ereport(“function xxx”)
Else
    Ereport(“procedure xxx”)
```

I raised a comment to suggest avoid duplicate code in the way like your code do, and I got a response with “no” and the link.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 19 Sep 2025, at 08:45, Florents Tselai <florents.tselai@gmail.com> wrote:
>> On 19 Sep 2025, at 6:50 AM, Chao Li <li.evan.chao@gmail.com> wrote:

>> It’s not common to use two white-spaces after “.”, usually we need only one.
>
> I agree with this

This might date me (and others) but double-space after period was the norm for
monospaced typesetting back in the clackety-clack typewriter days, and that
carried over into monospace font text in computers.  The fmt program still use
double-space after period (which is what formatted my reply here, thus the use
of double-space in my emails).  While there is no hard rule in postgres
(AFAIK), a quick regex shows that it's 2.5x more common for sentences in
comments to have two space after punctuation.

>> The comment is understandable, but slightly vague: remainder of what?
>>
>> Maybe rephrase to “handle remaining bytes in buf”.
>
> Agree too.

I don't think the comment was all that vague in the context of reading the
code, but expanding won't hurt so done.

>>   ereport(ERROR,
>>   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> -  errmsg("unexpected \"=\" while decoding base64 sequence")));
>> +  errmsg("unexpected \"=\" while decoding %s sequence", url ? "base64url" : "base64")));
>> ```
>>
>> This is a normal usage that injects sub-strings based on condition. However, PG doesn’t like that, see here:
https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES 
>
> Well, that’s a very interesting catch.
> I’ll let a comitter confirm & advise.

Yes and no, the recommendation against constructing sentences at runtime is to
aid translators since the injected string isn't available to them.  In this
(and I hope all other) case the injected string should not be translated as it
is a name of an encoding scheme.  What we can do is to add a /* translator: ..
*/ comment which will end up in the translation file and give the translator
context on what %s will be replaced by.  Done in the attached.

--
Daniel Gustafsson


Вложения

Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 19 Sep 2025, at 08:56, Chao Li <li.evan.chao@gmail.com> wrote:
>> On Sep 19, 2025, at 14:45, Florents Tselai <florents.tselai@gmail.com> wrote:

>>> This is a normal usage that injects sub-strings based on condition. However, PG doesn’t like that, see here:
https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES 
>>
>> Well, that’s a very interesting catch.
>> I’ll let a comitter confirm & advise.
>
> I got to know this because once I reviewed a Tom Lane’s patch, it had the similarly situation, but Tom wrote code
like:
>
> ```
> If (something)
>     Ereport(“function xxx”)
> Else
>     Ereport(“procedure xxx”)
> ```
>
> I raised a comment to suggest avoid duplicate code in the way like your code do, and I got a response with “no” and
thelink. 

Tom is right (unsurprisingly) here, since "function" and "procedure" are terms
which are translated and depending on which is used it may change the sentence
structure in the target language.

In this case we inject a name which isn't to be translated, and that will
instead help the translator since they otherwise need to translate two strings
instead of just one (and they can move the %s to position the injected name
into the right place according to grammar rules).

--
Daniel Gustafsson




Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 18 Sep 2025, at 21:19, Daniel Gustafsson <daniel@yesql.se> wrote:

> .. else I'll take care of it tomorrow.

FWIW since there were new reviews and comments I wanted to allow some more time
for additional comments, so will do this over the weekend insteead.

--
Daniel Gustafsson




Re: encode/decode support for base64url

От
Daniel Gustafsson
Дата:
> On 19 Sep 2025, at 23:04, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 18 Sep 2025, at 21:19, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> .. else I'll take care of it tomorrow.
>
> FWIW since there were new reviews and comments I wanted to allow some more time
> for additional comments, so will do this over the weekend insteead.

And, done.

--
Daniel Gustafsson