Обсуждение: pg_trgm comparison bug on cross-architecture replication due to different char implementation

Поиск
Список
Период
Сортировка

Hi all,

 

I would like to report an issue with the pg_trgm extension on cross-architecture replication scenarios. When an x86_64 standby server is replicating from an aarch64 primary server or vice versa, the gist_trgm_ops opclass returns different results on the primary and standby. Masahiko previously reported a similar issue affecting the pg_bigm extension [1].

 

To reproduce, execute the following on the x86_64 primary server:

 

CREATE EXTENSION pg_trgm;

CREATE TABLE tbl (c text);

CREATE INDEX ON tbl USING gist (c gist_trgm_ops);

INSERT INTO tbl VALUES ('Bóbr');

 

On the x86_64 primary server:

 

postgres=> select * from tbl where c like '%Bób%';

  c

------

Bóbr

(1 row)

 

On the aarch64 replica server:

 

postgres=> select * from tbl where c like '%Bób%';

c

---

(0 rows)

 

The root cause looks the same as the pg_bigm issue that Masahiko reported. To compare trigrams, pg_trgm uses a numerical comparison of chars [2]. On x86_64 a char is signed by default, whereas on aarch64 it is unsigned by default. gist_trgm_ops expects the trigram list to be sorted, but due to the different signedness of chars, the sort order is broken when replicating the values across architectures.

 

The different sort behaviour can be demonstrated using show_trgm.

 

On the x86_64 primary server:

 

postgres=> SELECT show_trgm('Bóbr');

                show_trgm                

------------------------------------------

{0x89194c,"  b","br ",0x707c72,0x7f7849}

(1 row)

 

On the aarch64 replica server:

 

postgres=> SELECT show_trgm('Bóbr');

                show_trgm                

------------------------------------------

{"  b","br ",0x707c72,0x7f7849,0x89194c}

(1 row)

 

One simple solution for this specific case is to declare the char signedness in the CMPPCHAR macro.

 

--- a/contrib/pg_trgm/trgm.h

+++ b/contrib/pg_trgm/trgm.h

@@ -42,7 +42,7 @@

typedef char trgm[3];

 #define CMPCHAR(a,b) ( ((a)==(b)) ? 0 : ( ((a)<(b)) ? -1 : 1 ) )

-#define CMPPCHAR(a,b,i)  CMPCHAR( *(((const char*)(a))+i), *(((const char*)(b))+i) )

+#define CMPPCHAR(a,b,i)  CMPCHAR( *(((unsigned char*)(a))+i), *(((unsigned char*)(b))+i) )

#define CMPTRGM(a,b) ( CMPPCHAR(a,b,0) ? CMPPCHAR(a,b,0) : ( CMPPCHAR(a,b,1) ? CMPPCHAR(a,b,1) : CMPPCHAR(a,b,2) ) )

 #define CPTRGM(a,b) do {                                                           \

 

Alternatively, Postgres can be compiled with -funsigned-char or -fsigned-char. I came across a code comment suggesting that this may not be a good idea in general [3].

 

Given that this has problem has come up before and seems likely to come up again, I'm curious what other broad solutions there might be to resolve it? Looking forward to any feedback, thanks!

 

Best,

 

Adam Guo

Amazon Web Services: https://aws.amazon.com

 

[1] https://osdn.net/projects/pgbigm/lists/archive/hackers/2024-February/000370.html

[2] https://github.com/postgres/postgres/blob/480bc6e3ed3a5719cdec076d4943b119890e8171/contrib/pg_trgm/trgm.h#L45

[3] https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/cash.c#L114-L123

"Guo, Adam" <adamguo@amazon.com> writes:
> I would like to report an issue with the pg_trgm extension on
> cross-architecture replication scenarios. When an x86_64 standby
> server is replicating from an aarch64 primary server or vice versa,
> the gist_trgm_ops opclass returns different results on the primary
> and standby.

I do not think that is a supported scenario.  Hash functions and
suchlike are not guaranteed to produce the same results on different
CPU architectures.  As a quick example, I get

regression=# select hashfloat8(34);
 hashfloat8 
------------
   21570837
(1 row)

on x86_64 but

postgres=# select hashfloat8(34);
 hashfloat8 
------------
 -602898821
(1 row)

on ppc32 thanks to the endianness difference.

> Given that this has problem has come up before and seems likely to
> come up again, I'm curious what other broad solutions there might be
> to resolve it?

Reject as not a bug.  Discourage people from thinking that physical
replication will work across architectures.

            regards, tom lane



On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Guo, Adam" <adamguo@amazon.com> writes:
> > I would like to report an issue with the pg_trgm extension on
> > cross-architecture replication scenarios. When an x86_64 standby
> > server is replicating from an aarch64 primary server or vice versa,
> > the gist_trgm_ops opclass returns different results on the primary
> > and standby.
>
> I do not think that is a supported scenario.  Hash functions and
> suchlike are not guaranteed to produce the same results on different
> CPU architectures.  As a quick example, I get
>
> regression=# select hashfloat8(34);
>  hashfloat8
> ------------
>    21570837
> (1 row)
>
> on x86_64 but
>
> postgres=# select hashfloat8(34);
>  hashfloat8
> ------------
>  -602898821
> (1 row)
>
> on ppc32 thanks to the endianness difference.
>
> > Given that this has problem has come up before and seems likely to
> > come up again, I'm curious what other broad solutions there might be
> > to resolve it?
>
> Reject as not a bug.  Discourage people from thinking that physical
> replication will work across architectures.

While cross-arch physical replication is not supported, I think having
architecture dependent differences is not good and It's legitimate to
fix it. FYI the 'char' data type comparison is done as though char is
unsigned. I've attached a small patch to fix it. What do you think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Вложения
Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Reject as not a bug.  Discourage people from thinking that physical
>> replication will work across architectures.

> While cross-arch physical replication is not supported, I think having
> architecture dependent differences is not good and It's legitimate to
> fix it. FYI the 'char' data type comparison is done as though char is
> unsigned. I've attached a small patch to fix it. What do you think?

I think this will break existing indexes that are working fine.
Yeah, it would have been better to avoid the difference, but
it's too late now.

            regards, tom lane



On Tue, Apr 30, 2024 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > On Tue, Apr 23, 2024 at 11:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Reject as not a bug.  Discourage people from thinking that physical
> >> replication will work across architectures.
>
> > While cross-arch physical replication is not supported, I think having
> > architecture dependent differences is not good and It's legitimate to
> > fix it. FYI the 'char' data type comparison is done as though char is
> > unsigned. I've attached a small patch to fix it. What do you think?
>
> I think this will break existing indexes that are working fine.
> Yeah, it would have been better to avoid the difference, but
> it's too late now.

True. So it will be a PG18 item.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Tue, Apr 30, 2024 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think this will break existing indexes that are working fine.
>> Yeah, it would have been better to avoid the difference, but
>> it's too late now.

> True. So it will be a PG18 item.

How will it be any better in v18?  It's still an on-disk
compatibility break for affected platforms.

Now, people could recover by reindexing affected indexes,
but I think we need to have a better justification than this
for making them do so.

            regards, tom lane



On Tue, Apr 23, 2024 at 5:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Guo, Adam" <adamguo@amazon.com> writes:
> > I would like to report an issue with the pg_trgm extension on
> > cross-architecture replication scenarios. When an x86_64 standby
> > server is replicating from an aarch64 primary server or vice versa,
> > the gist_trgm_ops opclass returns different results on the primary
> > and standby.
>
> I do not think that is a supported scenario.  Hash functions and
> suchlike are not guaranteed to produce the same results on different
> CPU architectures.  As a quick example, I get
>
> regression=# select hashfloat8(34);
>  hashfloat8
> ------------
>    21570837
> (1 row)
>
> on x86_64 but
>
> postgres=# select hashfloat8(34);
>  hashfloat8
> ------------
>  -602898821
> (1 row)
>
> on ppc32 thanks to the endianness difference.

Given this, should we try to do better with binary compatibility
checks using ControlFileData?  AFAICS they are supposed to check if
the database cluster is binary compatible with the running
architecture.  But it obviously allows incompatibilities.

------
Regards,
Alexander Korotkov



Alexander Korotkov <aekorotkov@gmail.com> writes:
> Given this, should we try to do better with binary compatibility
> checks using ControlFileData?  AFAICS they are supposed to check if
> the database cluster is binary compatible with the running
> architecture.  But it obviously allows incompatibilities.

Perhaps.  pg_control already covers endianness, which I think
is the root of the hashing differences I showed.  Adding a field
for char signedness feels a little weird, since it's not directly
a property of the bits-on-disk, but maybe we should.

            regards, tom lane



On Tue, Apr 30, 2024 at 7:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > Given this, should we try to do better with binary compatibility
> > checks using ControlFileData?  AFAICS they are supposed to check if
> > the database cluster is binary compatible with the running
> > architecture.  But it obviously allows incompatibilities.
>
> Perhaps.  pg_control already covers endianness, which I think
> is the root of the hashing differences I showed.  Adding a field
> for char signedness feels a little weird, since it's not directly
> a property of the bits-on-disk, but maybe we should.

I agree that storing char signedness might seem weird.  But it appears
that we already store indexes that depend on char signedness.  So,
it's effectively property of bits-on-disk even though it affects
indirectly.  Then I see two options to make the picture consistent.
1) Assume that char signedness is somehow a property of bits-on-disk
even though it's weird.  Then pg_trgm indexes are correct, but we need
to store char signedness in pg_control.
2) Assume that char signedness is not a property of bits-on-disk.
Then pg_trgm indexes are buggy and need to be fixed.
What do you think?

------
Regards,
Alexander Korotkov



Alexander Korotkov <aekorotkov@gmail.com> writes:
> I agree that storing char signedness might seem weird.  But it appears
> that we already store indexes that depend on char signedness.  So,
> it's effectively property of bits-on-disk even though it affects
> indirectly.  Then I see two options to make the picture consistent.
> 1) Assume that char signedness is somehow a property of bits-on-disk
> even though it's weird.  Then pg_trgm indexes are correct, but we need
> to store char signedness in pg_control.
> 2) Assume that char signedness is not a property of bits-on-disk.
> Then pg_trgm indexes are buggy and need to be fixed.
> What do you think?

The problem with option (2) is the assumption that pg_trgm's behavior
is the only bug of this kind, either now or in the future.  I think
that's just about an impossible standard to meet, because there's no
realistic way to test whether char signedness is affecting things.
(Sure, you can compare results across platforms, but maybe you
just didn't test the right case.)

Also, the bigger picture here is the seeming assumption that "if
we change pg_trgm then it will be safe to replicate from x86 to
arm".  I don't believe that that's a good idea and I'm unwilling
to promise that it will work, regardless of what we do about
char signedness.  That being the case, I don't want to invest a
lot of effort in the signedness issue.  Option (1) is clearly
a small change with little if any risk of future breakage.
Option (2) ... not so much.

            regards, tom lane



On Wed, May 1, 2024 at 2:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > I agree that storing char signedness might seem weird.  But it appears
> > that we already store indexes that depend on char signedness.  So,
> > it's effectively property of bits-on-disk even though it affects
> > indirectly.  Then I see two options to make the picture consistent.
> > 1) Assume that char signedness is somehow a property of bits-on-disk
> > even though it's weird.  Then pg_trgm indexes are correct, but we need
> > to store char signedness in pg_control.
> > 2) Assume that char signedness is not a property of bits-on-disk.
> > Then pg_trgm indexes are buggy and need to be fixed.
> > What do you think?
>
> The problem with option (2) is the assumption that pg_trgm's behavior
> is the only bug of this kind, either now or in the future.  I think
> that's just about an impossible standard to meet, because there's no
> realistic way to test whether char signedness is affecting things.
> (Sure, you can compare results across platforms, but maybe you
> just didn't test the right case.)
>
> Also, the bigger picture here is the seeming assumption that "if
> we change pg_trgm then it will be safe to replicate from x86 to
> arm".  I don't believe that that's a good idea and I'm unwilling
> to promise that it will work, regardless of what we do about
> char signedness.  That being the case, I don't want to invest a
> lot of effort in the signedness issue.

I think that the char signedness issue is an issue also for developers
(and extension authors) since it could lead to confusion and potential
bugs in the future due to that. x86 developers would think of char as
always being signed and write code that will misbehave on arm
machines. For example, since logical replication should behave
correctly even in cross-arch replication all developers need to be
aware of that. I thought of using the -funsigned-char (or
-fsigned-char) compiler flag to avoid that but it would have a broader
impact not only on indexes.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On 30.04.24 19:29, Tom Lane wrote:
>> 1) Assume that char signedness is somehow a property of bits-on-disk
>> even though it's weird.  Then pg_trgm indexes are correct, but we need
>> to store char signedness in pg_control.
>> 2) Assume that char signedness is not a property of bits-on-disk.
>> Then pg_trgm indexes are buggy and need to be fixed.
>> What do you think?
> Also, the bigger picture here is the seeming assumption that "if
> we change pg_trgm then it will be safe to replicate from x86 to
> arm".  I don't believe that that's a good idea and I'm unwilling
> to promise that it will work, regardless of what we do about
> char signedness.  That being the case, I don't want to invest a
> lot of effort in the signedness issue.  Option (1) is clearly
> a small change with little if any risk of future breakage.

But note that option 1 would prevent some replication that is currently 
working.




Peter Eisentraut <peter@eisentraut.org> writes:
> On 30.04.24 19:29, Tom Lane wrote:
>> Also, the bigger picture here is the seeming assumption that "if
>> we change pg_trgm then it will be safe to replicate from x86 to
>> arm".  I don't believe that that's a good idea and I'm unwilling
>> to promise that it will work, regardless of what we do about
>> char signedness.  That being the case, I don't want to invest a
>> lot of effort in the signedness issue.  Option (1) is clearly
>> a small change with little if any risk of future breakage.

> But note that option 1 would prevent some replication that is currently 
> working.

The point of this thread though is that it's working only for small
values of "work".  People are rightfully unhappy if it seems to work
and then later they get bitten by compatibility problems.

Treating char signedness as a machine property in pg_control would
signal that we don't intend to make it work, and would ensure that
even the most minimal testing would find out that it doesn't work.

If we do not do that, it seems to me we have to buy into making
it work.  That would mean dealing with the consequences of an
incompatible change in pg_trgm indexes, and then going through
the same dance again the next time(s) similar problems are found.

            regards, tom lane



On 03.05.24 16:13, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> On 30.04.24 19:29, Tom Lane wrote:
>>> Also, the bigger picture here is the seeming assumption that "if
>>> we change pg_trgm then it will be safe to replicate from x86 to
>>> arm".  I don't believe that that's a good idea and I'm unwilling
>>> to promise that it will work, regardless of what we do about
>>> char signedness.  That being the case, I don't want to invest a
>>> lot of effort in the signedness issue.  Option (1) is clearly
>>> a small change with little if any risk of future breakage.
> 
>> But note that option 1 would prevent some replication that is currently
>> working.
> 
> The point of this thread though is that it's working only for small
> values of "work".  People are rightfully unhappy if it seems to work
> and then later they get bitten by compatibility problems.
> 
> Treating char signedness as a machine property in pg_control would
> signal that we don't intend to make it work, and would ensure that
> even the most minimal testing would find out that it doesn't work.
> 
> If we do not do that, it seems to me we have to buy into making
> it work.  That would mean dealing with the consequences of an
> incompatible change in pg_trgm indexes, and then going through
> the same dance again the next time(s) similar problems are found.

Yes, that is understood.  But anecdotally, replicating between x86-64 
arm64 is occasionally used for upgrades or migrations.  In practice, 
this appears to have mostly worked.  If we now discover that it won't 
work with certain index extension modules, it's usable for most users. 
Even if we say, you have to reindex everything afterwards, it's probably 
still useful for these scenarios.

The way I understand the original report, the issue has to do 
specifically with how signed and unsigned chars compare differently.  I 
don't imagine this is used anywhere in the table/heap code.  So it's 
plausible that this issue is really contained to indexes.

On the other hand, if we put in a check against this, then at least we 
can answer any user questions about this with more certainty: No, won't 
work, here is why.




On 5/3/24 11:44, Peter Eisentraut wrote:
> On 03.05.24 16:13, Tom Lane wrote:
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>> On 30.04.24 19:29, Tom Lane wrote:
>>>> Also, the bigger picture here is the seeming assumption that "if
>>>> we change pg_trgm then it will be safe to replicate from x86 to
>>>> arm".  I don't believe that that's a good idea and I'm unwilling
>>>> to promise that it will work, regardless of what we do about
>>>> char signedness.  That being the case, I don't want to invest a
>>>> lot of effort in the signedness issue.  Option (1) is clearly
>>>> a small change with little if any risk of future breakage.
>>
>>> But note that option 1 would prevent some replication that is currently
>>> working.
>>
>> The point of this thread though is that it's working only for small
>> values of "work".  People are rightfully unhappy if it seems to work
>> and then later they get bitten by compatibility problems.
>>
>> Treating char signedness as a machine property in pg_control would
>> signal that we don't intend to make it work, and would ensure that
>> even the most minimal testing would find out that it doesn't work.
>>
>> If we do not do that, it seems to me we have to buy into making
>> it work.  That would mean dealing with the consequences of an
>> incompatible change in pg_trgm indexes, and then going through
>> the same dance again the next time(s) similar problems are found.
> 
> Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is 
> occasionally used for upgrades or migrations.  In practice, this appears to have 
> mostly worked.  If we now discover that it won't work with certain index 
> extension modules, it's usable for most users. Even if we say, you have to 
> reindex everything afterwards, it's probably still useful for these scenarios.

+1

I have heard similar anecdotes, and the reported experience goes even further -- 
many such upgrade/migration uses, with exceedingly rare reported failures.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




On Sat, May 4, 2024 at 7:36 AM Joe Conway <mail@joeconway.com> wrote:
>
> On 5/3/24 11:44, Peter Eisentraut wrote:
> > On 03.05.24 16:13, Tom Lane wrote:
> >> Peter Eisentraut <peter@eisentraut.org> writes:
> >>> On 30.04.24 19:29, Tom Lane wrote:
> >>>> Also, the bigger picture here is the seeming assumption that "if
> >>>> we change pg_trgm then it will be safe to replicate from x86 to
> >>>> arm".  I don't believe that that's a good idea and I'm unwilling
> >>>> to promise that it will work, regardless of what we do about
> >>>> char signedness.  That being the case, I don't want to invest a
> >>>> lot of effort in the signedness issue.  Option (1) is clearly
> >>>> a small change with little if any risk of future breakage.
> >>
> >>> But note that option 1 would prevent some replication that is currently
> >>> working.
> >>
> >> The point of this thread though is that it's working only for small
> >> values of "work".  People are rightfully unhappy if it seems to work
> >> and then later they get bitten by compatibility problems.
> >>
> >> Treating char signedness as a machine property in pg_control would
> >> signal that we don't intend to make it work, and would ensure that
> >> even the most minimal testing would find out that it doesn't work.
> >>
> >> If we do not do that, it seems to me we have to buy into making
> >> it work.  That would mean dealing with the consequences of an
> >> incompatible change in pg_trgm indexes, and then going through
> >> the same dance again the next time(s) similar problems are found.
> >
> > Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is
> > occasionally used for upgrades or migrations.  In practice, this appears to have
> > mostly worked.  If we now discover that it won't work with certain index
> > extension modules, it's usable for most users. Even if we say, you have to
> > reindex everything afterwards, it's probably still useful for these scenarios.
>
> +1

+1

How about extending amcheck to support GIN and GIst indexes so that it
can detect potential data incompatibility due to changing 'char' to
'unsigned char'? I think these new tests would be useful also for
users to check if they really need to reindex indexes due to such
changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18.
Users who upgraded to PG18 can run the new amcheck tests on the
primary as well as the standby.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, May 15, 2024 at 02:56:54PM +0900, Masahiko Sawada wrote:
> On Sat, May 4, 2024 at 7:36 AM Joe Conway <mail@joeconway.com> wrote:
> > On 5/3/24 11:44, Peter Eisentraut wrote:
> > > On 03.05.24 16:13, Tom Lane wrote:
> > >> Peter Eisentraut <peter@eisentraut.org> writes:
> > >>> On 30.04.24 19:29, Tom Lane wrote:
> > >>>> Also, the bigger picture here is the seeming assumption that "if
> > >>>> we change pg_trgm then it will be safe to replicate from x86 to
> > >>>> arm".  I don't believe that that's a good idea and I'm unwilling
> > >>>> to promise that it will work, regardless of what we do about
> > >>>> char signedness.  That being the case, I don't want to invest a
> > >>>> lot of effort in the signedness issue.  Option (1) is clearly
> > >>>> a small change with little if any risk of future breakage.
> > >>
> > >>> But note that option 1 would prevent some replication that is currently
> > >>> working.
> > >>
> > >> The point of this thread though is that it's working only for small
> > >> values of "work".  People are rightfully unhappy if it seems to work
> > >> and then later they get bitten by compatibility problems.
> > >>
> > >> Treating char signedness as a machine property in pg_control would
> > >> signal that we don't intend to make it work, and would ensure that
> > >> even the most minimal testing would find out that it doesn't work.
> > >>
> > >> If we do not do that, it seems to me we have to buy into making
> > >> it work.  That would mean dealing with the consequences of an
> > >> incompatible change in pg_trgm indexes, and then going through
> > >> the same dance again the next time(s) similar problems are found.
> > >
> > > Yes, that is understood.  But anecdotally, replicating between x86-64 arm64 is
> > > occasionally used for upgrades or migrations.  In practice, this appears to have
> > > mostly worked.  If we now discover that it won't work with certain index
> > > extension modules, it's usable for most users. Even if we say, you have to
> > > reindex everything afterwards, it's probably still useful for these scenarios.

Like you, I would not introduce a ControlFileData block for sign-of-char,
given the signs of breakage in extension indexing only.  That would lose much
useful migration capability.  I'm sympathetic to Tom's point, which I'll
attempt to summarize as: a ControlFileData block is a promise we know how to
keep, whereas we should expect further bug discoveries without it.  At the
same time, I put more weight on the apparently-wide span of functionality that
works fine.  (I wonder whether any static analyzer does or practically could
detect char sign dependence with a decent false positive rate.)

> +1
> 
> How about extending amcheck to support GIN and GIst indexes so that it
> can detect potential data incompatibility due to changing 'char' to
> 'unsigned char'? I think these new tests would be useful also for
> users to check if they really need to reindex indexes due to such
> changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18.
> Users who upgraded to PG18 can run the new amcheck tests on the
> primary as well as the standby.

If I were standardizing pg_trgm on one or the other notion of "char", I would
choose signed char, since I think it's still the majority.  More broadly, I
see these options to fix pg_trgm:

1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.
3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
   operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
   pg_upgrade on an unsigned-char system would automatically map v17
   gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
   architecture with upgrade-time scans.

Independently, having amcheck for GIN and/or GiST would be great.



On Sun, May 19, 2024 at 6:46 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Wed, May 15, 2024 at 02:56:54PM +0900, Masahiko Sawada wrote:
> >
> > How about extending amcheck to support GIN and GIst indexes so that it
> > can detect potential data incompatibility due to changing 'char' to
> > 'unsigned char'? I think these new tests would be useful also for
> > users to check if they really need to reindex indexes due to such
> > changes. Also we fix pg_trgm so that it uses 'unsigned char' in PG18.
> > Users who upgraded to PG18 can run the new amcheck tests on the
> > primary as well as the standby.
>
> If I were standardizing pg_trgm on one or the other notion of "char", I would
> choose signed char, since I think it's still the majority.  More broadly, I
> see these options to fix pg_trgm:
>
> 1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
> 2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.

Even though it's true that signed char systems are the majority, it
would not be acceptable to force the need to scan pg_trgm indexes on
unsigned char systems.

> 3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
>    operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
>    pg_upgrade on an unsigned-char system would automatically map v17
>    gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
>    architecture with upgrade-time scans.

Very interesting idea. How can new v18 users use the correct operator
class? I don't want to require users to specify the correct signed or
unsigned operator classes when creating a GIN index. Maybe we need to
dynamically use the correct compare function for the same operator
class depending on the char signedness. But is it possible to do it on
the extension (e.g. pg_trgm) side?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Thu, Aug 29, 2024 at 03:48:53PM -0500, Masahiko Sawada wrote:
> On Sun, May 19, 2024 at 6:46 AM Noah Misch <noah@leadboat.com> wrote:
> > If I were standardizing pg_trgm on one or the other notion of "char", I would
> > choose signed char, since I think it's still the majority.  More broadly, I
> > see these options to fix pg_trgm:
> >
> > 1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
> > 2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.
> 
> Even though it's true that signed char systems are the majority, it
> would not be acceptable to force the need to scan pg_trgm indexes on
> unsigned char systems.
> 
> > 3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
> >    operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
> >    pg_upgrade on an unsigned-char system would automatically map v17
> >    gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
> >    architecture with upgrade-time scans.
> 
> Very interesting idea. How can new v18 users use the correct operator
> class? I don't want to require users to specify the correct signed or
> unsigned operator classes when creating a GIN index. Maybe we need to

In brief, it wouldn't matter which operator class new v18 indexes use.  The
documentation would focus on gin_trgm_ops and also say something like:

  There's an additional operator class, gin_trgm_ops_unsigned.  It behaves
  exactly like gin_trgm_ops, but it uses a deprecated on-disk representation.
  Use gin_trgm_ops in new indexes, but there's no disadvantage from continuing
  to use gin_trgm_ops_unsigned.  Before PostgreSQL 18, gin_trgm_ops used a
  platform-dependent representation.  pg_upgrade automatically uses
  gin_trgm_ops_unsigned when upgrading from source data that used the
  deprecated representation.

What concerns might users have, then?  (Neither operator class would use plain
"char" in a context that affects on-disk state.  They'll use "signed char" and
"unsigned char".)

> dynamically use the correct compare function for the same operator
> class depending on the char signedness. But is it possible to do it on
> the extension (e.g. pg_trgm) side?

No, I don't think the extension can do that cleanly.  It would need to store
the signedness in the index somehow, and GIN doesn't call the opclass at a
time facilitating that.  That would need help from the core server.



On Fri, Aug 30, 2024 at 8:10 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Thu, Aug 29, 2024 at 03:48:53PM -0500, Masahiko Sawada wrote:
> > On Sun, May 19, 2024 at 6:46 AM Noah Misch <noah@leadboat.com> wrote:
> > > If I were standardizing pg_trgm on one or the other notion of "char", I would
> > > choose signed char, since I think it's still the majority.  More broadly, I
> > > see these options to fix pg_trgm:
> > >
> > > 1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
> > > 2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.
> >
> > Even though it's true that signed char systems are the majority, it
> > would not be acceptable to force the need to scan pg_trgm indexes on
> > unsigned char systems.
> >
> > > 3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
> > >    operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
> > >    pg_upgrade on an unsigned-char system would automatically map v17
> > >    gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
> > >    architecture with upgrade-time scans.
> >
> > Very interesting idea. How can new v18 users use the correct operator
> > class? I don't want to require users to specify the correct signed or
> > unsigned operator classes when creating a GIN index. Maybe we need to
>
> In brief, it wouldn't matter which operator class new v18 indexes use.  The
> documentation would focus on gin_trgm_ops and also say something like:
>
>   There's an additional operator class, gin_trgm_ops_unsigned.  It behaves
>   exactly like gin_trgm_ops, but it uses a deprecated on-disk representation.
>   Use gin_trgm_ops in new indexes, but there's no disadvantage from continuing
>   to use gin_trgm_ops_unsigned.  Before PostgreSQL 18, gin_trgm_ops used a
>   platform-dependent representation.  pg_upgrade automatically uses
>   gin_trgm_ops_unsigned when upgrading from source data that used the
>   deprecated representation.
>
> What concerns might users have, then?  (Neither operator class would use plain
> "char" in a context that affects on-disk state.  They'll use "signed char" and
> "unsigned char".)

I think I understand your idea now. Since gin_trgm_ops will use
"signed char", there is no impact for v18 users -- they can continue
using gin_trgm_ops.

But how does pg_upgrade use gin_trgm_ops_unsigned? This opclass will
be created by executing the pg_trgm script for v18, but it isn't
executed during pg_upgrade. Another way would be to do these opclass
replacement when executing the pg_trgm's update script (i.e., 1.6 to
1.7).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Fri, Sep 06, 2024 at 02:07:10PM -0700, Masahiko Sawada wrote:
> On Fri, Aug 30, 2024 at 8:10 PM Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Aug 29, 2024 at 03:48:53PM -0500, Masahiko Sawada wrote:
> > > On Sun, May 19, 2024 at 6:46 AM Noah Misch <noah@leadboat.com> wrote:
> > > > If I were standardizing pg_trgm on one or the other notion of "char", I would
> > > > choose signed char, since I think it's still the majority.  More broadly, I
> > > > see these options to fix pg_trgm:
> > > >
> > > > 1. Change to signed char.  Every arm64 system needs to scan pg_trgm indexes.
> > > > 2. Change to unsigned char.  Every x86 system needs to scan pg_trgm indexes.
> > >
> > > Even though it's true that signed char systems are the majority, it
> > > would not be acceptable to force the need to scan pg_trgm indexes on
> > > unsigned char systems.
> > >
> > > > 3. Offer both, as an upgrade path.  For example, pg_trgm could have separate
> > > >    operator classes gin_trgm_ops and gin_trgm_ops_unsigned.  Running
> > > >    pg_upgrade on an unsigned-char system would automatically map v17
> > > >    gin_trgm_ops to v18 gin_trgm_ops_unsigned.  This avoids penalizing any
> > > >    architecture with upgrade-time scans.
> > >
> > > Very interesting idea. How can new v18 users use the correct operator
> > > class? I don't want to require users to specify the correct signed or
> > > unsigned operator classes when creating a GIN index. Maybe we need to
> >
> > In brief, it wouldn't matter which operator class new v18 indexes use.  The
> > documentation would focus on gin_trgm_ops and also say something like:
> >
> >   There's an additional operator class, gin_trgm_ops_unsigned.  It behaves
> >   exactly like gin_trgm_ops, but it uses a deprecated on-disk representation.
> >   Use gin_trgm_ops in new indexes, but there's no disadvantage from continuing
> >   to use gin_trgm_ops_unsigned.  Before PostgreSQL 18, gin_trgm_ops used a
> >   platform-dependent representation.  pg_upgrade automatically uses
> >   gin_trgm_ops_unsigned when upgrading from source data that used the
> >   deprecated representation.
> >
> > What concerns might users have, then?  (Neither operator class would use plain
> > "char" in a context that affects on-disk state.  They'll use "signed char" and
> > "unsigned char".)
> 
> I think I understand your idea now. Since gin_trgm_ops will use
> "signed char", there is no impact for v18 users -- they can continue
> using gin_trgm_ops.

Right.

> But how does pg_upgrade use gin_trgm_ops_unsigned? This opclass will
> be created by executing the pg_trgm script for v18, but it isn't
> executed during pg_upgrade. Another way would be to do these opclass
> replacement when executing the pg_trgm's update script (i.e., 1.6 to
> 1.7).

Yes, that's one way to make it work.  If we do it that way, it would be
critical to make the ALTER EXTENSION UPDATE run before anything uses the
index.  Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
char" data file.  Running ALTER EXTENSION UPDATE early enough should be
feasible, so that's fine.  Some other options:

- If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
  then make it also emit the statements to create the opclass.

- Ship pg_trgm--1.6--1.7.sql in back branches.  If the upgrade wants to use
  gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
  (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
  an error if called.)

What other options exist?  I bet there are more.



Noah Misch <noah@leadboat.com> writes:
> Yes, that's one way to make it work.  If we do it that way, it would be
> critical to make the ALTER EXTENSION UPDATE run before anything uses the
> index.  Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> char" data file.  Running ALTER EXTENSION UPDATE early enough should be
> feasible, so that's fine.  Some other options:

> - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
>   then make it also emit the statements to create the opclass.

> - Ship pg_trgm--1.6--1.7.sql in back branches.  If the upgrade wants to use
>   gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
>   (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
>   an error if called.)

I feel like all of these are leaning heavily on users to get it right,
and therefore have a significant chance of breaking use-cases that
were perfectly fine before.

Remind me of why we can't do something like this:

* Add APIs that allow opclasses to read/write some space in the GIN
metapage.  (We could get ambitious and add such APIs for other AMs
too, but doing it just for GIN is probably a prudent start.)  Ensure
that this space is initially zeroed.

* In gin_trgm_ops, read a byte of this space and interpret it as
    0 = unset
    1 = signed chars
    2 = unsigned chars
If the value is zero, set the byte on the basis of the native
char-signedness of the current platform.  (Obviously, a
secondary server couldn't write the byte, and would have to
wait for the primary to update the value.  In the meantime,
it's no more broken than today.)

* Subsequently, use either signed or unsigned comparisons
based on that value.

This would automatically do the right thing in all cases that
work today, and it'd provide the ability for cross-platform
replication to work in future.  You can envision cases where
transferring a pre-existing index to a platform of the other
stripe would misbehave, but they're the same cases that fail
today, and the fix remains the same: reindex.

            regards, tom lane



On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Yes, that's one way to make it work.  If we do it that way, it would be
> > critical to make the ALTER EXTENSION UPDATE run before anything uses the
> > index.  Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> > char" data file.  Running ALTER EXTENSION UPDATE early enough should be
> > feasible, so that's fine.  Some other options:
> 
> > - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> >   then make it also emit the statements to create the opclass.
> 
> > - Ship pg_trgm--1.6--1.7.sql in back branches.  If the upgrade wants to use
> >   gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> >   (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> >   an error if called.)
> 
> I feel like all of these are leaning heavily on users to get it right,

What do you have in mind?  I see things for the pg_upgrade implementation to
get right, but I don't see things for pg_upgrade users to get right.

The last option is disruptive for users of unsigned char platforms, since it
requires a two-step upgrade if upgrading from v11 or earlier.  The other two
don't have that problem.

> and therefore have a significant chance of breaking use-cases that
> were perfectly fine before.
> 
> Remind me of why we can't do something like this:
> 
> * Add APIs that allow opclasses to read/write some space in the GIN
> metapage.  (We could get ambitious and add such APIs for other AMs
> too, but doing it just for GIN is probably a prudent start.)  Ensure
> that this space is initially zeroed.
> 
> * In gin_trgm_ops, read a byte of this space and interpret it as
>     0 = unset
>     1 = signed chars
>     2 = unsigned chars
> If the value is zero, set the byte on the basis of the native
> char-signedness of the current platform.  (Obviously, a
> secondary server couldn't write the byte, and would have to
> wait for the primary to update the value.  In the meantime,
> it's no more broken than today.)
> 
> * Subsequently, use either signed or unsigned comparisons
> based on that value.
> 
> This would automatically do the right thing in all cases that
> work today, and it'd provide the ability for cross-platform
> replication to work in future.  You can envision cases where
> transferring a pre-existing index to a platform of the other
> stripe would misbehave, but they're the same cases that fail
> today, and the fix remains the same: reindex.

No reason you couldn't do it that way.  Works for me.




Noah Misch <noah@leadboat.com> writes:
> On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote:
>> I feel like all of these are leaning heavily on users to get it right,

> What do you have in mind?  I see things for the pg_upgrade implementation to
> get right, but I don't see things for pg_upgrade users to get right.

Well, yeah, if you are willing to put pg_upgrade in charge of
executing ALTER EXTENSION UPDATE, then that would be a reasonably
omission-proof path.  But we have always intended the pg_upgrade
process to *not* auto-update extensions, so I'm concerned about
the potential side-effects of drilling a hole in that policy.
As an example: even if we ensure that pg_trgm 1.6 to 1.7 is totally
transparent except for this fix, what happens if the user's old
database is still on some pre-1.6 version?  Is it okay to force an
update that includes feature upgrades?

            regards, tom lane



On Fri, Sep 06, 2024 at 07:37:09PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote:
> >> I feel like all of these are leaning heavily on users to get it right,
> 
> > What do you have in mind?  I see things for the pg_upgrade implementation to
> > get right, but I don't see things for pg_upgrade users to get right.
> 
> Well, yeah, if you are willing to put pg_upgrade in charge of
> executing ALTER EXTENSION UPDATE, then that would be a reasonably
> omission-proof path.  But we have always intended the pg_upgrade
> process to *not* auto-update extensions, so I'm concerned about
> the potential side-effects of drilling a hole in that policy.
> As an example: even if we ensure that pg_trgm 1.6 to 1.7 is totally
> transparent except for this fix, what happens if the user's old
> database is still on some pre-1.6 version?  Is it okay to force an
> update that includes feature upgrades?

Those are disadvantages of some of the options.  I think it could be okay to
inject the mandatory upgrade here or just tell the user to update to 1.7
before attempting the upgrade (if not at 1.7, pg_upgrade would fail with an
error message to that effect).  Your counterproposal avoids the issue, and I'm
good with that design.



On Fri, Sep 6, 2024 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Noah Misch <noah@leadboat.com> writes:
> > Yes, that's one way to make it work.  If we do it that way, it would be
> > critical to make the ALTER EXTENSION UPDATE run before anything uses the
> > index.  Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> > char" data file.  Running ALTER EXTENSION UPDATE early enough should be
> > feasible, so that's fine.  Some other options:
>
> > - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> >   then make it also emit the statements to create the opclass.
>
> > - Ship pg_trgm--1.6--1.7.sql in back branches.  If the upgrade wants to use
> >   gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> >   (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> >   an error if called.)
>
> I feel like all of these are leaning heavily on users to get it right,
> and therefore have a significant chance of breaking use-cases that
> were perfectly fine before.
>
> Remind me of why we can't do something like this:
>
> * Add APIs that allow opclasses to read/write some space in the GIN
> metapage.  (We could get ambitious and add such APIs for other AMs
> too, but doing it just for GIN is probably a prudent start.)  Ensure
> that this space is initially zeroed.
>
> * In gin_trgm_ops, read a byte of this space and interpret it as
>         0 = unset
>         1 = signed chars
>         2 = unsigned chars
> If the value is zero, set the byte on the basis of the native
> char-signedness of the current platform.  (Obviously, a
> secondary server couldn't write the byte, and would have to
> wait for the primary to update the value.  In the meantime,
> it's no more broken than today.)

When do we set the byte on the primary server? If it's the first time
to use the GIN index, secondary servers would have to wait for the
primary to use the GIN index, which could be an unpredictable time or
it may never come depending on index usages. Probably we can make
pg_upgrade set the byte in the meta page of GIN indexes that use the
gin_trgm_ops.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> When do we set the byte on the primary server? If it's the first time
> to use the GIN index, secondary servers would have to wait for the
> primary to use the GIN index, which could be an unpredictable time or
> it may never come depending on index usages. Probably we can make
> pg_upgrade set the byte in the meta page of GIN indexes that use the
> gin_trgm_ops.

Hmm, perhaps.  That plus set-it-during-index-create would remove the
need for dynamic update like I suggested.  So very roughly the amount
of complexity would balance out.  Do you have an idea for how we'd get
this to happen during pg_upgrade, exactly?

            regards, tom lane



On Mon, Sep 9, 2024 at 4:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > When do we set the byte on the primary server? If it's the first time
> > to use the GIN index, secondary servers would have to wait for the
> > primary to use the GIN index, which could be an unpredictable time or
> > it may never come depending on index usages. Probably we can make
> > pg_upgrade set the byte in the meta page of GIN indexes that use the
> > gin_trgm_ops.
>
> Hmm, perhaps.  That plus set-it-during-index-create would remove the
> need for dynamic update like I suggested.  So very roughly the amount
> of complexity would balance out.

Yes, I think your set-it-during-index-create would be necessary.

>  Do you have an idea for how we'd get
> this to happen during pg_upgrade, exactly?

What I was thinking is that we have "pg_dump --binary-upgrade" emit a
function call, say "SELECT binary_upgrade_update_gin_meta_page()" for
each GIN index, and the meta pages are updated when restoring the
schemas.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Mon, Sep 9, 2024 at 4:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Do you have an idea for how we'd get
>> this to happen during pg_upgrade, exactly?

> What I was thinking is that we have "pg_dump --binary-upgrade" emit a
> function call, say "SELECT binary_upgrade_update_gin_meta_page()" for
> each GIN index, and the meta pages are updated when restoring the
> schemas.

Hmm, but ...

1. IIRC we don't move the relation files into the new cluster until
after we've run the schema dump/restore step.  I think this'd have to
be driven in some other way than from the pg_dump output.  I guess we
could have pg_upgrade start up the new postmaster and call a function
in each DB, which would have to scan for GIN indexes by itself.

2. How will this interact with --link mode?  I don't see how it
doesn't involve scribbling on files that are shared with the old
cluster, leading to possible problems if the pg_upgrade fails later
and the user tries to go back to using the old cluster.  It's not so
much the metapage update that is worrisome --- we're assuming that
that will modify storage that's unused in old versions.  But the
change would be unrecorded in the old cluster's WAL, which sounds
risky.

Maybe we could get away with forcing --copy mode for affected
indexes, but that feels a bit messy.  We'd not want to do it
for unaffected indexes, so the copying code would need to know
a great deal about this problem.

            regards, tom lane



On Mon, Sep 9, 2024 at 11:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > On Mon, Sep 9, 2024 at 4:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Do you have an idea for how we'd get
> >> this to happen during pg_upgrade, exactly?
>
> > What I was thinking is that we have "pg_dump --binary-upgrade" emit a
> > function call, say "SELECT binary_upgrade_update_gin_meta_page()" for
> > each GIN index, and the meta pages are updated when restoring the
> > schemas.
>
> Hmm, but ...
>
> 1. IIRC we don't move the relation files into the new cluster until
> after we've run the schema dump/restore step.  I think this'd have to
> be driven in some other way than from the pg_dump output.  I guess we
> could have pg_upgrade start up the new postmaster and call a function
> in each DB, which would have to scan for GIN indexes by itself.

You're right.

>
> 2. How will this interact with --link mode?  I don't see how it
> doesn't involve scribbling on files that are shared with the old
> cluster, leading to possible problems if the pg_upgrade fails later
> and the user tries to go back to using the old cluster.  It's not so
> much the metapage update that is worrisome --- we're assuming that
> that will modify storage that's unused in old versions.  But the
> change would be unrecorded in the old cluster's WAL, which sounds
> risky.
>
> Maybe we could get away with forcing --copy mode for affected
> indexes, but that feels a bit messy.  We'd not want to do it
> for unaffected indexes, so the copying code would need to know
> a great deal about this problem.

Good point. I agree that it would not be a very good idea to force --copy mode.

An alternative way would be that we store the char signedness in the
control file, and gin_trgm_ops opclass reads it if the bytes in the
meta page shows 'unset'. The char signedness in the control file
doesn't mean to be used for the compatibility check for physical
replication but used as a hint. But it also could be a bit messy,
though.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> An alternative way would be that we store the char signedness in the
> control file, and gin_trgm_ops opclass reads it if the bytes in the
> meta page shows 'unset'. The char signedness in the control file
> doesn't mean to be used for the compatibility check for physical
> replication but used as a hint. But it also could be a bit messy,
> though.

Yeah, that seems like it could work.  But are we sure that replicas
get a copy of the primary's control file rather than creating their
own?

            regards, tom lane



On Tue, Sep 10, 2024 at 11:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > An alternative way would be that we store the char signedness in the
> > control file, and gin_trgm_ops opclass reads it if the bytes in the
> > meta page shows 'unset'. The char signedness in the control file
> > doesn't mean to be used for the compatibility check for physical
> > replication but used as a hint. But it also could be a bit messy,
> > though.
>
> Yeah, that seems like it could work.  But are we sure that replicas
> get a copy of the primary's control file rather than creating their
> own?

Yes, I think so. Since at least the system identifiers of primary and
replicas must be identical for physical replication, if replicas use
their own control files then they cannot start the replication.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> On Tue, Sep 10, 2024 at 11:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, that seems like it could work.  But are we sure that replicas
>> get a copy of the primary's control file rather than creating their
>> own?

> Yes, I think so. Since at least the system identifiers of primary and
> replicas must be identical for physical replication, if replicas use
> their own control files then they cannot start the replication.

Got it.  So now I'm wondering if we need all the complexity of storing
stuff in the GIN metapages.  Could we simply read the (primary's)
signedness out of pg_control and use that?  We might need some caching
mechanism to make that cheap enough, but accessing the current index's
metapage is far from free either.

            regards, tom lane



On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
> > On Tue, Sep 10, 2024 at 11:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, that seems like it could work.  But are we sure that replicas
> >> get a copy of the primary's control file rather than creating their
> >> own?
> 
> > Yes, I think so. Since at least the system identifiers of primary and
> > replicas must be identical for physical replication, if replicas use
> > their own control files then they cannot start the replication.
> 
> Got it.  So now I'm wondering if we need all the complexity of storing
> stuff in the GIN metapages.  Could we simply read the (primary's)
> signedness out of pg_control and use that?

Yes.



On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > Got it.  So now I'm wondering if we need all the complexity of storing
> > > stuff in the GIN metapages.  Could we simply read the (primary's)
> > > signedness out of pg_control and use that?

> I've attached a PoC patch for this idea. We write  the default char
> signedness to the control file at initdb time. Then when comparing two
> trgms, pg_trgm opclasses use a comparison function based on the char
> signedness of the cluster. I've confirmed that the patch fixes the
> reported case at least.

I agree that proves the concept.



On Mon, Sep 16, 2024 at 9:24 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > Got it.  So now I'm wondering if we need all the complexity of storing
> > > > stuff in the GIN metapages.  Could we simply read the (primary's)
> > > > signedness out of pg_control and use that?
>
> > I've attached a PoC patch for this idea. We write  the default char
> > signedness to the control file at initdb time. Then when comparing two
> > trgms, pg_trgm opclasses use a comparison function based on the char
> > signedness of the cluster. I've confirmed that the patch fixes the
> > reported case at least.
>
> I agree that proves the concept.

Thanks. I like the simplicity of this approach. If we agree with this
approach, I'd like to proceed with it.

Regardless of what approach we take, I wanted to provide some
regression tests for these changes, but I could not come up with a
reasonable idea. It would be great if we could do something like
027_stream_regress.pl on cross-architecture replication. But just
executing 027_stream_regress.pl on cross-architecture replication
could not be sufficient since we would like to confirm query results
as well. If we could have a reasonable tool or way, it would also help
find other similar bugs related architectural differences.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Tue, Sep 17, 2024 at 09:43:41PM -0700, Masahiko Sawada wrote:
> On Mon, Sep 16, 2024 at 9:24 AM Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > > Got it.  So now I'm wondering if we need all the complexity of storing
> > > > > stuff in the GIN metapages.  Could we simply read the (primary's)
> > > > > signedness out of pg_control and use that?
> >
> > > I've attached a PoC patch for this idea. We write  the default char
> > > signedness to the control file at initdb time. Then when comparing two
> > > trgms, pg_trgm opclasses use a comparison function based on the char
> > > signedness of the cluster. I've confirmed that the patch fixes the
> > > reported case at least.
> >
> > I agree that proves the concept.
> 
> Thanks. I like the simplicity of this approach. If we agree with this
> approach, I'd like to proceed with it.

Works for me.

> Regardless of what approach we take, I wanted to provide some
> regression tests for these changes, but I could not come up with a
> reasonable idea. It would be great if we could do something like
> 027_stream_regress.pl on cross-architecture replication. But just
> executing 027_stream_regress.pl on cross-architecture replication
> could not be sufficient since we would like to confirm query results
> as well. If we could have a reasonable tool or way, it would also help
> find other similar bugs related architectural differences.

Perhaps add a regress.c function that changes the control file flag and
flushes the change to disk?



On Wed, Sep 25, 2024 at 03:46:27PM -0700, Masahiko Sawada wrote:
> On Wed, Sep 18, 2024 at 6:46 PM Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Sep 17, 2024 at 09:43:41PM -0700, Masahiko Sawada wrote:
> > > On Mon, Sep 16, 2024 at 9:24 AM Noah Misch <noah@leadboat.com> wrote:
> > > > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > > > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > > > > Got it.  So now I'm wondering if we need all the complexity of storing
> > > > > > > stuff in the GIN metapages.  Could we simply read the (primary's)
> > > > > > > signedness out of pg_control and use that?

> > > Thanks. I like the simplicity of this approach. If we agree with this
> > > approach, I'd like to proceed with it.
> >
> > Works for me.
> >
> > > Regardless of what approach we take, I wanted to provide some
> > > regression tests for these changes, but I could not come up with a
> > > reasonable idea. It would be great if we could do something like
> > > 027_stream_regress.pl on cross-architecture replication. But just
> > > executing 027_stream_regress.pl on cross-architecture replication
> > > could not be sufficient since we would like to confirm query results
> > > as well. If we could have a reasonable tool or way, it would also help
> > > find other similar bugs related architectural differences.
> >
> > Perhaps add a regress.c function that changes the control file flag and
> > flushes the change to disk?
> 
> I've tried this idea but found out that extensions can flush the
> controlfile on the disk after flipping the char signedness value, they
> cannot update the controlfile data on the shared memory. And, when the
> server shuts down, the on-disk controlfile is updated with the
> on-memory controlfile data, so the char signedness is reverted.
> 
> We can add a function to set the char signedness of on-memory
> controlfile data or add a new option to pg_resetwal to change the char
> signedness on-disk controlfile data but they seem to be overkill to me
> and I'm concerned about misusing these option and function.

Before the project is over, pg_upgrade will have some way to set the control
file signedness to the source cluster signedness.  I think pg_upgrade should
also take an option for overriding the source cluster signedness for source
clusters v17 and older.  That way, a user knowing they copied the v17 source
cluster from x86 to ARM can pg_upgrade properly without the prerequisite of
acquiring an x86 VM.  Once that all exists, perhaps it will be clearer how
best to change signedness for testing.

> While this change does not encourage the use of 'char' without
> explicitly specifying its signedness, it provides a hint to ensure
> consistent behavior for indexes (e.g., GIN and GiST) and tables that
> already store data sorted by the 'char' type on disk, especially in
> cross-platform replication scenarios.

Let's put that sort of information in the GetDefaultCharSignedness() header
comment.  New code should use explicit "signed char" or "unsigned char" when
it matters for data file compatibility.  GetDefaultCharSignedness() exists for
code that deals with pre-v18 data files, where we accidentally let C
implementation signedness affect persistent data.

> @@ -4256,6 +4257,18 @@ WriteControlFile(void)
>  
>      ControlFile->float8ByVal = FLOAT8PASSBYVAL;
>  
> +    /*
> +     * The signedness of the char type is implementation-defined. For example
> +     * on x86 architecture CPUs, the char data type is typically treated as
> +     * signed by default whereas on aarch architecture CPUs it is typically
> +     * treated as unsigned by default.
> +     */
> +#if CHAR_MIN != 0
> +    ControlFile->default_char_signedness = true;
> +#else
> +    ControlFile->default_char_signedness = false;
> +#endif

This has initdb set the field to the current C implementation's signedness.  I
wonder if, instead, initdb should set signedness=true unconditionally.  Then
the only source of signedness=false will be pg_upgrade.

Advantage: signedness=false will get rarer over time.  If we had known about
this problem during the last development cycle that forced initdb (v8.3), we
would have made all clusters signed or all clusters unsigned.  Making
pg_upgrade the only source of signedness=false will cause the population of
database clusters to converge toward that retrospective ideal.

Disadvantage: testing signedness=false will require more planning.

What other merits should we consider as part of deciding?



On Mon, Sep 30, 2024 at 11:49 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Wed, Sep 25, 2024 at 03:46:27PM -0700, Masahiko Sawada wrote:
> > On Wed, Sep 18, 2024 at 6:46 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Tue, Sep 17, 2024 at 09:43:41PM -0700, Masahiko Sawada wrote:
> > > > On Mon, Sep 16, 2024 at 9:24 AM Noah Misch <noah@leadboat.com> wrote:
> > > > > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > > > > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > > > > > Got it.  So now I'm wondering if we need all the complexity of storing
> > > > > > > > stuff in the GIN metapages.  Could we simply read the (primary's)
> > > > > > > > signedness out of pg_control and use that?
>
> > > > Thanks. I like the simplicity of this approach. If we agree with this
> > > > approach, I'd like to proceed with it.
> > >
> > > Works for me.
> > >
> > > > Regardless of what approach we take, I wanted to provide some
> > > > regression tests for these changes, but I could not come up with a
> > > > reasonable idea. It would be great if we could do something like
> > > > 027_stream_regress.pl on cross-architecture replication. But just
> > > > executing 027_stream_regress.pl on cross-architecture replication
> > > > could not be sufficient since we would like to confirm query results
> > > > as well. If we could have a reasonable tool or way, it would also help
> > > > find other similar bugs related architectural differences.
> > >
> > > Perhaps add a regress.c function that changes the control file flag and
> > > flushes the change to disk?
> >
> > I've tried this idea but found out that extensions can flush the
> > controlfile on the disk after flipping the char signedness value, they
> > cannot update the controlfile data on the shared memory. And, when the
> > server shuts down, the on-disk controlfile is updated with the
> > on-memory controlfile data, so the char signedness is reverted.
> >
> > We can add a function to set the char signedness of on-memory
> > controlfile data or add a new option to pg_resetwal to change the char
> > signedness on-disk controlfile data but they seem to be overkill to me
> > and I'm concerned about misusing these option and function.
>
> Before the project is over, pg_upgrade will have some way to set the control
> file signedness to the source cluster signedness.  I think pg_upgrade should
> also take an option for overriding the source cluster signedness for source
> clusters v17 and older.  That way, a user knowing they copied the v17 source
> cluster from x86 to ARM can pg_upgrade properly without the prerequisite of
> acquiring an x86 VM.

Good idea.

> Once that all exists, perhaps it will be clearer how
> best to change signedness for testing.

Agreed.

>
> > While this change does not encourage the use of 'char' without
> > explicitly specifying its signedness, it provides a hint to ensure
> > consistent behavior for indexes (e.g., GIN and GiST) and tables that
> > already store data sorted by the 'char' type on disk, especially in
> > cross-platform replication scenarios.
>
> Let's put that sort of information in the GetDefaultCharSignedness() header
> comment.  New code should use explicit "signed char" or "unsigned char" when
> it matters for data file compatibility.  GetDefaultCharSignedness() exists for
> code that deals with pre-v18 data files, where we accidentally let C
> implementation signedness affect persistent data.

Agreed, will fix.

>
> > @@ -4256,6 +4257,18 @@ WriteControlFile(void)
> >
> >       ControlFile->float8ByVal = FLOAT8PASSBYVAL;
> >
> > +     /*
> > +      * The signedness of the char type is implementation-defined. For example
> > +      * on x86 architecture CPUs, the char data type is typically treated as
> > +      * signed by default whereas on aarch architecture CPUs it is typically
> > +      * treated as unsigned by default.
> > +      */
> > +#if CHAR_MIN != 0
> > +     ControlFile->default_char_signedness = true;
> > +#else
> > +     ControlFile->default_char_signedness = false;
> > +#endif
>
> This has initdb set the field to the current C implementation's signedness.  I
> wonder if, instead, initdb should set signedness=true unconditionally.  Then
> the only source of signedness=false will be pg_upgrade.
>
> Advantage: signedness=false will get rarer over time.  If we had known about
> this problem during the last development cycle that forced initdb (v8.3), we
> would have made all clusters signed or all clusters unsigned.  Making
> pg_upgrade the only source of signedness=false will cause the population of
> database clusters to converge toward that retrospective ideal.

I think it's a good idea. Being able to treat one case as a rarer one
rather than treating both cases equally may have various advantages in
the future, for example when developing or investigating a problem.

> Disadvantage: testing signedness=false will require more planning.

If testing signedness=false always requires pg_upgrade, there might be
some cumbersomeness. Inventing a testing-purpose-only tool (e.g., a
CLI program) that changes the signedness might make tests easier.

> What other merits should we consider as part of deciding?

Considering that the population of database cluster signedness will
converge to signedness=true in the future, we can consider using
-fsigned-char to prevent similar problems for the future. We need to
think about possible side-effects as well, though.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Tue, Oct 01, 2024 at 11:55:48AM -0700, Masahiko Sawada wrote:
> On Mon, Sep 30, 2024 at 11:49 AM Noah Misch <noah@leadboat.com> wrote:
> > This has initdb set the field to the current C implementation's signedness.  I
> > wonder if, instead, initdb should set signedness=true unconditionally.  Then
> > the only source of signedness=false will be pg_upgrade.
> >
> > Advantage: signedness=false will get rarer over time.  If we had known about
> > this problem during the last development cycle that forced initdb (v8.3), we
> > would have made all clusters signed or all clusters unsigned.  Making
> > pg_upgrade the only source of signedness=false will cause the population of
> > database clusters to converge toward that retrospective ideal.
> 
> I think it's a good idea. Being able to treat one case as a rarer one
> rather than treating both cases equally may have various advantages in
> the future, for example when developing or investigating a problem.
> 
> > Disadvantage: testing signedness=false will require more planning.
> 
> If testing signedness=false always requires pg_upgrade, there might be
> some cumbersomeness. Inventing a testing-purpose-only tool (e.g., a
> CLI program) that changes the signedness might make tests easier.
> 
> > What other merits should we consider as part of deciding?
> 
> Considering that the population of database cluster signedness will
> converge to signedness=true in the future, we can consider using
> -fsigned-char to prevent similar problems for the future. We need to
> think about possible side-effects as well, though.

It's good to think about -fsigned-char.  While I find it tempting, several
things would need to hold for us to benefit from it:

- Every supported compiler has to offer it or an equivalent.
- The non-compiler parts of every supported C implementation need to
  cooperate.  For example, CHAR_MIN must change in response to the flag.  See
  the first comment in cash_in().
- Libraries we depend on can't do anything incompatible with it.

Given that, I would lean toward not using -fsigned-char.  It's unlikely all
three things will hold.  Even if they do, the benefit is not large.



Noah Misch <noah@leadboat.com> writes:
> On Tue, Oct 01, 2024 at 11:55:48AM -0700, Masahiko Sawada wrote:
>> Considering that the population of database cluster signedness will
>> converge to signedness=true in the future, we can consider using
>> -fsigned-char to prevent similar problems for the future. We need to
>> think about possible side-effects as well, though.

> It's good to think about -fsigned-char.  While I find it tempting, several
> things would need to hold for us to benefit from it:

> - Every supported compiler has to offer it or an equivalent.
> - The non-compiler parts of every supported C implementation need to
>   cooperate.  For example, CHAR_MIN must change in response to the flag.  See
>   the first comment in cash_in().
> - Libraries we depend on can't do anything incompatible with it.

> Given that, I would lean toward not using -fsigned-char.  It's unlikely all
> three things will hold.  Even if they do, the benefit is not large.

I am very, very strongly against deciding that Postgres will only
support one setting of char signedness.  It's a step on the way to
hardware monoculture, and we know where that eventually leads.
(In other words, I categorically reject Sawada-san's assertion
that signed chars will become universal.  I'd reject the opposite
assertion as well.)

            regards, tom lane



On Tue, Oct 1, 2024 at 8:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Noah Misch <noah@leadboat.com> writes:
> > On Tue, Oct 01, 2024 at 11:55:48AM -0700, Masahiko Sawada wrote:
> >> Considering that the population of database cluster signedness will
> >> converge to signedness=true in the future, we can consider using
> >> -fsigned-char to prevent similar problems for the future. We need to
> >> think about possible side-effects as well, though.
>
> > It's good to think about -fsigned-char.  While I find it tempting, several
> > things would need to hold for us to benefit from it:
>
> > - Every supported compiler has to offer it or an equivalent.
> > - The non-compiler parts of every supported C implementation need to
> >   cooperate.  For example, CHAR_MIN must change in response to the flag.  See
> >   the first comment in cash_in().
> > - Libraries we depend on can't do anything incompatible with it.
>
> > Given that, I would lean toward not using -fsigned-char.  It's unlikely all
> > three things will hold.  Even if they do, the benefit is not large.
>
> I am very, very strongly against deciding that Postgres will only
> support one setting of char signedness.  It's a step on the way to
> hardware monoculture, and we know where that eventually leads.
> (In other words, I categorically reject Sawada-san's assertion
> that signed chars will become universal.  I'd reject the opposite
> assertion as well.)

Thank you for pointing this out. I agree with both of you.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com