Обсуждение: PgStat_HashKey padding issue when passed by reference
Hi, I was experimenting with adding a new OID field to PgStat_HashKey: ``` typedef struct PgStat_HashKey { PgStat_Kind kind; /* statistics entry kind */ Oid dboid; /* database ID. InvalidOid for shared objects. */ Oid newfield; uint64 objid; /* object ID (table, function, etc.), or * identifier. */ } PgStat_HashKey; ``` It's important to observe that hole padding is added with this new field: ``` (gdb) ptype /o PgStat_HashKey type = struct PgStat_HashKey { /* 0 | 4 */ uint32 kind; /* 4 | 4 */ Oid dboid; /* 8 | 4 */ Oid newfield; /* XXX 4-byte hole */ /* 16 | 8 */ uint64 objid; /* total size (bytes): 24 */ } ``` With "newfield" added, I started seeing the error "entry ref vanished before deletion" when creating the cluster (or selecting from pg_stat_database, etc.). This error occurs when the local reference to a pgstat entry is deleted and the entry can't be found in the simplehash pgStatEntryRefHash, based on the key. ``` if (!pgstat_entry_ref_hash_delete(pgStatEntryRefHash, key)) elog(ERROR, "entry ref vanished before deletion"); ``` This is surprising because this entry with the key is created earlier via pgstat_get_entry_ref_cached -> pgstat_entry_ref_hash_insert. When I brought this up to Bertrand (CC'd), he was not able to reproduce the ERROR by adding a new field, and we quickly realized that it is related to the compiler optimization (-O) flags we were using. He was building with -O0 (no optimization) and I was building with -O2. So something with optimization was causing the issue. Starting a problematic cluster with Valgrind: ``` valgrind --leak-check=full --track-origins=yes \ $PGHOME/bin/postgres -D $PGDATA --single -P ``` shows messages like "Use of uninitialised value of size 8": ``` ==26625== Use of uninitialised value of size 8 ==26625== at 0x61D04F: pgstat_get_entry_ref_cached (pgstat_shmem.c:405) ==26625== by 0x61D04F: pgstat_get_entry_ref (pgstat_shmem.c:488) ==26625== by 0x6177B9: pgstat_fetch_entry (pgstat.c:977) ==26625== by 0x5433B2: rebuild_database_list (autovacuum.c:1002) ``` Further debugging led us to the fact that the key is passed-by-value to ``` pgstat_get_entry_ref_cached(PgStat_HashKey key, PgStat_EntryRef **entry_ref_p) ``` and in that routine the entry is created: ``` cache_entry = pgstat_entry_ref_hash_insert(pgStatEntryRefHash, key, &found); `` The hash for the key is created with fasthash32(key, size, 0) and compared against another key using memcmp(a, b, sizeof(PgStat_HashKey)) in the "pgstat_hash_hash_key" and "pgstat_cmp_hash_key" routines, respectively. With -O2, we observed that (on our testing environment): pgstat_get_entry_ref_cached() is inlined the inlined code does not copy the hole padding content when the key is passed-by-value So that while the hole initially contains zeroes: ``` /* clear padding */ memset(&key, 0, sizeof(struct PgStat_HashKey)); ``` We observed that the zeroes in the hole are not copied to pgstat_get_entry_ref_cached() when passed-by-value with -O2. GDB when setting a breakpoint at pgstat_entry_ref_hash_insert (which is called inside pgstat_get_entry_ref_cached) shows that the padding has some unexpected values, see the 12-15th byte "0x00 0x23 0x21 0x21" which based on the memset done earlier should be 0. ``` (gdb) x/24xb d 0x7fff68cc16f0: 0x02 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x7fff68cc16f8: 0x00 0x00 0x00 0x00 0x00 0x23 0x21 0x21 0x7fff68cc1700: 0x67 0x0a 0x00 0x00 0x00 0x00 0x00 0x00 (gdb) ``` Now, if we pass the key by reference to pgstat_entry_ref_hash_insert as is patched, we can see the padding is in fact cleared. ``` (gdb) x/24xb d 0x7ffe1bf6c640: 0x01 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x7ffe1bf6c648: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x7ffe1bf6c650: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ``` This looks like a compiler bug, we tested multiple ways to workaround: 1/ pg_noinline pgstat_get_entry_ref_cached 2/ making volatile the PgStat_HashKey key in the pgstat_get_entry_ref_cached parameter 3/ Passing the key by reference instead of by value 4/ set SH_SCOPE to "external" rather than "static inline" in pgstat_shmem.c it seems, at least for this case, the best option is to pass the key by reference, like in the attached patch. FWIW, quickly looking at other areas that pass keys around, we can also otherhash keys being passed by value here: ``` spcachekey_hash(); spcachekey_equal(): ``` but does not look problematic, as only the string field of the struct is hashed. -- Sami Imseih Amazon Web Services (AWS)
Вложения
I forgot to CC Bertrand as mentioned above. -- Sami Imseih Amazon Web Services (AWS)
Hi, On 2025-09-04 11:14:53 -0500, Sami Imseih wrote: > GDB when setting a breakpoint at pgstat_entry_ref_hash_insert (which > is called inside > pgstat_get_entry_ref_cached) shows that the padding has some unexpected > values, see the 12-15th byte "0x00 0x23 0x21 0x21" which based on the > memset done earlier should be 0. > > ``` > (gdb) x/24xb d > 0x7fff68cc16f0: 0x02 0x00 0x00 0x00 0x05 0x00 0x00 0x00 > 0x7fff68cc16f8: 0x00 0x00 0x00 0x00 0x00 0x23 0x21 0x21 > 0x7fff68cc1700: 0x67 0x0a 0x00 0x00 0x00 0x00 0x00 0x00 > (gdb) > ``` > > Now, if we pass the key by reference to pgstat_entry_ref_hash_insert as is > patched, we can see the padding is in fact cleared. > > ``` > (gdb) x/24xb d > 0x7ffe1bf6c640: 0x01 0x00 0x00 0x00 0x05 0x00 0x00 0x00 > 0x7ffe1bf6c648: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > 0x7ffe1bf6c650: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > ``` > > This looks like a compiler bug, we tested multiple ways to workaround: Padding bytes aren't guaranteed to be zeroed, unless you take care to zero them out with memset or such. That's not a compiler bug. Greetings, Andres Freund
> > This looks like a compiler bug, we tested multiple ways to workaround: > > Padding bytes aren't guaranteed to be zeroed, unless you take care to zero > them out with memset or such. That's not a compiler bug. Perhaps calling this a compiler bug is not appropriate. However, memset is in fact called when the key is created ``` /* clear padding */ memset(&key, 0, sizeof(struct PgStat_HashKey)); ``` but the zeroed out padding bytes are not retained when the key is passed by value. This is why the proposal is to pass the key by reference. -- Sami Imseih Amazon Web Services (AWS)
Em qui., 4 de set. de 2025 às 13:15, Sami Imseih <samimseih@gmail.com> escreveu:
Hi,
I was experimenting with adding a new OID field to PgStat_HashKey:
```
typedef struct PgStat_HashKey
{
PgStat_Kind kind; /* statistics entry kind */
Oid dboid; /* database ID. InvalidOid for shared objects. */
Oid newfield;
uint64 objid; /* object ID (table, function, etc.), or
* identifier. */
} PgStat_HashKey;
```
It's important to observe that hole padding is added with this new
field:
```
(gdb) ptype /o PgStat_HashKey
type = struct PgStat_HashKey {
/* 0 | 4 */ uint32 kind;
/* 4 | 4 */ Oid dboid;
/* 8 | 4 */ Oid newfield;
/* XXX 4-byte hole */
/* 16 | 8 */ uint64 objid;
/* total size (bytes): 24 */
}
```
With "newfield" added, I started seeing the error "entry ref vanished
before deletion" when creating the cluster (or selecting from
pg_stat_database, etc.).
This error occurs when the local reference to a pgstat entry is deleted
and the entry can't be found in the simplehash pgStatEntryRefHash,
based on the key.
```
if (!pgstat_entry_ref_hash_delete(pgStatEntryRefHash, key))
elog(ERROR, "entry ref vanished before deletion");
```
This is surprising because this entry with the key is created earlier
via pgstat_get_entry_ref_cached -> pgstat_entry_ref_hash_insert.
When I brought this up to Bertrand (CC'd), he was not able to reproduce the
ERROR by adding a new field, and we quickly realized that it is related
to the compiler optimization (-O) flags we were using. He was building
with -O0 (no optimization) and I was building with -O2. So something
with optimization was causing the issue.
Starting a problematic cluster with Valgrind:
```
valgrind --leak-check=full --track-origins=yes \
$PGHOME/bin/postgres -D $PGDATA --single -P
```
shows messages like "Use of uninitialised value of size 8":
```
==26625== Use of uninitialised value of size 8
==26625== at 0x61D04F: pgstat_get_entry_ref_cached (pgstat_shmem.c:405)
==26625== by 0x61D04F: pgstat_get_entry_ref (pgstat_shmem.c:488)
==26625== by 0x6177B9: pgstat_fetch_entry (pgstat.c:977)
==26625== by 0x5433B2: rebuild_database_list (autovacuum.c:1002)
```
Further debugging led us to the fact that the key is passed-by-value to
```
pgstat_get_entry_ref_cached(PgStat_HashKey key, PgStat_EntryRef **entry_ref_p)
```
and in that routine the entry is created:
```
cache_entry = pgstat_entry_ref_hash_insert(pgStatEntryRefHash, key, &found);
``
The hash for the key is created with fasthash32(key, size, 0) and
compared against another key using memcmp(a, b, sizeof(PgStat_HashKey))
in the "pgstat_hash_hash_key" and "pgstat_cmp_hash_key" routines,
respectively.
With -O2, we observed that (on our testing environment):
pgstat_get_entry_ref_cached() is inlined
the inlined code does not copy the hole padding content when the key
is passed-by-value
So that while the hole initially contains zeroes:
```
/* clear padding */
memset(&key, 0, sizeof(struct PgStat_HashKey));
```
We observed that the zeroes in the hole are not copied to
pgstat_get_entry_ref_cached()
when passed-by-value with -O2.
GDB when setting a breakpoint at pgstat_entry_ref_hash_insert (which
is called inside
pgstat_get_entry_ref_cached) shows that the padding has some unexpected
values, see the 12-15th byte "0x00 0x23 0x21 0x21" which based on the
memset done earlier should be 0.
```
(gdb) x/24xb d
0x7fff68cc16f0: 0x02 0x00 0x00 0x00 0x05 0x00 0x00 0x00
0x7fff68cc16f8: 0x00 0x00 0x00 0x00 0x00 0x23 0x21 0x21
0x7fff68cc1700: 0x67 0x0a 0x00 0x00 0x00 0x00 0x00 0x00
(gdb)
```
Now, if we pass the key by reference to pgstat_entry_ref_hash_insert as is
patched, we can see the padding is in fact cleared.
```
(gdb) x/24xb d
0x7ffe1bf6c640: 0x01 0x00 0x00 0x00 0x05 0x00 0x00 0x00
0x7ffe1bf6c648: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7ffe1bf6c650: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
```
This looks like a compiler bug, we tested multiple ways to workaround:
1/ pg_noinline pgstat_get_entry_ref_cached
2/ making volatile the PgStat_HashKey key in the
pgstat_get_entry_ref_cached parameter
3/ Passing the key by reference instead of by value
4/ set SH_SCOPE to "external" rather than "static inline" in pgstat_shmem.c
it seems, at least for this case, the best option is to pass the key
by reference, like
in the attached patch.
+1
Not to mention that it is faster to pass the parameter by reference.
best regards,
Ranier Vilela
> +1 > Not to mention that it is faster to pass the parameter by reference. yes, there is potentially better performance. BTW. My mistake. The subject of this thread should be "PgStat_HashKey padding issue when passed by value" -- Sami Imseih Amazon Web Services (AWS)
On Thu, Sep 04, 2025 at 11:50:15AM -0500, Sami Imseih wrote: > Perhaps calling this a compiler bug is not appropriate. > However, memset is in fact called when the key is created > > ``` > /* clear padding */ > memset(&key, 0, sizeof(struct PgStat_HashKey)); > ``` > > but the zeroed out padding bytes are not retained when the > key is passed by value. This is why the proposal is to pass the > key by reference. So your point is that this impacts the hash lookups and the fact that we rely on passing PgStat_HashKey around due to the internals of how pgstat_shmem.c is shaped for its dshash. Did you run a full installcheck under valgrind to have more confidence that this is the only code path where we care about the padding of the passed-in values for the hash lookups? There are a lot of things that use simplehash.h, so one could question if we should think about that from a higher point of view, and if there could be some compiler magic that could be used to enforce a policy (don't think there is any magic for the padding of passed-in values, just wondering if there is something I don't know about that could justify a more global policy that can be applied to everything that uses simplehash.h). One idea would be, for example, that keys used with simplehash should never have any padding at all, and we could force a check in the shape of a static assertion to force a failure when attempting to compile code that attempts to do so. That would give us a way to check in a broader way if some code path do that currently, scaling better with the expectations we could have in the whole tree or even out-of-core extension code. -- Michael
Вложения
On Sat, Sep 06, 2025 at 10:35:58AM +0900, Michael Paquier wrote: > One idea would be, for example, that keys used with simplehash should > never have any padding at all, and we could force a check in the shape > of a static assertion to force a failure when attempting to compile > code that attempts to do so. That would give us a way to check in a > broader way if some code path do that currently, scaling better with > the expectations we could have in the whole tree or even out-of-core > extension code. Doing some research here, I have noticed this one: https://en.cppreference.com/w/cpp/types/has_unique_object_representations.html I was also wondering about some use of pg_attribute_packed() here, or perhaps enforce a check based on offsetof() and the structure size, but I doubt that any of that would be really portable across the buildfarm. Another idea would be to make sure that the sizeof() of the structure matches with the sum of the sizeof() for the individual fields in it. That's cumbersome to rely on, still simpler. Perhaps we could do something among these lines for pgstat_shmem.c, or just document that the structure should never have any padding. -- Michael
Вложения
Hi, On 2025-09-08 10:25:13 +0900, Michael Paquier wrote: > On Sat, Sep 06, 2025 at 10:35:58AM +0900, Michael Paquier wrote: > > One idea would be, for example, that keys used with simplehash should > > never have any padding at all, and we could force a check in the shape > > of a static assertion to force a failure when attempting to compile > > code that attempts to do so. That would give us a way to check in a > > broader way if some code path do that currently, scaling better with > > the expectations we could have in the whole tree or even out-of-core > > extension code. > > Doing some research here, I have noticed this one: > https://en.cppreference.com/w/cpp/types/has_unique_object_representations.html > > I was also wondering about some use of pg_attribute_packed() here, or > perhaps enforce a check based on offsetof() and the structure size, > but I doubt that any of that would be really portable across the > buildfarm. > > Another idea would be to make sure that the sizeof() of the structure > matches with the sum of the sizeof() for the individual fields in it. > That's cumbersome to rely on, still simpler. Perhaps we could do > something among these lines for pgstat_shmem.c, or just document that > the structure should never have any padding. I'd just add a comment mentioning that any padding bytes should be avoided. Greetings, Andres
> Did you run a full installcheck under valgrind to have more confidence > that this is the only code path where we care about the padding of the > passed-in values for the hash lookups? I did and could not reproduce "Use of uninitialised value..." on HEAD. I will mention that it's not a simple reproducible exercise, at least I could not repro with a simple C program that compiled with -O2. >> Another idea would be to make sure that the sizeof() of the structure >> matches with the sum of the sizeof() for the individual fields in it. >> That's cumbersome to rely on, still simpler. Perhaps we could do >> something among these lines for pgstat_shmem.c, or just document that >> the structure should never have any padding. > I'd just add a comment mentioning that any padding bytes should be avoided. or instead of avoiding padding, we continue to do what we do which is zeroing out the key when creating, but to make sure we pass it by reference so the hash_hash/hash_compare functions are operating on the same memor.I think that is all we really need. in v2 I added a comment to clarify why pass-by-reference is needed. Thoughts? -- Sami
Вложения
On Mon, Sep 08, 2025 at 12:08:12PM -0400, Andres Freund wrote: > On 2025-09-08 10:25:13 +0900, Michael Paquier wrote: >> Another idea would be to make sure that the sizeof() of the structure >> matches with the sum of the sizeof() for the individual fields in it. >> That's cumbersome to rely on, still simpler. Perhaps we could do >> something among these lines for pgstat_shmem.c, or just document that >> the structure should never have any padding. > > I'd just add a comment mentioning that any padding bytes should be avoided. Agreed on this part. I am wondering also about the addition of a static assertion as well, as it seems that this thread and the opinions on it point to such an addition having value, and requiring valgrind to detect that is annoying, especially if people propose more patches in the future that may affect the way we pass the hash key values in this area of the code. An idea is attached. -- Michael
Вложения
> > I'd just add a comment mentioning that any padding bytes should be avoided. > > Agreed on this part. > > I am wondering also about the addition of a static assertion as well, > as it seems that this thread and the opinions on it point to such an > addition having value, and requiring valgrind to detect that is > annoying, especially if people propose more patches in the future that > may affect the way we pass the hash key values in this area of the > code. An idea is attached. hmm, so if we decide to add a member that has a type that requires padding, the assert will fail? I am not sure I like this very much, since we lose flexibility on the struct member types that can be used. See the examples below on top of pgstat-hashkey-padding.patch The following will fail the assert since padding is needed for the new Oid member. ``` index d5869299fa8..4cdd41fda3a 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -53,6 +53,7 @@ typedef struct PgStat_HashKey { PgStat_Kind kind; /* statistics entry kind */ Oid dboid; /* database ID. InvalidOid for shared objects. */ + Oid field1; uint64 objid; /* object ID (table, function, etc.), or * identifier. */ } PgStat_HashKey; @@ -62,7 +63,7 @@ typedef struct PgStat_HashKey * size matches with the sum of each field is a check simple enough to * enforce this policy. */ -StaticAssertDecl((sizeof(PgStat_Kind) + sizeof(uint64) + sizeof(Oid)) == +StaticAssertDecl((sizeof(PgStat_Kind) + sizeof(uint64) + sizeof(Oid) + sizeof(Oid)) == sizeof(PgStat_HashKey), "PgStat_HashKey should have no padding"); ``` This will not fail the assert, since no padding is needed for the new member. ``` diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h index d5869299fa8..5732c89219d 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -53,6 +53,7 @@ typedef struct PgStat_HashKey { PgStat_Kind kind; /* statistics entry kind */ Oid dboid; /* database ID. InvalidOid for shared objects. */ + uint64 field1; uint64 objid; /* object ID (table, function, etc.), or * identifier. */ } PgStat_HashKey; @@ -62,7 +63,7 @@ typedef struct PgStat_HashKey * size matches with the sum of each field is a check simple enough to * enforce this policy. */ -StaticAssertDecl((sizeof(PgStat_Kind) + sizeof(uint64) + sizeof(Oid)) == +StaticAssertDecl((sizeof(PgStat_Kind) + sizeof(uint64) + sizeof(Oid) + sizeof(uint64)) == sizeof(PgStat_HashKey), "PgStat_HashKey should have no padding"); ``` -- Sami
On Mon, Sep 08, 2025 at 09:20:14PM -0500, Sami Imseih wrote: > The following will fail the assert since padding is needed for the > new Oid member. > > @@ -53,6 +53,7 @@ typedef struct PgStat_HashKey > { > PgStat_Kind kind; /* statistics entry kind */ > Oid dboid; /* database > ID. InvalidOid for shared objects. */ > + Oid field1; > uint64 objid; /* object ID (table, > function, etc.), or > > This will not fail the assert, since no padding is needed for the new member. > > @@ -53,6 +53,7 @@ typedef struct PgStat_HashKey > { > PgStat_Kind kind; /* statistics entry kind */ > Oid dboid; /* database > ID. InvalidOid for shared objects. */ > + uint64 field1; > uint64 objid; /* object ID (table, > function, etc.), or Exactly. Hence my proposal of static assertion is doing exactly the job I want it to do :) -- Michael
Вложения
Exactly. Hence my proposal of static assertion is doing exactly the
job I want it to do :)
OID field next, they will not be able to as that will be introducing padding.
On the other hand, passing the key by reference and documenting the reason in
pgstat_shmem.c will not lose this flexibility.
--
Sami
On Mon, Sep 08, 2025 at 09:36:52PM -0500, Sami Imseih wrote: > But my concern is the flexibility of this approach. If someone is to add an > OID field next, they will not be able to as that will be introducing > padding. On the other hand, passing the key by reference and > documenting the reason in pgstat_shmem.c will not lose this > flexibility. I don't mind discarding the static assertion idea, but at the end what counts for me here is that I don't want to sacrifice future changes in the pgstats code that would always require passing around the hash key by reference. So I would just do like attached, documenting at the top of PgStat_HashKey that we should not have padding in it, removing three memset(0) calls that expected it. -- Michael
Вложения
> On Mon, Sep 08, 2025 at 09:36:52PM -0500, Sami Imseih wrote: > > But my concern is the flexibility of this approach. If someone is to add an > > OID field next, they will not be able to as that will be introducing > > padding. On the other hand, passing the key by reference and > > documenting the reason in pgstat_shmem.c will not lose this > > flexibility. > > I don't mind discarding the static assertion idea, but at the end what > counts for me here is that I don't want to sacrifice future changes in > the pgstats code that would always require passing around the hash key > by reference. I don't see how this improves the situation, but will just make it more difficult to add a new field that requires padding in the future. If we are documenting either way, I rather that we just document the need to pass a key by reference, which is the pattern used in other areas ( see pgss_store and entry_alloc in pg_stat_statements.c ) Others may have a different opinion. -- Sami
Em qua., 10 de set. de 2025 às 23:53, Michael Paquier <michael@paquier.xyz> escreveu:
On Mon, Sep 08, 2025 at 09:36:52PM -0500, Sami Imseih wrote:
> But my concern is the flexibility of this approach. If someone is to add an
> OID field next, they will not be able to as that will be introducing
> padding. On the other hand, passing the key by reference and
> documenting the reason in pgstat_shmem.c will not lose this
> flexibility.
I don't mind discarding the static assertion idea, but at the end what
counts for me here is that I don't want to sacrifice future changes in
the pgstats code that would always require passing around the hash key
by reference.
So I would just do like attached, documenting at the
top of PgStat_HashKey that we should not have padding in it, removing
three memset(0) calls that expected it.
Currently no compiler guarantees that static initialization will fill possible holes,
something that *memset* guarantees.
I think this change is unsafe.
best regards,
Ranier Vilela
On Thu, Sep 11, 2025 at 10:21:45AM -0500, Sami Imseih wrote: > I don't see how this improves the situation, but will just make it more > difficult to add a new field that requires padding in the future. > > If we are documenting either way, I rather that we just document the need > to pass a key by reference, which is the pattern used in other areas > ( see pgss_store and entry_alloc in pg_stat_statements.c ) > > Others may have a different opinion. Yeah, I do care about the size of the hash key. So if someone goes on and proposes the addition of a new field while we already have 8 bytes for the object ID, that can itself be the hash of something else because we area already set up for life in terms of value friction, we will have an interesting debate. Adding padding is not something I'd be OK with, even if the hash key compaction could be enforced with a compiler attribute. And FWIW, I'm not much a fan of the expectations documented at the top of pgssHashKey with its padding bytes, neither am I convinced that it is a good idea to spread that more than necessary and bloat the size of the hash key. That's just my opinion, other opinions may differ of course, and I'm OK to be outvoted. -- Michael
Вложения
> > I don't see how this improves the situation, but will just make it more > > difficult to add a new field that requires padding in the future. > > > > If we are documenting either way, I rather that we just document the need > > to pass a key by reference, which is the pattern used in other areas > > ( see pgss_store and entry_alloc in pg_stat_statements.c ) > > > > Others may have a different opinion. > > Yeah, I do care about the size of the hash key. So if someone goes on > and proposes the addition of a new field while we already have 8 bytes > for the object ID, that can itself be the hash of something else > because we area already set up for life in terms of value friction, we > will have an interesting debate. Just to confirm, you are saying we are unlikely to ever add a new field to the key. Is that correct? -- Sami Imseih Amazon Web Services (AWS)
On Mon, Sep 15, 2025 at 04:47:27PM -0500, Sami Imseih wrote: > Just to confirm, you are saying we are unlikely to ever add a new field > to the key. Is that correct? I would rather avoid that, yes. -- Michael
Вложения
> On Mon, Sep 15, 2025 at 04:47:27PM -0500, Sami Imseih wrote: > > Just to confirm, you are saying we are unlikely to ever add a new field > > to the key. Is that correct? > > I would rather avoid that, yes. 7d85d87f4d5c35 added code to clear the padding bytes with memset in anticipation that the key could be changed in the future, in a way that padding will be introduced. So, if we are changing thoughts on this, we should add additional comments next to ``` + * NB: We assume that this struct contains no padding. ``` to enforce that the hash stored in objid should be used to support additional fields, rather than adding a field directly into the key. Will help future patch reviews/designs. -- Sami
On Tue, Sep 16, 2025 at 02:38:20PM -0500, Sami Imseih wrote: > 7d85d87f4d5c35 added code to clear the padding bytes with memset > in anticipation that the key could be changed in the future, in a way > that padding will be introduced. Yep. The argument raised on this thread with the requirement of the key being passed by reference has made me change my mind, because I did not thing that valgrind would complain with that. So yes, I'm backpedalling a bit. Sorry for the confusion. > So, if we are changing thoughts on > this, we should add additional comments next to > ``` > + * NB: We assume that this struct contains no padding. > ``` > to enforce that the hash stored in objid should be used to > support additional fields, rather than adding a field directly > into the key. Hmm. Do you have a specific suggestion for enhancement? I can think about something like this wording: "NB: We assume that this struct contains no padding. The 8 bytes allocated for the object ID are good enough to ensure the uniqueness of the hash key, hence the addition of new fields is not recommended." More suggestions or a better sentence are of course welcome. > Will help future patch reviews/designs. Cool, thanks. -- Michael
Вложения
> More suggestions or a better sentence are of course welcome. > "NB: We assume that this struct contains no padding. The 8 bytes > allocated for the object ID are good enough to ensure the uniqueness > of the hash key, hence the addition of new fields is not recommended." That sounds correct. However, I see the no padding and the objid as separate points. One could add a new field that does not introduce padding. So, I added "Also, " for clarity. "NB: We assume that this struct contains no padding. Also, 8 bytes allocated for the object ID are good enough to ensure the uniqueness of the hash key, hence the addition of new fields is not recommended." Also, what about we also add the assert as done earlier in this thread [0] to ensure that the struct indeed does not have padding? +/* + * PgStat_HashKey should not have any padding. Checking that the structure + * size matches with the sum of each field is a check simple enough to + * enforce this policy. + */ +StaticAssertDecl((sizeof(PgStat_Kind) + sizeof(uint64) + sizeof(Oid)) == + sizeof(PgStat_HashKey), + "PgStat_HashKey should have no padding"); + [0] https://www.postgresql.org/message-id/aL9zo5X0MsSxO2pM%40paquier.xyz -- Sami
On Wed, Sep 17, 2025 at 07:04:36PM -0500, Sami Imseih wrote: > "NB: We assume that this struct contains no padding. Also, 8 bytes > allocated for the object ID are good enough to ensure the uniqueness > of the hash key, hence the addition of new fields is not recommended." Works for me. > Also, what about we also add the assert as done earlier in this thread [0] > to ensure that the struct indeed does not have padding? I still want to add it, but it also seemed like you were not much a fan of it, so I did not really want to push forward with something that was not loved. :D Addressing your points, attached is an updated patch labelled v2. -- Michael
Вложения
> I still want to add it, but it also seemed like you were not much a > fan of it, so I did not really want to push forward with something > that was not loved. :D > > Addressing your points, attached is an updated patch labelled v2. I was not a fan of it, when my idea was to allow flexibility in adding more fields to the key. However, I am now convinced objid should be good enough. v2 looks good to me. Not sure if there are edge cases in which the assert will succeed even with padding (I can't think of any way that will be possible), so for now what you have seems good enough. -- Sami
On Thu, Sep 18, 2025 at 11:52:05AM -0500, Sami Imseih wrote: > v2 looks good to me. Not sure if there are edge cases in which > the assert will succeed even with padding (I can't think of any way > that will be possible), so for now what you have seems good enough. Okay, I've applied that on HEAD, then. I didn't see an urge in backpatching it. If there is an ask to do so, I could always do so later. -- Michael
Вложения
> On Thu, Sep 18, 2025 at 11:52:05AM -0500, Sami Imseih wrote: > > v2 looks good to me. Not sure if there are edge cases in which > > the assert will succeed even with padding (I can't think of any way > > that will be possible), so for now what you have seems good enough. > > Okay, I've applied that on HEAD, then. I didn't see an urge in > backpatching it. If there is an ask to do so, I could always do so > later. > -- > Michael Thank you! I update the CF entry [0] as committed. [0] https://commitfest.postgresql.org/patch/6033/ -- Sami
On Fri, Sep 19, 2025 at 08:52:24PM -0500, Sami Imseih wrote: > Thank you! I update the CF entry [0] as committed. > > [0] https://commitfest.postgresql.org/patch/6033/ Oops, missed this part. Thanks. -- Michael
Вложения
Hi, On Thu, Sep 18, 2025 at 11:52:05AM -0500, Sami Imseih wrote: > > I still want to add it, but it also seemed like you were not much a > > fan of it, so I did not really want to push forward with something > > that was not loved. :D > > > > Addressing your points, attached is an updated patch labelled v2. > > I was not a fan of it, when my idea was to allow flexibility in adding > more fields to the key. However, I am now convinced objid should be > good enough. + * NB: We assume that this struct contains no padding. Also, 8 bytes + * allocated for the object ID are good enough to ensure the uniqueness + * of the hash key, hence the addition of new fields is not recommended. yeah that would also work for [1], where the objoid could then store a spcOid and a relNumber (so that all the RelFileLocator fields would be stored) means: dboid (linked to RelFileLocator's dbOid) objoid (linked to RelFileLocator's spcOid and to the RelFileLocator's relNumber) [1]: https://www.postgresql.org/message-id/flat/ZlGYokUIlERemvpB%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com