Обсуждение: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

Поиск
Список
Период
Сортировка

Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear hackers,

I found that postgres could not be built if a complier option HASH_STATISTICS is
set [1]. Also, I found HASH_DEBUG option caused warnings [2]. Usage of the are
mentioned at the code comments in dynahash.c.

I'm not sure whether we would keep supporting them because no one may not have
used anymore now.

Anyway, I tried to fix the error/warnings. Please see attached.

[1]:
```
dynahash.c: In function ‘hash_update_hash_key’:
dynahash.c:1178:9: error: ‘hctl’ undeclared (first use in this function)
 1178 |         hctl->accesses++;
      |         ^~~~
```

[2]:
```
dynahash.c: In function ‘init_htab’:
dynahash.c:779:68: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 12 has type ‘uint32’ {aka
‘unsignedint’} [-Wformat=] 
  779 |         fprintf(stderr, "init_htab:\n%s%p\n%s%ld\n%s%ld\n%s%d\n%s%ld\n%s%u\n%s%x\n%s%x\n%s%ld\n",
      |                                                                  ~~^
      |                                                                    |
      |                                                                    long int
      |                                                                  %d
......
  784 |                         "MAX BUCKET      ", hctl->max_bucket,
      |                                             ~~~~~~~~~~~~~~~~
      |                                                 |
      |
dynahash.c:779:86: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 18 has type ‘long int’
[-Wformat=]
  779 |    fprintf(stderr, "init_htab:\n%s%p\n%s%ld\n%s%ld\n%s%d\n%s%ld\n%s%u\n%s%x\n%s%x\n%s%ld\n",
      |                                                                                ~^
      |                                                                                 |
      |                                                                                 unsigned in
      |                                                                                %lx
......
  787 |                    "NSEGS           ", hctl->nsegs);
      |                                        ~~~~~~~~~~~
      |                                            |
      |                                            long int
dynahash.c:779:90: warning: format ‘%s’ expects a matching ‘char *’ argument [-Wformat=]
  779 |   fprintf(stderr, "init_htab:\n%s%p\n%s%ld\n%s%ld\n%s%d\n%s%ld\n%s%u\n%s%x\n%s%x\n%s%ld\n",
      |                                                                                   ~^
      |                                                                                    |
      |                                                                                    char *

dynahash.c:779:93: warning: format ‘%ld’ expects a matching ‘long int’ argument [-Wformat=]
  779 |   fprintf(stderr, "init_htab:\n%s%p\n%s%ld\n%s%ld\n%s%d\n%s%ld\n%s%u\n%s%x\n%s%x\n%s%ld\n",
      |                                                                                     ~~^
      |                                                                                       |
      |                                                                                       long int
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Thu, 14 Aug 2025 at 23:48, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
> I found that postgres could not be built if a complier option HASH_STATISTICS is
> set [1]. Also, I found HASH_DEBUG option caused warnings [2]. Usage of the are
> mentioned at the code comments in dynahash.c.

> [1]:
> ```
> dynahash.c: In function ‘hash_update_hash_key’:
> dynahash.c:1178:9: error: ‘hctl’ undeclared (first use in this function)
>  1178 |         hctl->accesses++;
>       |         ^~~~
> ```

Looks to be new to PG17 introduced in cc5ef90ed.

> [2]:
> ```
> dynahash.c: In function ‘init_htab’:
> dynahash.c:779:68: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 12 has type ‘uint32’ {aka
‘unsignedint’} [-Wformat=] 
>   779 |         fprintf(stderr, "init_htab:\n%s%p\n%s%ld\n%s%ld\n%s%d\n%s%ld\n%s%u\n%s%x\n%s%x\n%s%ld\n",

This dates back to PG14 from be0a66666. Looks like the author
miscounted the parameter in the format string and deleted starting 1
parameter too soon.

I'll address these and backpatch once the freeze is lifted from the
backbranches. That should be quite soon.

Thanks

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
> I found that postgres could not be built if a complier option HASH_STATISTICS is
> set [1]. Also, I found HASH_DEBUG option caused warnings [2]. Usage of the are
> mentioned at the code comments in dynahash.c.

> I'm not sure whether we would keep supporting them because no one may not have
> used anymore now.

Ugh.  I did a bit of "git bisect" work and discovered that:

1.  HASH_DEBUG got broken at

commit 44ca4022f3f9297bab5cbffdd97973dbba1879ed
Author: Robert Haas <rhaas@postgresql.org>
Date:   Wed Mar 23 10:56:23 2016 -0400

    Partition the freelist for shared dynahash tables.

and was fixed again at

commit 9d4e56699957b261390c50dcda7f947470017484
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Aug 2 12:16:50 2017 -0400

    Remove broken and useless entry-count printing in HASH_DEBUG code.

and then broke again at

commit be0a6666656ec3f68eb7d8e7abab5139fcd47012
Author: Thomas Munro <tmunro@postgresql.org>
Date:   Sat Sep 19 11:28:34 2020 +1200

    Remove large fill factor support from dynahash.c.

2.  HASH_STATISTICS got broken at

commit cc5ef90edd809eaf85e11a0ee251229bbf7ce798
Author: Michael Paquier <michael@paquier.xyz>
Date:   Fri Mar 15 07:57:17 2024 +0900

    Refactor initial hash lookup in dynahash.c


Based on this, I think we should remove the HASH_DEBUG support.
It's been broken for six of the last nine years and only one
person ever noticed.  Moreover, if you were trying to find a
problem in dynahash, you'd probably want different debug logging
than what's here.

There might be a case for removing HASH_STATISTICS too, but
it seems weaker.  I do recall having made use of that code
once years ago ...

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> I'll address these and backpatch once the freeze is lifted from the
> backbranches. That should be quite soon.

The freeze was over when we tagged the releases, a day and a half
ago now.

As I just responded to Hayato-san, I think there's a good case
for removing the HASH_DEBUG code rather than fixing it (again).

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Fri, 15 Aug 2025 at 02:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As I just responded to Hayato-san, I think there's a good case
> for removing the HASH_DEBUG code rather than fixing it (again).

No objections here. It feels like it must not get used very often if
it's taken almost 5 years for someone to notice the warning.

The reason I thought to backpatch the fix was that I was less sure
about removing it in backbranches and thought that should just be
fixed and any consideration to remove it would be in master only.

David



RE: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Tom,

Thanks for the analysis.

> Based on this, I think we should remove the HASH_DEBUG support.
> It's been broken for six of the last nine years and only one
> person ever noticed.  Moreover, if you were trying to find a
> problem in dynahash, you'd probably want different debug logging
> than what's here.

I didn't realize that it has been broken for a long time. Agreed to remove
the HASH_DEBUG option.

> There might be a case for removing HASH_STATISTICS too, but
> it seems weaker.  I do recall having made use of that code
> once years ago ...

I have never used both options, but one point is that hash_stats() has been
exported and extensions have called it. Is there a possibility that hidden
developer is using?

Anyway, I have no strong opinion. Attached patch does 1) remove HASH_DEBUG and
2) fix broken code with HASH_STATISTICS.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Fri, Aug 15, 2025 at 03:22:18AM +0000, Hayato Kuroda (Fujitsu) wrote:
>> Based on this, I think we should remove the HASH_DEBUG support.
>> It's been broken for six of the last nine years and only one
>> person ever noticed.  Moreover, if you were trying to find a
>> problem in dynahash, you'd probably want different debug logging
>> than what's here.
>
> I didn't realize that it has been broken for a long time. Agreed to remove
> the HASH_DEBUG option.

One use case where I've been recently relying on dynahash is the
pgstats shmem code.  While I've not used these symbols, I'm pretty
sure that they could prove useful now that I know about them.  Andres,
an opinion perhaps?

> I have never used both options, but one point is that hash_stats() has been
> exported and extensions have called it. Is there a possibility that hidden
> developer is using?
>
> Anyway, I have no strong opinion. Attached patch does 1) remove HASH_DEBUG and
> 2) fix broken code with HASH_STATISTICS.

It looks like I'm responsible for breaking one of them, at least,
sorry for that.  As far as I can see, the fixes are not complicated,
so I'd rather fix and backpatch things rather than removing both as
both combined can be quite powerful when debugging or improving this
code.  David, were you planning to do so?  I'd vote for keeping these
working.
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> writes:
>> There might be a case for removing HASH_STATISTICS too, but
>> it seems weaker.  I do recall having made use of that code
>> once years ago ...

> I have never used both options, but one point is that hash_stats() has been
> exported and extensions have called it. Is there a possibility that hidden
> developer is using?

Seems unlikely.  They couldn't count on it doing anything at all
in a standard build.  Moreover, even if HASH_STATISTICS is defined,
what it will do is print to stderr which is hardly a production-
friendly behavior.  So on the whole I'd bet lunch that no one
depends on hash_stats().

In any case, we'd be talking about a change in master only,
not something we'd back-patch.  So there would be time for
people to complain if anyone really cares.

> Anyway, I have no strong opinion. Attached patch does 1) remove HASH_DEBUG and
> 2) fix broken code with HASH_STATISTICS.

Whatever we do with this, let's do it in two separate patches,
just in case someone argues for reverting one or the other.
The issues don't seem to me to be fundamentally connected.

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Fri, 15 Aug 2025 at 15:42, Michael Paquier <michael@paquier.xyz> wrote:
> It looks like I'm responsible for breaking one of them, at least,
> sorry for that.  As far as I can see, the fixes are not complicated,
> so I'd rather fix and backpatch things rather than removing both as
> both combined can be quite powerful when debugging or improving this
> code.  David, were you planning to do so?  I'd vote for keeping these
> working.

Yes, I'm about to push the HASH_STATISTICS one and backpatch to v17.

I think we should fix and backpatch the HASH_DEBUG one. We can have a
separate debate on removing it in master only. It's hard to imagine
anyone objecting to fixing it, so let's just do that for the time
being.

FWIW, I've no personal need to keep HASH_DEBUG, but if someone does,
then let's keep it. Maybe we can make it use elog(DEBUG<N>) rather
than fprintf and at least build with it in some BF member so we notice
sooner if someone breaks it again. (I've not checked if there's a good
reason why we can't use elog(). Perhaps something dynahash related
happens to early in backend startup...)

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Fri, Aug 15, 2025 at 04:48:10PM +1200, David Rowley wrote:
> Yes, I'm about to push the HASH_STATISTICS one and backpatch to v17.

Thanks.

> I think we should fix and backpatch the HASH_DEBUG one. We can have a
> separate debate on removing it in master only. It's hard to imagine
> anyone objecting to fixing it, so let's just do that for the time
> being.

Sounds good to me.
--
Michael

Вложения

RE: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear David,

> FWIW, I've no personal need to keep HASH_DEBUG, but if someone does,
> then let's keep it. Maybe we can make it use elog(DEBUG<N>) rather
> than fprintf and at least build with it in some BF member so we notice
> sooner if someone breaks it again. (I've not checked if there's a good
> reason why we can't use elog(). Perhaps something dynahash related
> happens to early in backend startup...)

Just in case: actually, even if the HASH_DEBUG part is fixed on PG17, it could
not pass some tests. One example is initdb test [1].
ISTM, command_like() assumed that there are no outputs in stderr but this option
does. This meant no BF animals cannot set this option as-is.
After I changed them to elog(DEBUG1) (and set debug1 as default) I ran tests
under src/test, and they could pass.

[1]:
```
[13:55:38.836](0.001s) not ok 29 - options --locale-provider=icu --locale=und --lc-*=C: no stderr
[13:55:38.836](0.000s) 
[13:55:38.836](0.000s) #   Failed test 'options --locale-provider=icu --locale=und --lc-*=C: no stderr'
#   at t/001_initdb.pl line 131.
[13:55:38.837](0.001s) #          got: 'init_htab:
# TABLE POINTER   0x180ee70
# DIRECTORY SIZE  256
# SEGMENT SIZE    256
# SEGMENT SHIFT   8
# MAX BUCKET      3
# HIGH MASK       7
# LOW  MASK       3
# NSEGS           1
...
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Fri, 15 Aug 2025 at 17:25, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Aug 15, 2025 at 04:48:10PM +1200, David Rowley wrote:
> > Yes, I'm about to push the HASH_STATISTICS one and backpatch to v17.
>
> Thanks.

Done

> > I think we should fix and backpatch the HASH_DEBUG one. We can have a
> > separate debate on removing it in master only. It's hard to imagine
> > anyone objecting to fixing it, so let's just do that for the time
> > being.
>
> Sounds good to me.

Also done. (I didn't use the proposed patch to fix that as the
proposed format wasn't correct. HIGH MASK was being changed to use %u
instead of %x, which I thought was unintended.)

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Fri, 15 Aug 2025 at 17:46, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
> Just in case: actually, even if the HASH_DEBUG part is fixed on PG17, it could
> not pass some tests. One example is initdb test [1].
> ISTM, command_like() assumed that there are no outputs in stderr but this option
> does. This meant no BF animals cannot set this option as-is.
> After I changed them to elog(DEBUG1) (and set debug1 as default) I ran tests
> under src/test, and they could pass.

Yeah, I noticed the tests wouldn't pass with it enabled. I think now
that the format bug is fixed in the back branches, we can decide what
we're going to do in master.

I think if we're going to keep HASH_DEBUG, then we should try and do
something to help ensure it does not get broken again for an excessive
period of time. I'm not against switching to elog(DEBUG), but then I
don't have any idea why it was outputting to stderr. If someone wants
to switch their BF animal to define HASH_DEBUG then maybe we can
commit the elog(DEBUG) change to master only and see if anyone
complains. (I suspect nobody will).

I think the votes to keep it should outweigh the votes to get rid of
it, as someone voting to keep it probably has some idea that it'll be
useful to them, and keeping it shouldn't really hinder the people who
don't want it. I think Michael has voted to keep it.

Michael, any thoughts on switching from stderr to elog(DEBUG)? And
also about defining HASH_DEBUG on one of your BF members?

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Fri, Aug 15, 2025 at 06:37:44PM +1200, David Rowley wrote:
> I think the votes to keep it should outweigh the votes to get rid of
> it, as someone voting to keep it probably has some idea that it'll be
> useful to them, and keeping it shouldn't really hinder the people who
> don't want it. I think Michael has voted to keep it.
>
> Michael, any thoughts on switching from stderr to elog(DEBUG)?

You mean to do that for both flags, right?  That would be OK for me.

> And also about defining HASH_DEBUG on one of your BF members?

Sure, I'll do that for both HASH_DEBUG and HASH_STATISTICS if we
decide to keep both and that a fix is applied.  We've proven to be bad
at breaking the two of them.
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Fri, 15 Aug 2025 at 19:09, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Aug 15, 2025 at 06:37:44PM +1200, David Rowley wrote:
> > Michael, any thoughts on switching from stderr to elog(DEBUG)?
>
> You mean to do that for both flags, right?  That would be OK for me.

Yeah. Just to aid discussion, let's say the attached patch.

I don't really have an idea on which debug level these should appear
on, however. I picked DEBUG4 for no other reason than DEBUG5 being
quite verbose with AIO related messages.

David

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> Yeah. Just to aid discussion, let's say the attached patch.

I think if we want to claim that these bits represent actually
usable functionality, we need to do more than s/fprintf/elog/.
Some points:

* Neither printout identifies which hashtable it's talking about
in any usable fashion, which is silly when we could print
hashp->tabname.  HASH_DEBUG prints the pointer to the table,
which is certainly useless unless you've got gdb attached to
the session, and probably useless even then without the tabname.

* The output formats are randomly inconsistent with each other,
and don't look much like other outputs either.  I'm particularly
vexed by the fact that you could not usefully grep the log for
this data.  I think we should switch them to a single log line so
that grepping for a hashtable name could produce something useful.

* As it stands, HASH_DEBUG prints only at hashtable creation and
HASH_STATISTICS prints only at table destruction.  Is that really
enough? I'm thinking in particular of shared and session-lifespan
hashtables, which won't ever receive a hash_destroy call.

* One fairly believable use-case for hash_stats() is to be called
manually from a debugger.  To make that a little easier, I think
we should fix it to not crash if "where" is NULL.

> I don't really have an idea on which debug level these should appear
> on, however. I picked DEBUG4 for no other reason than DEBUG5 being
> quite verbose with AIO related messages.

No opinion here either.

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Sat, 16 Aug 2025 at 03:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> * Neither printout identifies which hashtable it's talking about
> in any usable fashion, which is silly when we could print
> hashp->tabname.  HASH_DEBUG prints the pointer to the table,
> which is certainly useless unless you've got gdb attached to
> the session, and probably useless even then without the tabname.
>
> * The output formats are randomly inconsistent with each other,
> and don't look much like other outputs either.  I'm particularly
> vexed by the fact that you could not usefully grep the log for
> this data.  I think we should switch them to a single log line so
> that grepping for a hashtable name could produce something useful.
>
> * As it stands, HASH_DEBUG prints only at hashtable creation and
> HASH_STATISTICS prints only at table destruction.  Is that really
> enough? I'm thinking in particular of shared and session-lifespan
> hashtables, which won't ever receive a hash_destroy call.
>
> * One fairly believable use-case for hash_stats() is to be called
> manually from a debugger.  To make that a little easier, I think
> we should fix it to not crash if "where" is NULL.

I've attached another patch with all these things fixed up.

For the single line format, I made this use a similar format to how we
display various properties in text based EXPLAIN.

I noticed that the existing code has a set of global variables that
keeps track of accesses, collisions and expansions for *all* tables in
the backend. This didn't fit well with the single line elog. I kinda
though the global information was a bit strange, so I just got rid of
it. There was no "expansions" field to track the number of expansions
for a single table, so I added one. I made all these uint64. I felt
long (which can be 32-bit on some platforms) was too small for
tracking the number of hash table accesses. Likely uint64 is too wide
to track the expansions. That likely could be made smaller, but since
these are not enabled by default, they're not taking up any struct
space in normal builds.

This is from using the debugger:
2025-08-17 00:04:52.206 NZST [962296] DEBUG:  hash_stats:  Caller:
(unknown)  Table Name: "Relcache by OID"  Accesses: 194  Collisions:
23  Expansions: 0  Entries: 139  Key Size: 4  Max Bucket: 511  Segment
Count: 2

and some samples from the log:
2025-08-17 00:05:13.241 NZST [962325] DEBUG:  init_htab: Table Name:
"Relcache by OID"  Directory Size: 256  Segment Size: 256  Segment
Shift: 8  Max Bucket: 511  High Mask: 3ff  Low Mask: 1ff  Number
Segments: 2
2025-08-17 00:05:13.241 NZST [962325] DEBUG:  init_htab: Table Name:
"Portal hash"  Directory Size: 256  Segment Size: 256  Segment Shift:
8  Max Bucket: 15  High Mask: 1f  Low Mask: f  Number Segments: 1
2025-08-17 00:05:13.241 NZST [962325] DEBUG:  init_htab: Table Name:
"smgr relation table"  Directory Size: 256  Segment Size: 256  Segment
Shift: 8  Max Bucket: 511  High Mask: 3ff  Low Mask: 1ff  Number
Segments: 2
2025-08-17 00:05:13.243 NZST [962325] DEBUG:  init_htab: Table Name:
"Operator class cache"  Directory Size: 256  Segment Size: 256
Segment Shift: 8  Max Bucket: 63  High Mask: 7f  Low Mask: 3f  Number
Segments: 1

David

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Sun, Aug 17, 2025 at 12:24:29AM +1200, David Rowley wrote:
> I noticed that the existing code has a set of global variables that
> keeps track of accesses, collisions and expansions for *all* tables in
> the backend. This didn't fit well with the single line elog. I kinda
> though the global information was a bit strange, so I just got rid of
> it. There was no "expansions" field to track the number of expansions
> for a single table, so I added one. I made all these uint64. I felt
> long (which can be 32-bit on some platforms) was too small for
> tracking the number of hash table accesses. Likely uint64 is too wide
> to track the expansions. That likely could be made smaller, but since
> these are not enabled by default, they're not taking up any struct
> space in normal builds.

-#ifdef HASH_STATISTICS
-static long hash_accesses,
-            hash_collisions,
-            hash_expansions;
-#endif

These global counters are as old as d31084e9d111.  Removing them
should not be a problem.

Using DEBUG4 looks sensible here for this purpose.

Side thing..  I'm wondering what prevents us from wiping out entirely
the use of long in this file.  long is 8 bytes everywhere, except on
WIN32 where it's 4 bytes (as you say), which is a bad practice as we
have been bitten by overflows because of this dependency in the patch.
Not related to this patch, still seems worth cleaning up while looking
at this code.  There are a few more things like HASHCTL, of course..
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Sun, Aug 17, 2025 at 03:54:16PM +0900, Michael Paquier wrote:
> Side thing..  I'm wondering what prevents us from wiping out entirely
> the use of long in this file.  long is 8 bytes everywhere, except on
> WIN32 where it's 4 bytes (as you say), which is a bad practice as we
> have been bitten by overflows because of this dependency in the patch.

...  "this dependency in the past", not "patch".
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Sun, 17 Aug 2025 at 18:54, Michael Paquier <michael@paquier.xyz> wrote:
> -#ifdef HASH_STATISTICS
> -static long hash_accesses,
> -                       hash_collisions,
> -                       hash_expansions;
> -#endif
>
> These global counters are as old as d31084e9d111.  Removing them
> should not be a problem.

Ancient!

> Side thing..  I'm wondering what prevents us from wiping out entirely
> the use of long in this file.  long is 8 bytes everywhere, except on
> WIN32 where it's 4 bytes (as you say), which is a bad practice as we
> have been bitten by overflows because of this dependency in the patch.
> Not related to this patch, still seems worth cleaning up while looking
> at this code.  There are a few more things like HASHCTL, of course..

No objections here. I think we're generally chipping away at that
problem anyway. At least, I did some of that recently, and I recall
Tom adjusted things to allow windows to have > 2GB work_mem, which was
a restriction imposed by 32-bit longs.

One last thing, in order to inform people of breakages sooner than a
post-commit report from the buildfarm, I wondered is if we should do:

-#ifdef HASH_DEBUG
+#if defined(HASH_DEBUG) || defined(USE_ASSERT_CHECKING)

The HASH_DEBUG does not add any extra fields, so the overhead only
amounts to the elog(DEBUG4) line.  HASH_STATISTICS adds extra fields
and counter incrementing, so I don't propose the same treatment for
that.

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Mon, Aug 18, 2025 at 12:56:02PM +1200, David Rowley wrote:
> One last thing, in order to inform people of breakages sooner than a
> post-commit report from the buildfarm, I wondered is if we should do:
>
> -#ifdef HASH_DEBUG
> +#if defined(HASH_DEBUG) || defined(USE_ASSERT_CHECKING)
>
> The HASH_DEBUG does not add any extra fields, so the overhead only
> amounts to the elog(DEBUG4) line.  HASH_STATISTICS adds extra fields
> and counter incrementing, so I don't propose the same treatment for
> that.

If we do that, I guess that we could just remove HASH_DEBUG, keeping
only HASH_STATISTICS.
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Mon, 18 Aug 2025 at 13:10, Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Aug 18, 2025 at 12:56:02PM +1200, David Rowley wrote:
> > -#ifdef HASH_DEBUG
> > +#if defined(HASH_DEBUG) || defined(USE_ASSERT_CHECKING)
> >
> > The HASH_DEBUG does not add any extra fields, so the overhead only
> > amounts to the elog(DEBUG4) line.  HASH_STATISTICS adds extra fields
> > and counter incrementing, so I don't propose the same treatment for
> > that.
>
> If we do that, I guess that we could just remove HASH_DEBUG, keeping
> only HASH_STATISTICS.

I wondered about that and thought that there might be an above zero
chance that someone would want HASH_DEBUG without USE_ASSERT_CHECKING.
I don't really know if that person exists. It certainly isn't me.

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> On Mon, 18 Aug 2025 at 13:10, Michael Paquier <michael@paquier.xyz> wrote:
>> If we do that, I guess that we could just remove HASH_DEBUG, keeping
>> only HASH_STATISTICS.

> I wondered about that and thought that there might be an above zero
> chance that someone would want HASH_DEBUG without USE_ASSERT_CHECKING.
> I don't really know if that person exists. It certainly isn't me.

Yeah, it's really quite unclear what the existing HASH_DEBUG printout
is good for.  At least in our usage, it doesn't tell you anything
you can't discover from static code analysis.  I'm +1 for just
dropping it altogether.

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Mon, 18 Aug 2025 at 13:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I wondered about that and thought that there might be an above zero
> > chance that someone would want HASH_DEBUG without USE_ASSERT_CHECKING.
> > I don't really know if that person exists. It certainly isn't me.
>
> Yeah, it's really quite unclear what the existing HASH_DEBUG printout
> is good for.  At least in our usage, it doesn't tell you anything
> you can't discover from static code analysis.  I'm +1 for just
> dropping it altogether.

I'm starting to lean more towards that myself. I had mostly just been
motivated to finding a way to prevent it from existing in a broken
state again.

HASH_STATISTICS I can imagine is more useful as that information isn't
otherwise recorded anywhere.

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Mon, Aug 18, 2025 at 02:19:06PM +1200, David Rowley wrote:
> On Mon, 18 Aug 2025 at 13:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, it's really quite unclear what the existing HASH_DEBUG printout
>> is good for.  At least in our usage, it doesn't tell you anything
>> you can't discover from static code analysis.  I'm +1 for just
>> dropping it altogether.
>
> I'm starting to lean more towards that myself. I had mostly just been
> motivated to finding a way to prevent it from existing in a broken
> state again.

+1.

> HASH_STATISTICS I can imagine is more useful as that information isn't
> otherwise recorded anywhere.

By the way, once we have reached a conclusion here, I'll go update one
of my animals to use what's remaining of the flags, so as this is
captured in the future.
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Mon, 18 Aug 2025 at 17:06, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Aug 18, 2025 at 02:19:06PM +1200, David Rowley wrote:
> > On Mon, 18 Aug 2025 at 13:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Yeah, it's really quite unclear what the existing HASH_DEBUG printout
> >> is good for.  At least in our usage, it doesn't tell you anything
> >> you can't discover from static code analysis.  I'm +1 for just
> >> dropping it altogether.
> >
> > I'm starting to lean more towards that myself. I had mostly just been
> > motivated to finding a way to prevent it from existing in a broken
> > state again.
>
> +1.

Ok. I've attached 2 patches. 0001 adjusted the HASH_STATISTICS output
to use DEBUG4 and 0002 removes HASH_DEBUG.

I've purposefully left references to HASH_DEBUG in the "Original
comments" section near the top of dynahash.c. That comment also
references function names that no longer exist.

> > HASH_STATISTICS I can imagine is more useful as that information isn't
> > otherwise recorded anywhere.
>
> By the way, once we have reached a conclusion here, I'll go update one
> of my animals to use what's remaining of the flags, so as this is
> captured in the future.

Sounds good. Thanks.

David

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Mon, Aug 18, 2025 at 05:55:33PM +1200, David Rowley wrote:
> I've purposefully left references to HASH_DEBUG in the "Original
> comments" section near the top of dynahash.c. That comment also
> references function names that no longer exist.

Ah, right, the hcreate() and hdestroy() business.  LGTM.
--
Michael

Вложения

Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> I've purposefully left references to HASH_DEBUG in the "Original
> comments" section near the top of dynahash.c. That comment also
> references function names that no longer exist.

Hm, there is actually no part of that comment para that is accurate
anymore, so I cannot imagine a reason for leaving it alone.  People
who want to do archaeology can consult the git history.  How about
reducing the para to

 * HASH_STATISTICS causes some usage statistics to be maintained,
 * which can be logged by calling hash_stats().

LGTM otherwise.

            regards, tom lane



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
David Rowley
Дата:
On Tue, 19 Aug 2025 at 01:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I've purposefully left references to HASH_DEBUG in the "Original
> > comments" section near the top of dynahash.c. That comment also
> > references function names that no longer exist.
>
> Hm, there is actually no part of that comment para that is accurate
> anymore, so I cannot imagine a reason for leaving it alone.  People
> who want to do archaeology can consult the git history.  How about
> reducing the para to
>
>  * HASH_STATISTICS causes some usage statistics to be maintained,
>  * which can be logged by calling hash_stats().

Ok. I pushed it like that. Thanks for looking.

David



Re: Compilation issues for HASH_STATISTICS and HASH_DEBUG options

От
Michael Paquier
Дата:
On Tue, Aug 19, 2025 at 11:21:06AM +1200, David Rowley wrote:
> Ok. I pushed it like that. Thanks for looking.

Added a -DHASH_STATISTICS to batta, that runs only the master branch.
--
Michael

Вложения