Floris Van Nee <florisvannee@Optiver.com> writes:
> Hi,
> On a database we have we've recently seen a fatal error occur twice. The error happened on two different physical
> replicas (of the same cluster) during a WAL redo action in the recovery process. They're running Postgres 15.5.
> Occurrence 1:
> 2024-02-01 06:55:54.476 CET,,,70290,,65a29b60.11292,6,,2024-01-13 15:17:04 CET,1/0,0,FATAL,XX000,"can only drop stats
> once",,,,,"WAL redo at A7BD1/D6F9B6C0 for Transaction/COMMIT: 2024-02-01 06:55:54.395851+01; ...
Hmm. This must be coming from pgstat_drop_entry_internal.
I suspect the correct fix is in pgstat_drop_entry, along
the lines of
-	if (shent)
+	if (shent && !shent->dropped)
but it's not clear to me how the already-dropped case ought to affect
the function's bool result. Also, how are we getting into a
concurrent-drop situation in recovery?
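To make the shape of that guard concrete, here is a minimal standalone sketch. It is a toy model, not the PostgreSQL source: ToyEntry, toy_drop_internal, toy_drop_unguarded and toy_drop_guarded are hypothetical names, and the int return value only signals whether the "can only drop stats once" path was hit. It does not settle what the real function's bool result should be for an already-dropped entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a shared stats entry; only the flag matters here. */
typedef struct ToyEntry
{
	bool		dropped;
} ToyEntry;

/*
 * Models the internal drop step: a second drop of the same entry is
 * reported as an error (-1) instead of raising one, so the sketch can be
 * exercised without a server.
 */
int
toy_drop_internal(ToyEntry *e)
{
	assert(e != NULL);
	if (e->dropped)
	{
		fprintf(stderr, "can only drop stats once\n");
		return -1;
	}
	e->dropped = true;
	return 0;
}

/* Caller without the guard: a repeated drop reaches the error path. */
int
toy_drop_unguarded(ToyEntry *e)
{
	if (e)
		return toy_drop_internal(e);
	return 0;
}

/*
 * Caller with the proposed guard: an already-dropped entry is skipped.
 * Returning 0 here is an assumption made for the sketch only.
 */
int
toy_drop_guarded(ToyEntry *e)
{
	if (e && !e->dropped)
		return toy_drop_internal(e);
	return 0;
}
```

Calling toy_drop_guarded twice on the same entry succeeds both times, whereas a second call through toy_drop_unguarded hits the error path. The sketch says nothing about why a second drop would be attempted during recovery in the first place.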
regards, tom lane