Discussion: Re: pgsql: Allow users to limit storage reserved by replication slots
On 2020-Apr-07, Alvaro Herrera wrote:

>  src/test/recovery/t/019_replslot_limit.pl | 217 +++++++++++++++++++++++++

I fixed the perlcritic complaint from buildfarm member crake, but
there's a new one in francolin:

    #   Failed test 'check that the slot state changes to "reserved"'
    #   at t/019_replslot_limit.pl line 125.
    #          got: '0/15000D8|reserved|216 bytes'
    #     expected: '0/1500000|reserved|216 bytes'

    #   Failed test 'check that the slot state changes to "lost"'
    #   at t/019_replslot_limit.pl line 135.
    #          got: '0/15000D8|lost|t'
    #     expected: '0/1500000|lost|t'
    # Looks like you failed 2 tests of 13.
    [23:07:28] t/019_replslot_limit.pl ..............

where the Perl code is:

    $start_lsn = $node_master->lsn('write');
    $node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
    $node_standby->stop;

    # Advance WAL again without checkpoint, reducing remain by 6 MB.
    advance_wal($node_master, 6);

    # Slot gets into 'reserved' state
    $result = $node_master->safe_psql('postgres',
        "SELECT restart_lsn, wal_status, pg_size_pretty(restart_lsn - min_safe_lsn)as remain FROM pg_replication_slots WHERE slot_name = 'rep1'");
    is($result, "$start_lsn|reserved|216 bytes",
        'check that the slot state changes to "reserved"');

0xD8 is 216, so this seems to be saying that the checkpoint record was
skipped by the restart_lsn. I'm not clear exactly why that happened ...
is this saying that a checkpoint occurred?

One easy fix would be to remove the "restart_lsn" output column from the
query, but do we lose test specificity? (I think the answer is no.)

However, even with that change, we're still testing that a checkpoint is
216 bytes ... in other words, whenever someone changes the definition of
struct CheckPoint, this test will fail. That seems unnecessary and
unfriendly. I'm not sure how to improve that without also removing that
column.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
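The "0xD8 is 216" observation can be checked by hand. The following is a minimal standalone sketch in Python, not part of the test suite; the helper name parse_lsn is made up for illustration. It relies only on the fact that an LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN string such as '0/15000D8' into a 64-bit byte position.

    parse_lsn is a hypothetical helper for this illustration only.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Values copied from the francolin failure quoted above.
got = parse_lsn("0/15000D8")
expected = parse_lsn("0/1500000")

# The reported restart_lsn is ahead of the expected one by exactly 0xD8 bytes.
print(got - expected)       # 216
print(hex(got - expected))  # 0xd8
```

The offset between the two LSNs equals the "216 bytes" shown in the third column, which is consistent with the guess that restart_lsn moved past a single record of that size.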
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I fixed the perlcritic complaint from buildfarm member crake, but
> there's a new one in francolin:

Other buildfarm members are showing related-but-different failures.
I think this test is just plain unstable.

regards, tom lane
Hi,

On April 7, 2020 6:13:51 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I fixed the perlcritic complaint from buildfarm member crake, but
>> there's a new one in francolin:
>
>Other buildfarm members are showing related-but-different failures.
>I think this test is just plain unstable.

I have not looked at the source, but the error messages show LSNs and
bytes. I can't really imagine how that could be made stable.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Tue, Apr 07, 2020 at 07:10:07PM -0700, Andres Freund wrote:
> I have not looked at the source, but the error messages show LSNs
> and bytes. I can't really imagine how that could be made stable.

Another piece of bad news is that this is page-size dependent. What if
you removed pg_size_pretty() and replaced it with a condition that
returns a boolean status in the result itself?
--
Michael
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> However, even with that change, we're still testing that a checkpoint is
> 216 bytes ... in other words, whenever someone changes the definition of
> struct CheckPoint, this test will fail. That seems unnecessary and
> unfriendly. I'm not sure how to improve that without also removing that
> column.

I read florican's results as showing that sizeof(CheckPoint) is already
different on 32-bit machines than 64-bit; it's repeatably getting this:

    #   Failed test 'check that the slot state changes to "reserved"'
    #   at t/019_replslot_limit.pl line 125.
    #          got: '0/15000C0|reserved|192 bytes'
    #     expected: '0/15000C0|reserved|216 bytes'

This test case was *not* well thought out.

regards, tom lane
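To illustrate why a boolean status, as suggested earlier in the thread, would hide the 192-versus-216 difference, here is a rough standalone sketch in Python. It is not the fix that was committed and not the test's actual query. The names parse_lsn and slot_row and the 1024-byte limit are hypothetical, and the min_safe_lsn value 0/1500000 is inferred by subtracting the reported byte counts from the reported restart_lsn values.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN string such as '0/15000C0' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_row(wal_status: str, restart_lsn: str, min_safe_lsn: str,
             limit_bytes: int) -> str:
    """Render a row as 'status|t' or 'status|f' instead of 'status|N bytes'.

    Hypothetical helper: it mimics a query that returns a boolean condition
    on the remaining distance rather than the pretty-printed distance.
    """
    remain = parse_lsn(restart_lsn) - parse_lsn(min_safe_lsn)
    flag = "t" if 0 <= remain < limit_bytes else "f"
    return f"{wal_status}|{flag}"


LIMIT = 1024  # assumed threshold, chosen only for this illustration

# francolin reported 216 bytes, florican reported 192 bytes.
print(slot_row("reserved", "0/15000D8", "0/1500000", LIMIT))  # reserved|t
print(slot_row("reserved", "0/15000C0", "0/1500000", LIMIT))  # reserved|t
```

Both platforms would produce the same row under such a check, so the expected string would no longer depend on sizeof(CheckPoint). Whether a fixed threshold is the right condition for the real test is a separate question that this sketch does not answer.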