Обсуждение: Re: Assertion failure in smgr.c when using pg_prewarm with partitioned tables
On Thu, May 15, 2025 at 2:22 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote: > > Hi, > > I encountered an assertion failure when a partitioned table is specified > as an argument to pg_prewarm. Below are the steps to reproduce the > issue: > > $ pgbench -i -s 1 --partitions=3 > $ psql <<EOF > CREATE EXTENSION pg_prewarm; > SELECT pg_prewarm('pgbench_accounts'); > EOF > > The following assertion failure occurs: > > TRAP: failed Assert("RelFileNumberIsValid(rlocator.relNumber)"), File: > "smgr.c", Line: 246, PID: 1246282 > postgres: ikeda postgres [local] > SELECT(ExceptionalCondition+0xbb)[0x55edd16725c1] > postgres: ikeda postgres [local] SELECT(smgropen+0x5e)[0x55edd145c1ff] > > > It looks like this may have been overlooked in commit 049ef33. > What do you think? Yeah, this should be fixed, don't you think that instead of checking the relnumber is valid, we shall check whether it's a partitioned rel and give a separate error that prewarm is not supported for partitioned tables? And in fact, we can think of supporting this for the partitioned tables as well in the future, where we can loop through all the partitions and do prewarm for each of them. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On 2025/05/15 18:20, Dilip Kumar wrote: > On Thu, May 15, 2025 at 2:22 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote: >> >> Hi, >> >> I encountered an assertion failure when a partitioned table is specified >> as an argument to pg_prewarm. Below are the steps to reproduce the >> issue: Thanks for the report! This assertion failure can also occur when pg_prewarm() is run on objects like foreign tables, plain views, or other relations that don't have storage - not just partitioned tables. >> $ pgbench -i -s 1 --partitions=3 >> $ psql <<EOF >> CREATE EXTENSION pg_prewarm; >> SELECT pg_prewarm('pgbench_accounts'); >> EOF >> >> The following assertion failure occurs: >> >> TRAP: failed Assert("RelFileNumberIsValid(rlocator.relNumber)"), File: >> "smgr.c", Line: 246, PID: 1246282 >> postgres: ikeda postgres [local] >> SELECT(ExceptionalCondition+0xbb)[0x55edd16725c1] >> postgres: ikeda postgres [local] SELECT(smgropen+0x5e)[0x55edd145c1ff] >> >> >> It looks like this may have been overlooked in commit 049ef33. >> What do you think? > > Yeah, this should be fixed, don't you think that instead of checking > the relnumber is valid, +1 How about adding a check to see whether the target relation has storage, using something like RELKIND_HAS_STORAGE()? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Thu, May 15, 2025 at 3:14 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > > On 2025/05/15 18:20, Dilip Kumar wrote: > > On Thu, May 15, 2025 at 2:22 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote: > >> > >> Hi, > >> > >> I encountered an assertion failure when a partitioned table is specified > >> as an argument to pg_prewarm. Below are the steps to reproduce the > >> issue: > > Thanks for the report! > > This assertion failure can also occur when pg_prewarm() is run on objects like > foreign tables, plain views, or other relations that don't have storage - not just > partitioned tables. > > > >> $ pgbench -i -s 1 --partitions=3 > >> $ psql <<EOF > >> CREATE EXTENSION pg_prewarm; > >> SELECT pg_prewarm('pgbench_accounts'); > >> EOF > >> > >> The following assertion failure occurs: > >> > >> TRAP: failed Assert("RelFileNumberIsValid(rlocator.relNumber)"), File: > >> "smgr.c", Line: 246, PID: 1246282 > >> postgres: ikeda postgres [local] > >> SELECT(ExceptionalCondition+0xbb)[0x55edd16725c1] > >> postgres: ikeda postgres [local] SELECT(smgropen+0x5e)[0x55edd145c1ff] > >> > >> > >> It looks like this may have been overlooked in commit 049ef33. > >> What do you think? > > > > Yeah, this should be fixed, don't you think that instead of checking > > the relnumber is valid, > > +1 > > How about adding a check to see whether the target relation has storage, > using something like RELKIND_HAS_STORAGE()? Yeah, that makes more sense. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Thu, May 15, 2025 at 6:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Thu, May 15, 2025 at 3:14 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > How about adding a check to see whether the target relation has storage, > > using something like RELKIND_HAS_STORAGE()? > Yeah, that makes more sense. +1. FWIW, not long ago we fixed a similar Assert failure in contrib/pg_freespacemap by verifying RELKIND_HAS_STORAGE() before trying to access the storage (see 4623d7144). Wondering if there are other similar issues elsewhere in contrib ... Thanks Richard
On Fri, May 16, 2025 at 11:51 AM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote: > > Thank you for your comments! > > I updated the patch to use RELKIND_HAS_STORAGE() as done in > commit 4623d7144. Please see the v2-0001 patch for the changes. > > On 2025-05-15 19:57, Richard Guo wrote: > > +1. FWIW, not long ago we fixed a similar Assert failure in > > contrib/pg_freespacemap by verifying RELKIND_HAS_STORAGE() before > > trying to access the storage (see 4623d7144). Wondering if there are > > other similar issues elsewhere in contrib ... > > I tested all contrib functions that take regclass arguments (see > attached test.sql and test_result.txt). The result shows that only > pg_prewarm() can lead to the assertion failure. Right, even while I was searching, I see this is used in 3 contrib modules, amcheck, autoprewarm, and pg_prewarm. amcheck is already checking for AM type, and Autoprewarm is identifying the relation by block, so there is no chance of getting the relation which do not have storage. > > However, I found that amcheck's error messages can be misleading > when run on partitioned indexes. > > For example, on the master branch, amcheck (e.g., bt_index_check > function) > shows this error: > > > ERROR: expected "btree" index as targets for verification > > DETAIL: Relation "pgbench_accounts_pkey" is a btree index. > > This message says it expects a "btree" index, yet also states the > relation > is a "btree" index, which can seem contradictory. The actual issue is > that > the index is a btree partitioned index, but this detail is missing, > causing > possible confusion. Yes, I found the error message confusing during testing as well, so it makes sense to improve it. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On 2025/05/16 15:33, Dilip Kumar wrote: > On Fri, May 16, 2025 at 11:51 AM Masahiro Ikeda > <ikedamsh@oss.nttdata.com> wrote: >> >> Thank you for your comments! >> >> I updated the patch to use RELKIND_HAS_STORAGE() as done in >> commit 4623d7144. Please see the v2-0001 patch for the changes. Thanks for updating the patch! You added a test for pg_prewarm() with a partitioned table in the TAP tests. But, wouldn't it be simpler and easier to maintain if we added this as a regular regression test (i.e., using .sql and .out files) instead of TAP tests? That should be sufficient for this case? Also, since the issue was introduced in v17, this patch should be back-patched to v17, right? >> On 2025-05-15 19:57, Richard Guo wrote: >>> +1. FWIW, not long ago we fixed a similar Assert failure in >>> contrib/pg_freespacemap by verifying RELKIND_HAS_STORAGE() before >>> trying to access the storage (see 4623d7144). Wondering if there are >>> other similar issues elsewhere in contrib ... >> >> I tested all contrib functions that take regclass arguments (see >> attached test.sql and test_result.txt). The result shows that only >> pg_prewarm() can lead to the assertion failure. > > Right, even while I was searching, I see this is used in 3 contrib > modules, amcheck, autoprewarm, and pg_prewarm. amcheck is already > checking for AM type, and Autoprewarm is identifying the relation by > block, so there is no chance of getting the relation which do not have > storage. > >> >> However, I found that amcheck's error messages can be misleading >> when run on partitioned indexes. >> >> For example, on the master branch, amcheck (e.g., bt_index_check >> function) >> shows this error: >> >> > ERROR: expected "btree" index as targets for verification >> > DETAIL: Relation "pgbench_accounts_pkey" is a btree index. >> >> This message says it expects a "btree" index, yet also states the >> relation >> is a "btree" index, which can seem contradictory. The actual issue is >> that >> the index is a btree partitioned index, but this detail is missing, >> causing >> possible confusion. > > Yes, I found the error message confusing during testing as well, so it > makes sense to improve it. +1 Regarding the patch, how about also adding a regression test for amcheck with a partitioned index? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Mon, May 19, 2025 at 5:18 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote: > > Thanks for your work and feedback! > > I've updated the patches and added regular regression tests for > both pg_prewarm and amcheck. Thanks for updating the patches! Regarding the 0001 patch: +CREATE TABLE test_part1 PARTITION OF test FOR VALUES FROM (1) TO (1000); +INSERT INTO test SELECT generate_series(1, 100); These lines don't seem necessary for the test. How about removing them? It would simplify the test and slightly reduce the execution time - though the time savings are very minimal. +-- To specify the relation which does not have storage should fail. This comment could be clearer as: "pg_prewarm() should fail if the target relation has no storage." + /* Check that the storage exists. */ Maybe rephrase to: "Check that the relation has storage."? Regarding the 0002 patch: - errdetail("Relation \"%s\" is a %s index.", - RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname)))); + errdetail("Relation \"%s\" is a %s %sindex.", + RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname), + (rel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX) ? "partitioned " : ""))); Instead of manually building the message, how about using errdetail_relkind_not_supported()? It would be more consistent with similar error reporting elsewhere. > I agree with back-patching v3-0001. I was able to reproduce the issue > on the REL_17_STABLE branch. Thanks for the confirmation. > One concern is that this patch changes > the error message in production: > > * v17.5 (without --enable-cassert) > > ERROR: fork "main" does not exist for this relation > > * REL_17_STABLE with the v3-0001 patch (without --enable-cassert) > > ERROR: relation "test" does not have storage > > However, I think preventing the assertion failure should take priority. Yes. Regards, -- Fujii Masao