Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943
From | Alvaro Herrera |
---|---|
Subject | Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943 |
Date | |
Msg-id | 202406121953.gfdukghim5d2@alvherre.pgsql |
In response to | Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943 (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Responses |
Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943
Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943 |
List | pgsql-bugs |
On 2024-Jun-11, Alvaro Herrera wrote:

> ... and actually, the code that maps partitions when these arrays don't
> match is all wrong, because it assumes that they are in OID order, which
> is not true, so I'm busy rewriting it. More soon.

So I came up with the algorithm in the attached patch. As far as I can
tell, it works okay; I've been trying to produce a test that would
stress it some more, but I noticed a pgbench shortcoming that I've been
trying to solve, unsuccessfully.

Anyway, the idea here is that we match the partdesc->oids entries to the
pinfo->relid_map entries with a distance allowance -- that is, we search
for some OIDs a few elements ahead of the current position. This allows
us to skip some elements that do not match, without losing sync of the
correct position in the array. This works because 1) the arrays must be
in the same order, that is, bound order; and 2) the amount of elements
that might be missing is bounded by the difference in array lengths.

As I said, I've been hammering it with some modified pgbench scripts;
mainly I did this to set up:

drop table if exists p;
do $$
declare i int;
begin
  for i in 0..99 loop
    execute format('drop table if exists p%s', i);
  end loop;
end $$;
drop sequence if exists detaches;
create table p (a int, b int) partition by list (a);
set client_min_messages=warning;
do $$
declare i int;
declare modulus int;
begin
  for modulus in 0 .. 4 loop
    for i in 0..99 loop
      if i % 5 <> modulus then
        continue;
      end if;
      execute format('create table p%s partition of p for values in (%s)', i, i);
    end loop;
  end loop;
end $$;
reset client_min_messages;
create sequence detaches;

which ensures the partitions are not in OID order, and then used this
pgbench script

\set part random(0, 89)
select pg_try_advisory_lock(:part)::integer AS gotlock \gset
\if :gotlock
select pg_advisory_lock(142857);
alter table p detach partition p:part concurrently;
select pg_advisory_unlock(142857);
\set slp random(100, 200)
\sleep :slp us
alter table p attach partition p:part for values in (:part);
select pg_advisory_unlock(:part), nextval('detaches');
\endif

which detaches some partitions randomly, together with the other one

\set id random(0,99)
select * from p where a = :id;

script which reads from the partitioned table. This setup would fail
really quickly with the original code, and with the patched code it can
run a total of some 6700 detach/attach cycles in 60 seconds. This seems
quite slow, and in fact looking at the total number of partitions in
pg_inherits, we have either 99 or 100 partitions almost the whole time.
That's why I'm trying to modify pgbench ... I think the problem is that
pg_advisory_lock() holds a snapshot which causes a concurrent detach
partition to wait for it, or something like that. I added a \lock
command, but it doesn't seem to work the way I want it to.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I'm always right, but sometimes I'm more right than other times."
                                                  (Linus Torvalds)
https://lore.kernel.org/git/Pine.LNX.4.58.0504150753440.7211@ppc970.osdl.org/
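
[Editor's illustration, not part of the original message.] The matching with a distance allowance described in the message can be sketched roughly as follows. This is not the code from the attached patch; it is a standalone Python model written under stated assumptions. The function name match_with_allowance and its parameters are hypothetical: exec_oids stands in for partdesc->oids, plan_oids for pinfo->relid_map, and allowance for the number of positions we are willing to peek ahead. Both lists are assumed to be in the same (bound) order, as the message requires.

```python
def match_with_allowance(exec_oids, plan_oids, allowance):
    """For each OID in exec_oids, return its position in plan_oids, or -1.

    Hypothetical model: exec_oids plays the role of partdesc->oids and
    plan_oids the role of pinfo->relid_map. Both must be in the same
    (bound) order; neither needs to be sorted by OID.
    """
    result = []
    pd = 0  # cursor into plan_oids; only ever moves forward
    for oid in exec_oids:
        found = -1
        # Check the current position and up to `allowance` positions ahead.
        last = min(pd + allowance + 1, len(plan_oids))
        for k in range(pd, last):
            if plan_oids[k] == oid:
                found = k
                break
        if found >= 0:
            # Entries between pd and found were seen at plan time but are
            # gone now; skipping them keeps the two arrays in sync.
            result.append(found)
            pd = found + 1
        else:
            # Seen now but not at plan time: no mapping, cursor stays put.
            result.append(-1)
    return result


# The planner saw five partitions; one (OID 120) was detached afterwards.
plan_oids = [100, 120, 140, 101, 121]
exec_oids = [100, 140, 101, 121]
allowance = abs(len(plan_oids) - len(exec_oids))
print(match_with_allowance(exec_oids, plan_oids, allowance))
```

Taking the allowance from the difference in array lengths follows point 2) of the message. How the real patch chooses it, and how it handles a detach and an attach that leave both arrays the same length, is not shown here.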