"SMgrRelation hashtable corrupted" failure identified
From | Tom Lane |
---|---|
Subject | "SMgrRelation hashtable corrupted" failure identified |
Date | |
Msg-id | 13484.1105373398@sss.pgh.pa.us |
Responses | Re: "SMgrRelation hashtable corrupted" failure identified |
List | pgsql-hackers |
We've seen a few reports of the above-mentioned error message from PG 8.0 testers, but up till now no one had come up with a reproducible test case. I've now found a trivial example:

	session 1:	create table a1 (f1 varchar(128));
	session 2:	insert into a1 values('abc');
	session 1:	alter table a1 alter column f1 type varchar(256);
	session 2:	insert into a1 values('abcd');

session 2 fails with

	ERROR:  SMgrRelation hashtable corrupted

continued use of session 2 leads to a crash

Many if not all scenarios involving a rewriting ALTER TABLE on a table in active use by other backends will fail like this. I believe there are probably similar failures involving CLUSTER, though a quick try didn't show it. This seems clearly to be a "must fix for 8.0" bug.

The basic problem is that when ALTER TABLE tries to swap the physical files associated with the original table and the temp version of the table, it sends out relcache inval events for all four combinations of table OID and relfilenode. Because inval.c is a bit cavalier about the ordering of inval events, the one that session 2 sees first is the one for <temp table OID, old relfilenode>. It does not find a relcache entry for the temp table OID, but it does find an smgr table entry for the relfilenode, which it proceeds to drop. Now there is a dangling smgr reference in its relcache, so when it next gets hit with a relcache clear event for the original table OID, boom!

I fooled around with trying to patch this by enforcing the "right" processing order of inval events, but that doesn't work (it just moves the failure into the sending backend, which it turns out would need a different processing order to avoid crashing). It would be a horribly fragile solution anyway.

I now think that the only reasonable fix is to directly attack the problem of dangling relcache references to smgr table entries.
What we can do is add a concept of an "owning pointer" to an smgr entry, that is an "SMgrRelation *myowner" field, and have smgrclose do something like

	if (reln->myowner)
		*(reln->myowner) = NULL;

For smgr table entries associated with a relcache entry, the relcache code would set this field as a back link to its rel->rd_smgr pointer. With this setup, an smgr-level clear would correctly unhook from the relcache even if the clear did not come directly through the relcache. This would simplify RelationCacheInvalidateEntry and LocalExecuteInvalidationMessage, which could then treat relcache clear and smgr clear as independent operations.

Comments?

			regards, tom lane
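
The owning-pointer idea above can be sketched in a few lines of standalone C. This is a toy model under stated assumptions, not PostgreSQL source: the struct layouts are cut down to the fields the message mentions (myowner, rd_smgr), the integer relfilenode key is a simplification, and the toy_* function names are hypothetical helpers invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the real structures; only the fields discussed above. */
typedef struct SMgrRelationData *SMgrRelation;

struct SMgrRelationData
{
	unsigned int  relfilenode;	/* simplified key (assumption) */
	SMgrRelation *myowner;		/* proposed back link, NULL if unowned */
};

typedef struct RelationData
{
	unsigned int  relid;		/* toy table OID */
	SMgrRelation  rd_smgr;		/* cached smgr entry, or NULL */
} RelationData;

/* Hypothetical helper: create an smgr entry that nobody owns yet. */
SMgrRelation
toy_smgropen(unsigned int relfilenode)
{
	SMgrRelation reln = malloc(sizeof(*reln));

	if (reln == NULL)
		abort();
	reln->relfilenode = relfilenode;
	reln->myowner = NULL;
	return reln;
}

/* Hypothetical helper: make *owner the owning pointer for reln. */
void
toy_smgrsetowner(SMgrRelation *owner, SMgrRelation reln)
{
	assert(owner != NULL && reln != NULL);

	/* If some other pointer owned the entry before, unhook it first. */
	if (reln->myowner)
		*(reln->myowner) = NULL;

	reln->myowner = owner;
	*owner = reln;
}

/*
 * Close an smgr entry.  The owner's pointer is reset before the entry
 * is freed, so an smgr-level clear cannot leave a dangling reference
 * behind in the owner.
 */
void
toy_smgrclose(SMgrRelation reln)
{
	if (reln->myowner)
		*(reln->myowner) = NULL;
	free(reln);
}
```

In this model, a caller that does toy_smgrsetowner(&rel.rd_smgr, reln) and later sees the entry closed by some path that never looks at rel will find rel.rd_smgr set to NULL rather than pointing at freed memory, which is the property the proposal relies on.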