Re: [HACKERS] md.c is feeling much better now, thank you
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] md.c is feeling much better now, thank you |
Дата | |
Msg-id | 12033.936337806@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | RE: [HACKERS] md.c is feeling much better now, thank you ("Hiroshi Inoue" <Inoue@tpf.co.jp>) |
Список | pgsql-hackers |
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> Doesn't work though: the ALTER TABLE tests in regress/misc fail. >> Apparently, this change causes the sinval report from update of the >> relation's pg_class heap entry to be read while the relation has refcnt>0, >> so RelationFlushRelation doesn't flush it, so we have an obsolete >> relation cache entry. Ooops. > How about inserting "RelationDecrementReferenceCount(relation);" > between LockAcquire() and DiscardInvalid() ? Would only help if the relation had been opened exactly once before the lock; not if its refcnt is greater than 1. Worse, it would only help for the particular relation being locked, but we might receive an sinval report for a different already-locked relation. > And isn't it preferable that LockRelation() returns the relation > which RelationIdGetRelation(tag.relId) returns ? No, because that would only inform the immediate caller of LockRelation of a change. This is insufficient for both the reasons mentioned above. For that matter, my first-cut patch is insufficient, because it won't detect the case that a relcache entry other than the one currently being locked has been flushed. I think what we need to do is revise RelationFlushRelation so that it (a) deletes the relcache entry if its refcnt is zero; otherwise (b) leaves the relcache entry in existence, but recomputes all its contents and subsidiary data structures, and (c) if, while trying to recompute the contents, it finds that the relation has actually been deleted, then it's elog(ERROR) time. In this way, existing pointers to the relcache entry --- which might belong to routines very far down the call stack --- remain valid, or else we elog() if they aren't. We might still have a few bugs with routines that copy data out of the relcache entry before locking it, but I don't recall having seen any code like that. Most of the code seems to do heap_open immediately followed by LockRelation, and that should be safe. I'd like to hear Vadim's opinion before wading in, but this seems like it ought to be workable. regards, tom lane
В списке pgsql-hackers по дате отправления: