Обсуждение: Why is LockClassinfoForUpdate()'s mark4update a good idea?
Why does LockClassinfoForUpdate() insist on doing heap_mark4update? As far as I can see, this accomplishes nothing except to break concurrent index builds. If I do create index tenk1_s1 on tenk1(stringu1);create index tenk1_s2 on tenk1(stringu2); in two psqls at approximately the same time, the second one fails with ERROR: LockStatsForUpdate couldn't lock relid 274157 which is entirely unnecessary. I don't believe that the similar code in AlterTableDropColumn() and AlterTableCreateToastTable() is a good idea either. We do not depend on "SELECT FOR UPDATE" on pg_class tuples for interlocking changes to relations; we use exclusive locks on the relations themselves for that. mark4update is unnecessary in this context. Comments? regards, tom lane
Tom Lane wrote: > > Why does LockClassinfoForUpdate() insist on doing heap_mark4update? Because I want to guard the target pg_class tuple by myself. I don't think we could rely on the assumption that the lock on the corresponding relation is held. For example, AlterTableOwner() doesn't seem to open the corresponding relation. > As far as I can see, this accomplishes nothing except to break > concurrent index builds. If I do > > create index tenk1_s1 on tenk1(stringu1); > create index tenk1_s2 on tenk1(stringu2); > > in two psqls at approximately the same time, the second one fails with > > ERROR: LockStatsForUpdate couldn't lock relid 274157 > This is my fault. The error could be avoided by retrying to acquire the lock like "SELECT FOR UPDATE" does. Regards. Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Tom Lane wrote: >> Why does LockClassinfoForUpdate() insist on doing heap_mark4update? > Because I want to guard the target pg_class tuple by myself. > I don't think we could rely on the assumption that the lock on > the corresponding relation is held. For example, AlterTableOwner() > doesn't seem to open the corresponding relation. Possibly AlterTableOwner is broken. Not sure that it matters though, because heap_update won't update a tuple anyway if another process committed an update first. That seems to me to be sufficient locking; exactly what is the mark4update adding? (BTW, I notice that a lot of heap_update calls don't bother to check the result code, which is probably a bug ...) >> As far as I can see, this accomplishes nothing except to break >> concurrent index builds. If I do >> >> create index tenk1_s1 on tenk1(stringu1); >> create index tenk1_s2 on tenk1(stringu2); >> >> in two psqls at approximately the same time, the second one fails with >> >> ERROR: LockStatsForUpdate couldn't lock relid 274157 > This is my fault. The error could be avoided by retrying > to acquire the lock like "SELECT FOR UPDATE" does. I have a more fundamental objection, which is that if you think that this is necessary for index creation then it is logically necessary for *all* types of updates to system catalog tuples. I do not like that answer, mainly because it will clutter the system considerably --- to no purpose. The relation-level locks are necessary anyway for schema updates, and they are sufficient if consistently applied. Pre-locking the target tuple is *not* sufficient, and I don't think it helps anyway if not consistently applied, which it certainly is not at the moment. regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > Tom Lane wrote: > >> Why does LockClassinfoForUpdate() insist on doing heap_mark4update? > > > Because I want to guard the target pg_class tuple by myself. > > I don't think we could rely on the assumption that the lock on > > the corresponding relation is held. For example, AlterTableOwner() > > doesn't seem to open the corresponding relation. > > Possibly AlterTableOwner is broken. Not sure that it matters though, > because heap_update won't update a tuple anyway if another process > committed an update first. That seems to me to be sufficient locking; > exactly what is the mark4update adding? > I like neither unexpected errors nor doing the wrong thing by handling tuples which aren't guaranteed to be up-to-date. After mark4update, the tuple is guaranteed to be up-to-date and heap_update won't fail even though some commands etc neglect to lock the correspoding relation. Isn't it proper to guard myself as much as possible ? > (BTW, I notice that a lot of heap_update calls don't bother to check > the result code, which is probably a bug ...) > > >> As far as I can see, this accomplishes nothing except to break > >> concurrent index builds. If I do > >> > >> create index tenk1_s1 on tenk1(stringu1); > >> create index tenk1_s2 on tenk1(stringu2); > >> > >> in two psqls at approximately the same time, the second one fails with > >> > >> ERROR: LockStatsForUpdate couldn't lock relid 274157 > > > This is my fault. The error could be avoided by retrying > > to acquire the lock like "SELECT FOR UPDATE" does. > > I have a more fundamental objection, which is that if you think that > this is necessary for index creation then it is logically necessary for > *all* types of updates to system catalog tuples. I do not like that > answer, mainly because it will clutter the system considerably --- > to no purpose. The relation-level locks are necessary anyway for schema > updates, and they are sufficient if consistently applied. Pre-locking > the target tuple is *not* sufficient, and I don't think it helps anyway > if not consistently applied, which it certainly is not at the moment. > > regards, tom lane
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > I like neither unexpected errors nor doing the wrong > thing by handling tuples which aren't guaranteed to > be up-to-date. After mark4update, the tuple is > guaranteed to be up-to-date and heap_update won't > fail even though some commands etc neglect to lock > the correspoding relation. Isn't it proper to guard > myself as much as possible ? If one piece of the system "guards itself" and others do not, what have you gained? Not much. What I want is a consistently applied coding rule that protects all commands; and the simpler that coding rule is, the more likely it is to be consistently applied. I do not think that adding mark4update improves matters when seen in this light. The code to do it is bulky and error-prone, and I have no confidence that it will be done right everywhere. In fact, at the moment I'm not convinced that it's done right anywhere. The uses of mark4update for system-catalog updates are all demonstrably broken right now, and the ones in the executor make use of a hugely complex and probably buggy qualification re-evaluation mechanism. What is the equivalent of qual re-evaluation for a system catalog tuple, anyway? regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > I like neither unexpected errors nor doing the wrong > > thing by handling tuples which aren't guaranteed to > > be up-to-date. After mark4update, the tuple is > > guaranteed to be up-to-date and heap_update won't > > fail even though some commands etc neglect to lock > > the correspoding relation. Isn't it proper to guard > > myself as much as possible ? > > If one piece of the system "guards itself" and others do not, what have > you gained? Not much. ??? The system guarding itself won't gain bad result at least. If one piece of system "guards others" and others do not, both may gain bad results. Locking a class info by locking the corrsponding relation is such a mechanism. However I don't think we could introduce this mechanism to all system catalogs. I implemented LockClassinfoForUpdate() by the following reason. 1) pg_class is the most significant relation. 2) LockClassinfoForUpdate() adds few new conflicts by locking the pg_class tuple because locking the corresponding relationlocks the pg_class entity implicitly unless some stuff neglects to lock corresponding relation. Regards. Hiroshi Inoue