Re: [PATCHES] [WIP] shared locks
From | Tom Lane |
---|---|
Subject | Re: [PATCHES] [WIP] shared locks |
Date | |
Msg-id | 3122.1114643140@sss.pgh.pa.us |
List | pgsql-hackers |
Found another interesting thing while testing this. I got a core dump from the Assert in GetMultiXactIdMembers, complaining that it was being asked about a MultiXactId >= nextMXact. Sure enough, there was a multixact on disk, left over from a previous core-dumped test, that was larger than the nextMXact the current postmaster had started with.

My interpretation of this is that the MultiXact code is violating the fundamental WAL rule, namely it is allowing data (multixact IDs in data pages) to reach disk before the relevant WAL record (here the NEXTMULTI record that should have advanced nextMXact) got to disk. It is very easy for this to happen in the current system if the buffer page LSNs aren't updated properly, because the bgwriter will be industriously dumping dirty pages in the background.

AFAICS there isn't any very convenient way of propagating the true location of the NEXTMULTI record into the page LSNs of the buffers that heap_lock_tuple might stick relevant multi IDs into. What's probably the easiest solution is for XLogPutNextMultiXactId to XLogFlush the NEXTMULTI record before it returns. This is a mite annoying for concurrency (because we'll have to hold MultiXactGenLock while flushing xlog) but it should occur rarely enough to not be a huge deal.

At this point you're probably wondering why OID generation hasn't got exactly the same problem, seeing that you borrowed all this logic from the OID generator. The answer is that it would have the same problem, except that an OID can only get onto disk as part of a tuple insert or update, and all such events generate xlog records that must follow any relevant NEXTOID record. Those records *will* get into the page LSNs, and so the WAL rule is enforced. So the problem would go away if heap_lock_tuple were generating any xlog record of its own, which it might be doing by the time the 2PC dust settles.
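To make the failure mode concrete, here is a toy model in C. It is only a sketch and not PostgreSQL code: every name in it (`ToyIdGen`, `toy_alloc`, `TOY_PREFETCH`, and so on) is hypothetical, and it assumes the generator logs a prefetched upper bound for the IDs it will hand out, in the style of the OID generator. It shows why an ID handed out under an unflushed "next ID" record can look like it comes from the future after a crash, and why flushing before returning avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef uint32_t ToyMultiId;

typedef struct
{
    ToyMultiId next_id;       /* next ID to hand out (in memory only) */
    ToyMultiId logged_limit;  /* IDs below this are covered by a WAL record,
                               * which may still sit in the WAL buffer */
    ToyMultiId durable_limit; /* IDs below this are covered by a WAL record
                               * that has actually reached disk */
} ToyIdGen;

#define TOY_PREFETCH 8        /* assumed batch size per "next ID" record */

/*
 * Hand out one ID. When the logged range is used up, write a new
 * "next ID" record. If flush_on_log is true, the record is also flushed
 * before we return, which models the proposed XLogFlush fix.
 */
static ToyMultiId
toy_alloc(ToyIdGen *g, bool flush_on_log)
{
    if (g->next_id >= g->logged_limit)
    {
        g->logged_limit = g->next_id + TOY_PREFETCH;
        if (flush_on_log)
            g->durable_limit = g->logged_limit;
    }
    /* An ID is never handed out beyond the logged range. */
    assert(g->next_id < g->logged_limit);
    return g->next_id++;
}

/* After a crash, the generator restarts from the last durable limit. */
static ToyMultiId
toy_restart_next_id(const ToyIdGen *g)
{
    return g->durable_limit;
}

/*
 * True if an ID that already reached a data page on disk would be
 * >= the restarted generator's next ID, i.e. the case the Assert hit.
 */
static bool
toy_id_from_future(const ToyIdGen *g, ToyMultiId on_disk)
{
    return on_disk >= toy_restart_next_id(g);
}
```

Starting from a generator initialized as `{1, 1, 1}`, calling `toy_alloc(&g, false)` and then checking `toy_id_from_future` on the result reports the problem, because the data page can reach disk while the covering record has not. With `toy_alloc(&g, true)` the same check comes back false, at the cost of a flush inside the allocation path.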
Plan B would be to decide that a multi ID that's >= nextMXact isn't worthy of an Assert failure, but ought to be treated as just a dead multixact. I'm kind of inclined to do that anyway, because I am not convinced that this code guarantees no wraparound of multi IDs.

Thoughts?

regards, tom lane
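As a rough illustration of Plan B, the check could look something like the following C sketch. The names (`ToyMultiStatus`, `toy_classify_multi`) are hypothetical and this is not the actual GetMultiXactIdMembers code. It also uses a plain `>=` comparison, so it deliberately says nothing about how wraparound of multi IDs should be handled.

```c
#include <stdint.h>

/* Toy sketch only; none of these names exist in PostgreSQL. */
typedef uint32_t ToyMultiId;

typedef enum
{
    TOY_MULTI_LOOKUP, /* in range: go look up the members as usual */
    TOY_MULTI_DEAD    /* out of range: treat as having no live members */
} ToyMultiStatus;

/*
 * Instead of asserting that multi < next_multi, classify an
 * out-of-range ID as dead so the caller can carry on.
 * Note: plain comparison, no wraparound handling.
 */
static ToyMultiStatus
toy_classify_multi(ToyMultiId multi, ToyMultiId next_multi)
{
    if (multi >= next_multi)
        return TOY_MULTI_DEAD;
    return TOY_MULTI_LOOKUP;
}
```

With `next_multi` equal to 10, an on-disk ID of 12 would be classified as `TOY_MULTI_DEAD` rather than tripping an assertion, while an ID of 5 would go through the normal lookup.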