Re: Disaster!
From | Tom Lane |
---|---|
Subject | Re: Disaster! |
Date | |
Msg-id | 4221.1074892864@sss.pgh.pa.us |
In response to | Re: Disaster! (Martín Marqués <martin@bugs.unl.edu.ar>) |
Responses | Re: Disaster! |
List | pgsql-hackers |
Martín Marqués <martin@bugs.unl.edu.ar> writes:
> Tom, could you give a small insight on what occurred here, why those
> 8k of zeros fixed it, and what is a "WAL replay"?

I think what happened is that there was insufficient space to write out a new page of the clog (transaction commit) file. This would result in a database panic, which is fine --- you're not gonna get much done anyway if you are down to zero free disk space. However, after Chris freed up space, the system needed to replay the WAL from the last checkpoint to ensure consistency. The WAL entries evidently included references to transactions whose commit bits were in the unwritten page. Now there would also be WAL entries recording those commits, so once the replay was complete everything would be cool. But the clog access code evidently got confused by being asked to read a page that didn't exist in the file. I'm not sure yet how that sequence of events occurred, which is why I asked Chris for a stack trace.

Adding a page of zeroes fixed it by eliminating the read error condition. It was okay to do so because zeroes is the correct initial state for a clog page (all transactions in it "still in progress"). After WAL replay, any completed transactions would be updated in the page.

regards, tom lane
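
As an illustration of why an all-zero page is a safe starting point, here is a minimal Python sketch of a clog-like page. It is a toy model, not PostgreSQL's code. The layout it assumes (two status bits per transaction packed into an 8 kB page, with bit pattern 0 meaning "in progress") reflects the usual description of the clog format rather than anything stated in the message above, and the names `get_status`, `set_status`, and `replay_commits` are hypothetical.

```python
# Toy model of a commit-log (clog) page. Illustrative only; the constants
# below are assumptions about the on-disk layout, not taken from the email.
BLCKSZ = 8192                          # page size in bytes (the "8k")
BITS_PER_XACT = 2                      # assumed: two status bits per transaction
XACTS_PER_BYTE = 8 // BITS_PER_XACT    # 4 transactions per byte
XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE

IN_PROGRESS = 0b00                     # all-zero bits: "still in progress"
COMMITTED = 0b01
ABORTED = 0b10


def get_status(page: bytearray, slot: int) -> int:
    """Return the 2-bit status stored for transaction slot `slot`."""
    byte_index, pos = divmod(slot, XACTS_PER_BYTE)
    shift = pos * BITS_PER_XACT
    return (page[byte_index] >> shift) & 0b11


def set_status(page: bytearray, slot: int, status: int) -> None:
    """Overwrite the 2-bit status for transaction slot `slot`."""
    byte_index, pos = divmod(slot, XACTS_PER_BYTE)
    shift = pos * BITS_PER_XACT
    cleared = page[byte_index] & ~(0b11 << shift) & 0xFF
    page[byte_index] = cleared | (status << shift)


def replay_commits(page: bytearray, committed_slots) -> None:
    """Stand-in for WAL replay: re-apply each logged commit to the page."""
    for slot in committed_slots:
        set_status(page, slot, COMMITTED)


# A freshly zeroed page: every transaction in it reads as "in progress".
page = bytearray(BLCKSZ)
assert all(get_status(page, s) == IN_PROGRESS for s in range(XACTS_PER_PAGE))

# Replaying the logged commits then fills in the real outcomes.
replay_commits(page, [0, 5, XACTS_PER_PAGE - 1])
```

In this model the zeroed page carries no wrong information: it only says "not finished yet" for every slot, and the replay step overwrites the slots whose commits were logged.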