Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum"
From | Heikki Linnakangas |
---|---|
Subject | Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum" |
Date | |
Msg-id | 19190f79-cf37-ff18-1b40-07a1a66a1d9e@iki.fi |
In reply to | Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum" (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum" |
List | pgsql-bugs |

On 23/06/2021 12:45, Thomas Munro wrote:
> On Wed, Jun 23, 2021 at 7:46 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Let's just add the lock there.
>
> +1, no doubt about that.

Committed that. Thanks for the report, Alexander!

>> ... What about the new kid on the block:
>> Persistent Memory? I found this article:
>> https://lwn.net/Articles/686150/. So at hardware level, Persistent
>> Memory only guarantees atomicity at cache line level (64 bytes). To
>> provide the traditional 512 byte sector atomicity, there's a feature in
>> Linux called BTT. Perhaps we should add a note to the docs that you
>> should enable that.
>
> Right, also called sector mode. I don't know enough about that to
> comment really, but... if my google-fu is serving me, you can't
> actually use interesting sector sizes like 8KB (you have to choose 512
> or 4096 bytes), so you'll have to pay for *two* synthetic atomic page
> schemes: BTT and our full page writes. That makes me wonder... if you
> need to leave full page writes on anyway, maybe it would be a better
> trade-off to do double writes of our special atomic files (relmapper
> files and control file) so that we could safely turn BTT off and avoid
> double-taxation for relation data. Just a thought. No pmem
> experience here, I could be way off.

Yeah, you wouldn't want to turn on BTT for anything other than the pg_control file. That's the only place where we rely on sector atomicity, I believe. For everything else, it just adds overhead. Not sure how much overhead; maybe it doesn't matter in practice.

>> We haven't heard of broken control files from the field, so that doesn't
>> seem to be a problem in practice, at least not yet. Still, I would sleep
>> better if the control file had more redundancy. For example, have two
>> copies of it on disk. At startup, read both copies, and if they're both
>> valid, ignore the one with older timestamp. When updating it, write over
>> the older copy. That way, if you crash in the middle of updating it, the
>> old copy is still intact.
>
> +1, with a flush in between so that only one can be borked no matter
> how the storage works. It is interesting how few reports there are on
> the mailing list of a control file CRC check failures though, if I'm
> searching for the right thing[1].
>
> [1] https://www.postgresql.org/search/?m=1&q=calculated+CRC+checksum+does+not+match+value+stored+in+file&l=&d=-1&s=r

If anyone wants to write a patch for that, I'd be happy to review it. And if anyone has access to a system with pmem hardware, it would be interesting to try to reproduce a torn sector and broken control file by pulling the power plug.

- Heikki
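
The two-copy control file idea discussed in the message can be illustrated with a small sketch. This is not PostgreSQL code and not a proposed patch; it is a minimal Python illustration under several assumptions: the record layout, the function names (`encode`, `decode`, `load`, `store`), and the two file paths are all hypothetical, and a generation counter stands in for the timestamp mentioned in the message. Each update overwrites whichever copy is not the newest valid one and calls `fsync` before returning, so a crash in the middle of an update can damage at most one copy.

```python
import os
import struct
import zlib

# Hypothetical on-disk layout: generation counter, payload length, payload, CRC32.
HEADER = struct.Struct("<QI")
CRC = struct.Struct("<I")


def encode(generation: int, payload: bytes) -> bytes:
    """Serialize one copy of the control data with a trailing CRC32."""
    body = HEADER.pack(generation, len(payload)) + payload
    return body + CRC.pack(zlib.crc32(body))


def decode(raw: bytes):
    """Return (generation, payload), or None if the copy is torn or corrupt."""
    if len(raw) < HEADER.size + CRC.size:
        return None
    generation, length = HEADER.unpack_from(raw)
    end = HEADER.size + length
    if len(raw) < end + CRC.size:
        return None
    (stored_crc,) = CRC.unpack_from(raw, end)
    if zlib.crc32(raw[:end]) != stored_crc:
        return None
    return generation, raw[HEADER.size:end]


def load(paths):
    """Read both copies; return (slot, generation, payload) of the newest valid one."""
    best = None
    for slot, path in enumerate(paths):
        try:
            with open(path, "rb") as f:
                record = decode(f.read())
        except FileNotFoundError:
            record = None
        if record is not None and (best is None or record[0] > best[1]):
            best = (slot, record[0], record[1])
    return best


def store(paths, payload: bytes) -> None:
    """Overwrite the older (or invalid) copy, leaving the newest valid copy untouched."""
    current = load(paths)
    if current is None:
        target, generation = 0, 1
    else:
        target, generation = 1 - current[0], current[1] + 1
    with open(paths[target], "wb") as f:
        f.write(encode(generation, payload))
        f.flush()
        os.fsync(f.fileno())  # make this copy durable before any later update
```

A caller would pass two paths, for example `["control.0", "control.1"]` (made-up names), call `load` at startup, and call `store` for each update. If the most recently written copy fails its CRC check, `load` falls back to the other copy, which is the behavior the message describes. A real implementation would also have to handle directory fsync, in-place overwrites of fixed-size files, and platform-specific flush semantics, none of which this sketch attempts.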