Re: WALInsertLock tuning
| From | Simon Riggs |
|---|---|
| Subject | Re: WALInsertLock tuning |
| Date | |
| Msg-id | BANLkTik3a60e6cE+1B5UU6gLXotL8-7d+w@mail.gmail.com |
| In reply to | Re: WALInsertLock tuning (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-hackers |
On Tue, Jun 7, 2011 at 4:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> On 07.06.2011 10:55, Simon Riggs wrote:
>>> How would that help?
>
>> It doesn't matter whether the pages are zeroed while they sit in memory.
>> And if you write a full page of WAL data, any wasted bytes at the end of
>> the page don't matter, because they're ignored at replay anyway. The
>> possibility of mistaking random garbage for valid WAL only occurs when
>> we write a partial WAL page to disk. So, it is enough to zero the
>> remainder of the partial WAL page (or just the next few words) when we
>> write it out.
>
>> That's a lot cheaper than fully zeroing every page. (except for the fact
>> that you'd need to hold WALInsertLock while you do it)
>
> I think avoiding the need to hold both locks at once is probably exactly
> why the zeroing was done where it is.
>
> An interesting alternative is to have XLogInsert itself just plop down a
> few more zeroes immediately after the record it's inserted, before it
> releases WALInsertLock. This will be redundant work once the next
> record gets added, but it's cheap enough to not matter IMO. As was
> mentioned upthread, zeroing out the bytes that will eventually hold the
> next record's xl_prev field ought to be enough to maintain a guarantee
> that we won't believe the next record is valid.

Let's see what the overheads are with a continuous stream of short WAL
records, say xl_heap_delete records.

The xl header is 32 bytes and xl_heap_delete is 24 bytes, so there would
be ~145 records per page. A 12-byte zeroing overhead per record gives
1740 zero bytes written per page in total. In the worst case that is
less than 25% of the current overhead of zeroing the whole 8192-byte
page, and it is spread out across multiple records.

When we get lots of full-page images into WAL just after a checkpoint we
don't get as much overhead, since nearly every full-page image forces a
page switch. So we're removing overhead from where it hurts the most and
amortising it across other records.

The maths works for me.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
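For what it's worth, here is a minimal standalone sketch of the approach Tom describes: copy the record into the in-memory WAL page and, while the insert lock would still be held, pre-zero the next few bytes so stale buffer contents can't be mistaken for a valid xl_prev. The page is modelled as a plain byte array; the 12-byte width follows the estimate above, and names like sketch_insert and wal_page are simplified stand-ins, not the real xlog.c structures.

```c
/*
 * Illustrative model only, not xlog.c code.  The WAL page is a flat
 * byte buffer and the record layout is a fixed-size dummy; the real
 * XLogRecord/XLogPageHeaderData structs are deliberately not modelled.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_PAGE_SIZE   8192
#define XL_PREV_ZEROING 12      /* bytes pre-zeroed past each record */

static uint8_t wal_page[WAL_PAGE_SIZE];
static size_t  insert_offset = 0;       /* next free byte in the page */

/* Copy one record into the page, then pre-zero the next xl_prev area. */
static int
sketch_insert(const void *rec, size_t rec_len)
{
    if (insert_offset + rec_len + XL_PREV_ZEROING > WAL_PAGE_SIZE)
        return 0;               /* caller would switch to a new page */

    memcpy(wal_page + insert_offset, rec, rec_len);
    insert_offset += rec_len;

    /*
     * Redundant once the next record is copied over it, but it
     * guarantees that leftover bytes here can never look like a valid
     * xl_prev pointer if this partial page is written out.
     */
    memset(wal_page + insert_offset, 0, XL_PREV_ZEROING);
    return 1;
}

int
main(void)
{
    /* 56 bytes ~ xl header (32) + xl_heap_delete (24), as estimated above */
    uint8_t dummy[56];
    int     nrecs = 0;

    memset(dummy, 0xAA, sizeof(dummy));
    while (sketch_insert(dummy, sizeof(dummy)))
        nrecs++;

    /* roughly the ~145 records / 1740 zero bytes from the back-of-the-envelope numbers */
    printf("records per page: %d, zero bytes written: %d\n",
           nrecs, nrecs * XL_PREV_ZEROING);
    return 0;
}
```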