Re: Page Checksums + Double Writes
From: Jignesh Shah
Subject: Re: Page Checksums + Double Writes
Date:
Msg-id: CAGvK12ULTkYVs_6OXMv-5EH3APXxC74R-w-17tmiVu9MyN2j+g@mail.gmail.com
In reply to: Re: Page Checksums + Double Writes (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Thu, Dec 22, 2011 at 3:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 22, 2011 at 1:50 PM, Jignesh Shah <jkshah@gmail.com> wrote:
>> In the double write implementation, every checkpoint write is double
>> writed,
>
> Unless I'm quite thoroughly confused, which is possible, the double
> write will need to happen the first time a buffer is written following
> each checkpoint. Which might mean the next checkpoint, but it could
> also be sooner if the background writer kicks in, or in the worst case
> a buffer has to do its own write.

Logically, the double write happens for every checkpoint write and gets fsynced. Implementation-wise, you can handle those pages in chunks, as we do with sets of pages, and sync each chunk once; it still performs better than full_page_writes. As long as you compare against full_page_writes=on, the scheme is always much better. If you compare against full_page_writes=off, performance is slightly lower, but then you lose reliability. So performance testers like me, who always turn off full_page_writes during benchmark runs anyway, will see no impact. Folks in production, however, who are rightly scared to turn off full_page_writes, gain the ability to increase performance without fearing torn writes.

> Furthermore, we can't *actually* write any pages until they are
> written *and fsync'd* to the double-write buffer. So the penalty for
> the background writer failing to do the right thing is going to go up
> enormously. Think about VACUUM or COPY IN, using a ring buffer and
> kicking out its own pages. Every time it evicts a page, it is going
> to have to doublewrite the buffer, fsync it, and then write it for
> real. That is going to make PostgreSQL 6.5 look like a speed demon.

Like I said, implementation-wise it depends on how many such pages you sync simultaneously, and real tests show it is actually much faster than one would expect.
> The background writer or checkpointer can conceivably dump a bunch of
> pages into the doublewrite area and then fsync the whole thing in
> bulk, but a backend that needs to evict a page only wants one page, so
> it's pretty much screwed.

At what point you pay the penalty is a trade-off. I would argue that you are making me pay for the full-page write on the first transaction commit that changes the page, which I can never avoid, and the result is an unacceptable transaction response time, since a similar transaction that modifies an already-dirty page deviates far less. With double writes, by contrast, I can avoid page evictions by selecting a bigger buffer pool (not that I necessarily want to, but I have that choice without losing reliability).

Regards,
Jignesh

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company