Re: Load distributed checkpoint
From | ITAGAKI Takahiro |
---|---|
Subject | Re: Load distributed checkpoint |
Date | |
Msg-id | 20061208131001.6655.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
In response to | Re: Load distributed checkpoint (Ron Mayer <rm_pg@cheapcomplexdevices.com>) |
List | pgsql-hackers |
Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote:

> >> 1. Query information (REDO pointer, next XID etc.)
> >> 2. Write dirty pages in buffer pool
> >> 3. Flush all modified files
> >> 4. Update control file
> >
> > Hmm. Isn't it possible that step 3 affects the performance greatly?
> > I'm sorry if you have already identified step 2 as disturbing
> > backends.
>
> It seems to me that virtual memory settings of the OS will determine
> if step 2 or step 3 causes much of the actual disk I/O.
>
> if the dirty_expire_centisecs number is low, most write()s
> from step 2 would happen before step 3 because of the pdflush daemons.

Exactly. It depends on OSes, kernel settings, and filesystems.
I tested the patch on Linux kernel 2.6.9-39, default settings, and ext3fs.
Maybe pdflush daemons were strong enough to write dirty buffers in kernel,
so step 2 was a main part and 3 was not.

There are technical issues to distribute step 3. We can write buffers
on a page basis, that is granular enough. However, fsync() is on a file
basis (1GB), so we can only control granularity of fsync roughly.
sync_file_range (http://lwn.net/Articles/178199/) or some special APIs
would be a help, but there are portability issues...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
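
The granularity mismatch between steps 2 and 3 can be sketched roughly in Python. This is only an illustration under stated assumptions, not PostgreSQL's code: `checkpoint`, `pause`, `demo`, and the `(file name, page number, data)` tuple layout are hypothetical names made up for the example, and the files are tiny stand-ins for 1GB segments. The point is that a pause can be placed between any two page writes, while the flush can only be paced one whole file at a time.

```python
import os
import tempfile

PAGE_SIZE = 8192  # assumed page size for this sketch


def checkpoint(dirty_pages, directory, pause=lambda: None):
    """Write dirty pages, then flush every modified file.

    dirty_pages: list of (file name, page number, data) tuples
                 (hypothetical layout for illustration).
    pause:       callback invoked after each unit of work; a real
                 implementation could sleep here to spread the load.
    Returns (number of page writes, number of fsync calls).
    """
    fds = {}
    writes = 0
    fsyncs = 0
    try:
        # Step 2: page-granular. Load can be spread between any two pages.
        for name, page_no, data in dirty_pages:
            if name not in fds:
                path = os.path.join(directory, name)
                fds[name] = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
            os.pwrite(fds[name], data, page_no * PAGE_SIZE)
            writes += 1
            pause()

        # Step 3: file-granular. Each fsync() covers a whole file, so the
        # only place to pause is between files.
        for fd in fds.values():
            os.fsync(fd)
            fsyncs += 1
            pause()
    finally:
        for fd in fds.values():
            os.close(fd)
    return writes, fsyncs


def demo():
    """Ten dirty pages spread over two files."""
    with tempfile.TemporaryDirectory() as d:
        pages = [("seg0", i, bytes([i]) * PAGE_SIZE) for i in range(6)]
        pages += [("seg1", i, bytes([i]) * PAGE_SIZE) for i in range(4)]
        return checkpoint(pages, d)


if __name__ == "__main__":
    print(demo())
```

With ten dirty pages in two files, step 2 offers ten points at which to throttle and step 3 only two. How much real disk I/O each fsync() triggers still depends on how much the kernel has already written back, as discussed above.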