WAL & SHM principles
From | Martin Devera |
---|---|
Subject | WAL & SHM principles |
Date | |
Msg-id | Pine.LNX.4.10.10103071324550.2077-100000@luxik.cdi.cz |
In reply to | CeBit (Michael Meskes <meskes@postgresql.org>) |
Responses | Re: WAL & SHM principles |
List | pgsql-hackers |
Hello,

maybe I missed something, but in the last few days I was thinking about how I would write my own SQL server. I got several ideas, and because these are not used in PG they are probably bad, but I can't figure out why.

1) WAL

We have a buffer manager, ok. So why not use WAL as part of it, and instead of logging INSERT/UPDATE/DELETE xlog records, log the changes to buffer pages directly? When someone dirties a page, it has to inform the bmgr about the dirty region, and the bmgr would formulate the xlog record. The record could be, for example, a fixed bitmap where each bit corresponds to a part of the page (of size pgsize/no-of-bits) which was changed. The changed regions follow the bitmap. Multiple writes (by multiple backends) can be coalesced together as long as their transactions overlap and there is enough memory to keep the changed buffer pages in memory.

Pros: upper layers can assume that buffers are always safe/logged, and there is no special handling for indices; very simple/fast redo.
Cons: can't implement undo, but with non-overwriting storage it is not needed (?)

2) SHM vs. MMAP

Why not use mmap to share pages (instead of shm)? There would be no problem with tuning PG's buffer cache size; it is balanced by the OS. When using SHM there are often two copies of a page: one in the OS page cache and one in SHM (a waste of memory). When using mmap the data goes (almost) directly from the HDD into your memory page, whereas now you need to copy it from the OS's page to PG's page.

There is one problem: how to ensure that a dirtied page is not flushed before its xlog record. One can use mlock, but you often need root privileges to use it. Another way is to implement our own COW (copy on write) to create intermediate buffers used only until the xlog is flushed.

Are these considerations correct?

regards, devik
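
To make the bitmap record from point 1 concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not PostgreSQL code: the page size of 8192 bytes, the 64-bit bitmap, and the names `DirtyTracker`, `make_record` and `redo` are all assumptions made up for the example. Coalescing falls out naturally, because repeated writes to the same page just OR more bits into the same bitmap.

```python
PAGE_SIZE = 8192               # assumed page size
NUM_BITS = 64                  # assumed bitmap width
CHUNK = PAGE_SIZE // NUM_BITS  # bytes covered by one bit (128 here)


class DirtyTracker:
    """Hypothetical per-page state the buffer manager would keep."""

    def __init__(self):
        self.bitmap = 0

    def mark_dirty(self, offset, length):
        """Record that bytes [offset, offset + length) were modified."""
        if length <= 0:
            return
        if offset < 0 or offset + length > PAGE_SIZE:
            raise ValueError("dirty region outside page")
        first = offset // CHUNK
        last = (offset + length - 1) // CHUNK
        for i in range(first, last + 1):
            self.bitmap |= 1 << i


def make_record(page, bitmap):
    """Build a record: fixed-size bitmap, then the changed chunks in order."""
    out = bytearray(bitmap.to_bytes(NUM_BITS // 8, "little"))
    for i in range(NUM_BITS):
        if (bitmap >> i) & 1:
            out += page[i * CHUNK:(i + 1) * CHUNK]
    return bytes(out)


def redo(page, record):
    """Apply a record to a bytearray page in place."""
    header = NUM_BITS // 8
    bitmap = int.from_bytes(record[:header], "little")
    pos = header
    for i in range(NUM_BITS):
        if (bitmap >> i) & 1:
            page[i * CHUNK:(i + 1) * CHUNK] = record[pos:pos + CHUNK]
            pos += CHUNK
```

Redo here is a plain byte copy with no knowledge of tuples or index structure, which is the "very simple/fast redo" claimed above. The cost is that a one-byte change still logs a whole chunk, so the choice of bitmap width trades record size against header overhead.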