Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?
From: Bruce Momjian
Subject: Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?
Date:
Msg-id: 199803122113.QAA06519@candle.pha.pa.us
In reply to: Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables? (dg@illustra.com (David Gould))
Responses: Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?
List: pgsql-hackers
> At least on the systems I am intimately familiar with, the prefetch that the
> OS does (assuming a modern OS like Linux) is pretty hard to beat.  If you
> have a table that was bulk loaded in key order, a sequential scan is going to
> result in a sequential access pattern to the underlying file and the OS
> prefetch does the right thing.  If you have an unindexed table with rows
> inserted at the end, the OS prefetch still works.  If you are using a
> secondary index on some sort of chopped-up table with rows inserted
> willy-nilly, then it may be worth doing async reads in a burst and letting
> the disk request sorting make the best of it.
>
> As far as I am aware, Postgres does not do async I/O.  Perhaps it should.

I am adding this to the TODO list:

	* Do async I/O to do better read-ahead of data

Because we are not threaded, we really can't do anything else while we
are waiting for I/O, but we can pre-request data we know we will need.

> > Also nice so you can control what gets written to disk/fsync'ed and what
> > doesn't get fsync'ed.
>
> This is really the big win.

Yep, and this is what we are trying to work around in our buffered
pg_log change.  Because we have the transaction ids all compact in one
place, this seems like a workable solution to our lack of write-to-disk
control.  We just control the pg_log writes.

> > Our idea is to control when pg_log gets written to disk.  We keep active
> > pg_log pages in shared memory, and every 30-60 seconds, we make a memory
> > copy of the current pg_log active pages, do a system sync() (which
> > happens anyway at that interval), update the pg_log file with the saved
> > changes, and fsync() the pg_log pages to disk.  That way, after a crash,
> > the current database only shows transactions as committed where we are
> > sure all the data has made it to disk.
>
> OK as far as it goes, but probably bad for concurrency if I have understood
> you.

Interested in hearing your comments.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
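For illustration only, here is a minimal C sketch (not from the original
thread) of the "pre-request data we know we will need" idea, assuming a POSIX
system that provides posix_fadvise(); the prefetch_blocks() helper, the file
descriptor, and the block parameters are hypothetical, not actual backend code:

    #define _XOPEN_SOURCE 600       /* for the posix_fadvise() declaration */
    #include <fcntl.h>

    #define BLCKSZ 8192             /* PostgreSQL disk block size */

    /*
     * Hypothetical helper: tell the kernel we will soon read 'nblocks'
     * blocks starting at 'blkno', so read-ahead can begin while the
     * (single-threaded) backend keeps processing the current block.
     */
    static void
    prefetch_blocks(int fd, long blkno, int nblocks)
    {
    #ifdef POSIX_FADV_WILLNEED
        (void) posix_fadvise(fd,
                             (off_t) blkno * BLCKSZ,
                             (off_t) nblocks * BLCKSZ,
                             POSIX_FADV_WILLNEED);
    #endif
    }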
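Likewise, a minimal sketch of the buffered pg_log flush sequence described
above (snapshot the shared-memory pages, sync() so data files go out first,
write the snapshot into the pg_log file, then fsync() it); the flush_pglog()
name, the page count, and the parameters are assumptions for illustration:

    #include <string.h>
    #include <unistd.h>

    #define PGLOG_ACTIVE_PAGES 2          /* assumed pages kept in shared memory */
    #define PGLOG_PAGE_SIZE    8192

    /*
     * Hypothetical flush routine, run every 30-60 seconds:
     *   1. copy the active pg_log pages out of shared memory,
     *   2. sync() so the data files reach disk first,
     *   3. write the saved copy into the pg_log file,
     *   4. fsync() pg_log.
     * After a crash, pg_log then shows as committed only those
     * transactions whose data had already made it to disk.
     */
    static int
    flush_pglog(int logfd, const char *shmem_pages, long first_page)
    {
        char saved[PGLOG_ACTIVE_PAGES * PGLOG_PAGE_SIZE];

        memcpy(saved, shmem_pages, sizeof(saved));   /* snapshot shared memory */
        sync();                                      /* push data files out first */

        if (lseek(logfd, first_page * PGLOG_PAGE_SIZE, SEEK_SET) < 0)
            return -1;
        if (write(logfd, saved, sizeof(saved)) != (ssize_t) sizeof(saved))
            return -1;
        return fsync(logfd);                         /* make pg_log durable */
    }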