Re: Page Checksums + Double Writes
From | Robert Haas
---|---
Subject | Re: Page Checksums + Double Writes
Date |
Msg-id | CA+Tgmobo8o-_r1Vdc6kWxRSPWCwpjbquB2ww4epi0dNzQPTFwQ@mail.gmail.com
In reply to | Re: Page Checksums + Double Writes ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses | Re: Page Checksums + Double Writes, Re: Page Checksums + Double Writes
List | pgsql-hackers
On Fri, Dec 23, 2011 at 11:14 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Thoughts?

Those are good thoughts. Here's another random idea, which might be completely nuts. Maybe we could consider some kind of summarization of CLOG data, based on the idea that most transactions commit.

We introduce the idea of a CLOG rollup page. On a CLOG rollup page, each bit represents the status of N consecutive XIDs. If the bit is set, that means all XIDs in that group are known to have committed. If it's clear, then we don't know, and must fall through to a regular CLOG lookup. If you let N = 1024, then 8K of CLOG rollup data is enough to represent the status of 64 million transactions, which means that just a couple of pages could cover as much of the XID space as you probably need to care about. Also, you would need to replace CLOG summary pages in memory only very infrequently.

Backends could test the bit without any lock. If it's set, they do pg_read_barrier(), and then check the buffer label to make sure it's still the summary page they were expecting. If so, no CLOG lookup is needed. If the page has changed under us or the bit is clear, then we fall through to a regular CLOG lookup.

An obvious problem is that, if the abort rate is significantly different from zero, and especially if the aborts are randomly mixed in with commits rather than clustered together in small portions of the XID space, the CLOG rollup data would become useless. On the other hand, if you're doing 10k tps, you only need to have a window of a tenth of a second or so where everything commits in order to start getting some benefit, which doesn't seem like a stretch.

Perhaps the CLOG rollup data wouldn't even need to be kept on disk. We could simply have bgwriter (or bghinter) set the rollup bits in shared memory for new transactions, as it becomes possible to do so, and let lookups for XIDs prior to the last shutdown fall through to CLOG.
Or, if that's not appealing, we could reconstruct the data in memory by groveling through the CLOG pages - or maybe just set summary bits only for CLOG pages that actually get faulted in.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company