Re: Load Distributed Checkpoints, take 3

From Greg Smith
Subject Re: Load Distributed Checkpoints, take 3
Date
Msg-id Pine.GSO.4.64.0706251711430.2936@westnet.com
In reply to Re: Load Distributed Checkpoints, take 3  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Load Distributed Checkpoints, take 3  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
On Mon, 25 Jun 2007, Heikki Linnakangas wrote:

> Greg, is this the kind of workload you're having, or is there some other
> scenario you're worried about?

The way transitions between completely idle and all-out bursts happen was
one problematic area I struggled with.  Since the LRU point doesn't move
during the idle parts, and the lingering buffers have a usage_count>0, the
LRU scan won't touch them; the only way to clear out a bunch of dirty
buffers left over from the last burst is with the all-scan.  Ideally, you
want those to write during idle periods so you're completely clean when
the next burst comes.  My plan for the code I wanted to put into 8.4 one
day was to have something like the current all-scan that defers to the LRU
and checkpoint, such that if neither of them is doing anything it would
go searching for buffers it might blow out.  Because the all-scan mainly
gets in the way under heavy load right now, I've only found mild settings
helpful, but if it had a bit more information about what else was going on
it could run much harder during slow spots.  That's sort of the next stage
to the auto-tuning LRU writer code in the grand design floating through my
head.
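
A rough illustration of the idea in the paragraph above, not code from any patch: the sketch models only the usage_count part of the problem (not the stalled LRU point), and every name in it (Buffer, lru_scan_candidates, idle_all_scan, max_writes) is hypothetical.  It shows why an LRU-style scan leaves recently used dirty buffers alone, and how an all-scan could defer whenever the LRU writer or a checkpoint is busy.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    dirty: bool
    usage_count: int


def lru_scan_candidates(buffers):
    """Indexes an LRU-style scan would write: dirty buffers with usage_count == 0."""
    return [i for i, b in enumerate(buffers) if b.dirty and b.usage_count == 0]


def idle_all_scan(buffers, lru_busy, checkpoint_busy, max_writes):
    """Clean dirty buffers regardless of usage_count, but only when nothing else is writing.

    Returns the indexes of the buffers that were "written" (marked clean).
    """
    if lru_busy or checkpoint_busy:
        return []  # defer to the LRU writer and the checkpoint
    written = []
    for i, b in enumerate(buffers):
        if len(written) >= max_writes:
            break
        if b.dirty:
            b.dirty = False  # stand-in for actually flushing the page
            written.append(i)
    return written


if __name__ == "__main__":
    pool = [Buffer(True, 3), Buffer(False, 0), Buffer(True, 1), Buffer(True, 0)]
    print(lru_scan_candidates(pool))           # only the usage_count == 0 buffer
    print(idle_all_scan(pool, True, False, 10))   # defers while the LRU writer is busy
    print(idle_all_scan(pool, False, False, 10))  # idle: cleans everything dirty
```

The max_writes cap is there so that even the idle scan stays bounded per round; how hard it should run is exactly the tuning question raised above.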

As a general comment on this subject, a lot of the work in LDC presumes
you have an accurate notion of how close the next checkpoint is.  On
systems that can dirty buffers and write WAL really fast, I've found hyper
bursty workloads are a challenge for it to cope with.  You can go from
thinking you have all sorts of time to stream the data out to discovering
the next checkpoint is coming up fast in only seconds.  In that situation,
you'd have been better off had you been writing faster during the period
preceding the burst when the code thought it should be "smooth"[1].
That falls into the category of things I haven't found a good way for
other people to test (I happened to have an internal bursty app that
aggravated this area to use).
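
To put numbers on that, here is a small arithmetic sketch.  It assumes, purely for illustration, that the time to the next checkpoint is estimated as WAL segments remaining divided by the recent WAL rate; that is a simplification, not the LDC patch's actual calculation, and the function names and figures are made up.

```python
def seconds_until_checkpoint(segments_left, wal_segments_per_sec):
    """Estimated time before WAL volume forces the next checkpoint."""
    if wal_segments_per_sec <= 0:
        return float("inf")  # idle: no WAL-driven checkpoint in sight
    return segments_left / wal_segments_per_sec


def required_write_rate(dirty_buffers, segments_left, wal_segments_per_sec):
    """Buffers per second that must be written to finish before the next checkpoint."""
    t = seconds_until_checkpoint(segments_left, wal_segments_per_sec)
    if t == float("inf"):
        return 0.0
    return dirty_buffers / t


if __name__ == "__main__":
    dirty, left = 10000, 10
    # Quiet period: 0.25 segments/s leaves 40 s, so 250 buffers/s looks like plenty.
    print(seconds_until_checkpoint(left, 0.25), required_write_rate(dirty, left, 0.25))
    # Burst: 5 segments/s leaves 2 s, and the same backlog now needs 5000 buffers/s.
    print(seconds_until_checkpoint(left, 5.0), required_write_rate(dirty, left, 5.0))
```

With the same backlog, a 20x jump in WAL rate turns a relaxed pace into a 20x higher required write rate, which is the sense in which writing faster before the burst would have been the better choice.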

[1] This is actually a reference to "Yacht Rock", one of my favorite web
sites:  http://www.channel101.com/shows/show.php?show_id=152

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
