Re: checkpointer continuous flushing

From	Amit Kapila
Subject	Re: checkpointer continuous flushing
Date	12 January 2016, 13:47:54
Msg-id	CAA4eK1LxUPRyvY4SYN2T6s00v60pvEbVN+YZkfpkSAEbapbDYg@mail.gmail.com
In reply to	Re: checkpointer continuous flushing (Andres Freund <andres@anarazel.de>)
Responses	Re: checkpointer continuous flushing
List	pgsql-hackers

On Tue, Jan 12, 2016 at 5:52 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2016-01-12 17:50:36 +0530, Amit Kapila wrote:
> > On Tue, Jan 12, 2016 at 12:57 AM, Andres Freund <andres@anarazel.de> wrote:
> > >
> > > My theory is that this happens due to the sorting: pgbench is an update
> > > heavy workload, the first few pages are always going to be used if
> > > there's free space as freespacemap.c essentially prefers those. Due to
> > > the sorting all of a relation's early pages are going to be "in a row".
> > >
> >
> > Not sure, what is best way to tackle this problem, but I think one way could
> > be to perform sorting at flush requests level rather than before writing
> > to OS buffers.
>
> I'm not following. If you just sort a couple hundred more or less random
> buffers - which is what you get if you look in buf_id order through
> shared_buffers - the likelihood of actually finding neighbouring writes
> is pretty low.

Why can't we do it at larger intervals (relative to the total amount of
writes)?

To explain what I have in mind, let us assume that the checkpoint
interval is longer (10 mins) and that in the meantime all the writes are
being done by bgwriter, which registers them in shared memory so that a
later checkpoint can perform the corresponding fsyncs. When the request
queue becomes threshold-size (let us say 1/3rd) full, we can perform
sorting and merging and issue flush hints. The checkpointer can follow a
similar technique: once it has written 1/3rd or so of the buffers (which
we need to track), it can issue flush hints after sort+merge. I think we
could also do this in the checkpointer alone rather than in both bgwriter
and checkpointer.

Basically, I think this can lead to less merging of neighbouring writes,
but that might not hurt if the sync_file_range() API is cheap.
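
To make the sort+merge step concrete, here is a rough sketch, not actual
PostgreSQL code. FlushRequest, sort_and_merge() and queue_needs_flush()
are hypothetical names used only for illustration, and the 1/3rd
threshold is just the example figure from above. Each merged range is
what a flush hint such as sync_file_range() would then be issued for.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flush request: a byte range in one file (not a PostgreSQL type). */
typedef struct FlushRequest
{
    int   file_id;  /* stand-in for a relation segment file */
    long  offset;   /* start of the written range, in bytes */
    long  nbytes;   /* length of the written range, in bytes */
} FlushRequest;

/* Order requests by file first, then by starting offset. */
int
flush_request_cmp(const void *a, const void *b)
{
    const FlushRequest *x = a;
    const FlushRequest *y = b;

    if (x->file_id != y->file_id)
        return (x->file_id < y->file_id) ? -1 : 1;
    if (x->offset != y->offset)
        return (x->offset < y->offset) ? -1 : 1;
    return 0;
}

/*
 * Sort the queued requests and merge ranges of the same file that touch
 * or overlap.  The merged requests are left in reqs[0 .. result-1]; the
 * caller would issue one flush hint per merged request.
 */
size_t
sort_and_merge(FlushRequest *reqs, size_t n)
{
    size_t out = 0;     /* index of the last merged request */

    if (n == 0)
        return 0;

    qsort(reqs, n, sizeof(FlushRequest), flush_request_cmp);

    for (size_t i = 1; i < n; i++)
    {
        FlushRequest *last = &reqs[out];
        long last_end = last->offset + last->nbytes;

        if (reqs[i].file_id == last->file_id && reqs[i].offset <= last_end)
        {
            /* Neighbouring or overlapping write: extend the current range. */
            long end = reqs[i].offset + reqs[i].nbytes;

            if (end > last_end)
                last->nbytes = end - last->offset;
        }
        else
            reqs[++out] = reqs[i];  /* start a new range */
    }
    return out + 1;
}

/* True once the queue is at least 1/3rd full (the example threshold). */
int
queue_needs_flush(size_t used, size_t capacity)
{
    return used * 3 >= capacity;
}
```

The larger the threshold, the more requests are sorted together and the
better the chance of finding neighbouring writes, at the cost of dirty
data sitting longer in OS buffers before a flush hint is issued.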

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
