Re: checkpointer continuous flushing

From: Amit Kapila
Subject: Re: checkpointer continuous flushing
Msg-id: CAA4eK1LxUPRyvY4SYN2T6s00v60pvEbVN+YZkfpkSAEbapbDYg@mail.gmail.com
In response to: Re: checkpointer continuous flushing (Andres Freund <andres@anarazel.de>)
Responses: Re: checkpointer continuous flushing (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

On Tue, Jan 12, 2016 at 5:52 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2016-01-12 17:50:36 +0530, Amit Kapila wrote:
> > On Tue, Jan 12, 2016 at 12:57 AM, Andres Freund <andres@anarazel.de> wrote:
> > >
> > > My theory is that this happens due to the sorting: pgbench is an update
> > > heavy workload, the first few pages are always going to be used if
> > > there's free space as freespacemap.c essentially prefers those. Due to
> > > the sorting all of a relation's early pages are going to be "in a row".
> > >
> >
> > I am not sure what the best way to tackle this problem is, but one option
> > could be to sort at the flush-request level rather than before writing
> > to OS buffers.
>
> I'm not following. If you just sort a couple hundred more or less random
> buffers - which is what you get if you look in buf_id order through
> shared_buffers - the likelihood of actually finding neighbouring writes
> is pretty low.
>

Why can't we do it at larger intervals (relative to the total amount of
writes)? To explain what I have in mind: assume the checkpoint interval
is long (10 minutes) and that in the meantime all writes are done by the
bgwriter, which registers them in shared memory so that a later
checkpoint can perform the corresponding fsyncs. When the request queue
reaches a threshold size (say, one third full), we can sort and merge
the requests and issue flush hints. The checkpointer can follow a
similar technique: once it has written a third or so of the buffers
(which we would need to track), it can issue flush hints after a
sort+merge. We could also do this in the checkpointer alone rather than
in both the bgwriter and the checkpointer. I think this could lead to
less merging of neighbouring writes than sorting everything up front,
but that might not hurt if the sync_file_range() API is cheap.
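
To make the idea concrete, here is a minimal sketch of the threshold-triggered sort+merge, written in Python only for brevity. It is not PostgreSQL code: FlushQueue, register_write, and merge_flush_requests are hypothetical names, the (file_id, block_no) request shape is an assumption, and the "issued" list merely records where a real implementation would call sync_file_range(). BLOCK_SIZE uses PostgreSQL's default page size of 8192 bytes.

```python
BLOCK_SIZE = 8192  # PostgreSQL's default page size, in bytes


def merge_flush_requests(requests):
    """Sort (file_id, block_no) requests and merge adjacent blocks.

    Returns a list of (file_id, offset, nbytes) ranges, one per run of
    consecutive blocks within the same file.
    """
    ranges = []
    for file_id, block_no in sorted(set(requests)):
        offset = block_no * BLOCK_SIZE
        if ranges:
            last_file, last_off, last_len = ranges[-1]
            # Extend the previous range if this block directly follows it.
            if last_file == file_id and last_off + last_len == offset:
                ranges[-1] = (last_file, last_off, last_len + BLOCK_SIZE)
                continue
        ranges.append((file_id, offset, BLOCK_SIZE))
    return ranges


class FlushQueue:
    """Hypothetical request queue that flushes when one third full."""

    def __init__(self, capacity):
        self.threshold = max(1, capacity // 3)
        self.pending = []  # writes already handed to the OS, not yet hinted
        self.issued = []   # stand-in for sync_file_range() calls

    def register_write(self, file_id, block_no):
        self.pending.append((file_id, block_no))
        if len(self.pending) >= self.threshold:
            self.flush_hints()

    def flush_hints(self):
        for rng in merge_flush_requests(self.pending):
            # Real code would issue sync_file_range(fd, offset, nbytes, ...)
            # here; this sketch only records the range.
            self.issued.append(rng)
        self.pending.clear()
```

With a queue of capacity 9, the third registered write triggers a flush, and two neighbouring blocks of the same file are combined into a single range. The smaller the threshold, the fewer neighbours are likely to be present in each batch, which is the trade-off mentioned above.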



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
