Thread: batch write of dirty buffers


batch write of dirty buffers

From: "Qingqing Zhou"
Date:
In checkpoint and the background writer, we flush out dirty buffer pages one
page at a time. Is it possible to do this in batch mode? That is, try to find
the contiguous pages (same tblNode, relNode, adjacent blockNum), then write
them out together?
To find contiguous pages, most cases can be handled by a simple qsort() of the
candidate dirty pages; exceptional conditions may include a segment boundary
check if we don't let the OS manage file size. This change would reduce the
number of writes, especially if the database is doing batch updates. We expect
to write hundreds of pages by issuing just one smgrwrite().
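
For illustration, here is a rough sketch of the sort-and-coalesce step. The
DirtyPage struct, the sample data, and the printf() standing in for a batched
smgrwrite() are all made up for the example; they are not the real buffer
manager structures.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical summary of one dirty page: which relation it belongs to,
 * which block it is, and where it sits in the shared buffer pool. */
typedef struct DirtyPage
{
    unsigned tblNode;
    unsigned relNode;
    unsigned blockNum;
    int      bufId;
} DirtyPage;

/* Order by relation first, then by block number, so adjacent blocks of the
 * same relation end up next to each other after qsort(). */
static int
dirty_page_cmp(const void *a, const void *b)
{
    const DirtyPage *x = a, *y = b;

    if (x->tblNode != y->tblNode)
        return x->tblNode < y->tblNode ? -1 : 1;
    if (x->relNode != y->relNode)
        return x->relNode < y->relNode ? -1 : 1;
    if (x->blockNum != y->blockNum)
        return x->blockNum < y->blockNum ? -1 : 1;
    return 0;
}

int
main(void)
{
    DirtyPage pages[] = {
        {1663, 16384, 7, 3}, {1663, 16384, 5, 9},
        {1663, 16384, 6, 1}, {1663, 16390, 2, 4},
        {1663, 16384, 20, 7},
    };
    int n = sizeof(pages) / sizeof(pages[0]);
    int start = 0;

    qsort(pages, n, sizeof(DirtyPage), dirty_page_cmp);

    /* Walk the sorted list and coalesce each run of adjacent blocks of the
     * same relation into one (pretend) batched write.  A real version would
     * also have to break a run at relation segment boundaries, as noted
     * above. */
    for (int i = 1; i <= n; i++)
    {
        int run_ends = (i == n ||
                        pages[i].tblNode != pages[start].tblNode ||
                        pages[i].relNode != pages[start].relNode ||
                        pages[i].blockNum != pages[i - 1].blockNum + 1);

        if (run_ends)
        {
            printf("batch write: rel %u/%u blocks %u..%u (%d pages)\n",
                   pages[start].tblNode, pages[start].relNode,
                   pages[start].blockNum, pages[i - 1].blockNum,
                   i - start);
            start = i;
        }
    }
    return 0;
}

Compiled with a plain C compiler, this prints one line per coalesced run; in
the real buffer manager each run would correspond to one larger write instead
of several separate smgrwrite() calls.
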

There are two other points that may need attention. One is the function
StartBufferIO(), which asserts on InProgressBuf; that is, we can only do one
page write at a time. I am not quite sure of the consequences if we remove
this variable. The other is that since we will acquire many locks on the
buffer pages, we may have to increase MAX_SIMUL_LWLOCKS. This should not be
a problem.

What are your ideas?

Regards,

Qingqing
http://www.cs.toronto.edu/~zhouqq

Re: batch write of dirty buffers

From: Tom Lane
Date:
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> In checkpoint and the background writer, we flush out dirty buffer pages one
> page at a time. Is it possible to do this in batch mode? That is, try to find
> the contiguous pages (same tblNode, relNode, adjacent blockNum), then write
> them out together?

What for?  The kernel will have its own ideas about scheduling the
physical writes, anyway.  We are not flushing anything directly to disk
here, we are just pushing pages out to kernel buffers.

> There are two other points that may need attention. One is the function
> StartBufferIO(), which asserts on InProgressBuf; that is, we can only do one
> page write at a time. I am not quite sure of the consequences if we remove
> this variable. The other is that since we will acquire many locks on the
> buffer pages, we may have to increase MAX_SIMUL_LWLOCKS. This should not be
> a problem.

If the bgwriter tries to lock more than one shared buffer at a time,
you will inevitably get deadlocks.  I don't actually see the point
of doing that anyway, even assuming that it's worth trying to do the
writes in block-number order.  It would hardly ever be the case that
successive pages would be located in adjacent shared buffers, and so
you'd almost always end up issuing separate write commands anyway.
        regards, tom lane
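
To illustrate the lock-ordering hazard described above, here is a generic
sketch using plain POSIX threads rather than PostgreSQL's LWLock machinery:
two workers each hold one "buffer" lock and then try to take the other one
in the opposite order. Trylock is used only so the demo terminates and can
report what would have happened with blocking acquires.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Two locks standing in for two shared buffers. */
static pthread_mutex_t buf_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t buf_b = PTHREAD_MUTEX_INITIALIZER;

static void *
worker(void *arg)
{
    int forward = *(int *) arg;
    pthread_mutex_t *first = forward ? &buf_a : &buf_b;
    pthread_mutex_t *second = forward ? &buf_b : &buf_a;

    pthread_mutex_lock(first);
    sleep(1);                   /* let the other worker grab its first lock */
    if (pthread_mutex_trylock(second) != 0)
        printf("worker %d: second lock busy -- a blocking acquire here "
               "would deadlock\n", forward);
    else
        pthread_mutex_unlock(second);
    pthread_mutex_unlock(first);
    return NULL;
}

int
main(void)
{
    pthread_t t1, t2;
    int fwd = 1, rev = 0;

    pthread_create(&t1, NULL, worker, &fwd);
    pthread_create(&t2, NULL, worker, &rev);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Built with -pthread, both workers report the conflict. With blocking lock
acquisition and no deadlock detector, as is the case for LWLocks, the
equivalent situation would simply leave both processes stuck.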