Re: checkpointer continuous flushing

From	Amit Kapila
Subject	Re: checkpointer continuous flushing
Date	18 August 2015, 03:46:34
Msg-id	CAA4eK1K5yZJAQxyfz5BsUDDyTcic1UXdDegnCCLYFRLPGAsxQA@mail.gmail.com
In response to	Re: checkpointer continuous flushing (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses	Re: checkpointer continuous flushing
List	pgsql-hackers

On Tue, Aug 18, 2015 at 1:02 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Andres,

[...] posix_fadvise().

My current thinking is "maybe yes, maybe no":-), as it may depend on the OS
implementation of posix_fadvise, so it may differ between OS.

As long as fadvise has no 'undirty' option, I don't see how that
problem goes away. You're telling the OS to throw the buffer away, so
unless it ignores it that'll have consequences when you read the page
back in.

Yep, probably.

Note that we are talking about checkpoints, which "write" buffers out *but* keep them nevertheless. As the buffer is kept, the OS page is a duplicate, and freeing it should not harm, at least immediately.
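
To make the pattern under discussion concrete, here is a minimal C sketch of writing a block and then advising the kernel that its cached copy is no longer needed. It is an illustration only, not code from the patch: the function name write_and_hint is hypothetical, error handling is reduced to a return code, and what the kernel actually does with POSIX_FADV_DONTNEED depends on the OS, which is exactly the uncertainty raised above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write one block at the given
 * offset, then hint that the kernel does not need to keep that range in
 * its page cache.  Returns 0 on success, -1 on failure.
 */
static int
write_and_hint(int fd, const void *block, size_t len, off_t offset)
{
    assert(block != NULL);

    /* Hand the block to the OS; it may stay dirty in the page cache. */
    if (pwrite(fd, block, len, offset) != (ssize_t) len)
        return -1;

    /*
     * Advice only: the kernel is free to write the range back, drop it
     * from its cache, or ignore the call.  posix_fadvise() returns an
     * error number directly instead of setting errno.
     */
    if (posix_fadvise(fd, offset, (off_t) len, POSIX_FADV_DONTNEED) != 0)
        return -1;

    return 0;
}
```

If the kernel honors the advice, a later read of the same block that misses shared_buffers has to go to disk rather than to the OS cache, which is the cost being debated here.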

This theory could make sense if we could predict in some way that the
data we are flushing out of the OS cache won't be needed soon. After the
flush, we can rely on finding the data in shared_buffers only if its
usage_count is high; otherwise it could be replaced at any moment by a
backend that needs a buffer when no free buffer is available. One way to
look at it is that if the usage_count is low, it is okay to assume the
page won't be needed in the near future anyway; however, I don't think
relying only on usage_count for such a thing is a good idea.
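
The usage_count argument can be illustrated with a toy clock-sweep loop. This is a deliberately simplified sketch, not PostgreSQL's buffer manager: ToyBuffer and toy_clock_sweep are hypothetical names, and pins, locking, and the free list are all ignored. It only shows that a buffer with a low usage_count becomes the eviction victim after very few passes of the clock hand, while a high count buys only a bounded number of passes.

```c
#include <assert.h>

/* Toy stand-in for a shared buffer header; only the usage counter is kept. */
typedef struct
{
    int usage_count;
} ToyBuffer;

/*
 * Simplified clock sweep: advance the hand, decrementing usage_count on
 * each buffer passed, and return the index of the first buffer whose
 * count is already zero.  That buffer is the one a backend would reuse.
 */
static int
toy_clock_sweep(ToyBuffer *bufs, int nbuffers, int *hand)
{
    assert(nbuffers > 0);

    for (;;)
    {
        int idx = *hand;

        *hand = (*hand + 1) % nbuffers;

        if (bufs[idx].usage_count == 0)
            return idx;         /* victim found */

        bufs[idx].usage_count--;    /* give it one more pass */
    }
}
```

With usage counts {5, 1, 0, 3} and the hand at 0, the first call picks buffer 2 and the second picks buffer 1, so a page flushed at checkpoint time with a low count may leave shared_buffers shortly afterwards.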

To sum up, I agree that it is indeed possible that flushing with posix_fadvise could reduce read OS-memory hits on some systems for some workloads, although not on Linux, see below.

So the option is best kept as "off" for now, without further data, I'm fine with that.

One point to think about here is on what basis a user can decide to turn
this option on; is it predictable in any way? I think one case could be
when the data set fits in shared_buffers.

In general, providing an option is a good idea if the user can easily
decide when to use it, or if we can give a clear recommendation for it.
Otherwise one has to recommend testing the workload with this option and
using it only if it helps, which might also be okay in some cases, but it
is better to be clear.

One minor point: while glancing through the patch, I noticed that a
couple of multiline comments are not written in the way usually used in
the code (keep the first line empty).

+/* Status of buffers to checkpoint for a particular tablespace,
+ * used internally in BufferSync.
+ * - space: oid of the tablespace
+ * - num_to_write: number of checkpoint pages counted for this tablespace
+ * - num_written: number of pages actually written out

+/* entry structure for table space to count hashtable,
+ * used internally in BufferSync.
+ */
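
For illustration, the first comment rewritten in the usual layout (opening line left empty) might look like the sketch below. Only the comment text comes from the excerpt above; the struct name ExampleTablespaceStatus, the field types, the Oid stand-in typedef, and the helper example_pages_remaining are assumptions made so that the fragment compiles on its own.

```c
#include <assert.h>

/* Stand-in for PostgreSQL's Oid type, so the sketch is self-contained. */
typedef unsigned int Oid;

/*
 * Status of buffers to checkpoint for a particular tablespace,
 * used internally in BufferSync.
 *
 * - space: oid of the tablespace
 * - num_to_write: number of checkpoint pages counted for this tablespace
 * - num_written: number of pages actually written out
 */
typedef struct ExampleTablespaceStatus  /* hypothetical name */
{
    Oid         space;
    int         num_to_write;   /* assumed type */
    int         num_written;    /* assumed type */
} ExampleTablespaceStatus;

/*
 * Hypothetical helper, present only to show a second comment in the
 * same layout: pages still to be written for this tablespace.
 */
static int
example_pages_remaining(const ExampleTablespaceStatus *status)
{
    assert(status->num_written <= status->num_to_write);
    return status->num_to_write - status->num_written;
}
```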

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
