Re: linux cachestat in file Readv and Prefetch

From Tomas Vondra
Subject Re: linux cachestat in file Readv and Prefetch
Date
Msg-id 2f1236b2-7722-4471-ac80-3a47d090244c@enterprisedb.com
In response to linux cachestat in file Readv and Prefetch  (Cedric Villemain <Cedric.Villemain+pgsql@abcSQL.com>)
Responses Re: linux cachestat in file Readv and Prefetch  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On 1/18/24 01:25, Cedric Villemain wrote:
> Hi,
>
> I was testing the index prefetch and streamIO patches and I added
> cachestat() syscall to get a better view of the prefetching.
>
> It's a new linux syscall, it requires 6.5, it provides numerous
> interesting information from the VM for the range of pages examined.
> It's way way faster than the old mincore() and provides much more
> valuable information:
>
>     uint64 nr_cache;        /* Number of cached pages */
>     uint64 nr_dirty;           /* Number of dirty pages */
>     uint64 nr_writeback;  /* Number of pages marked for writeback. */
>     uint64 nr_evicted;       /* Number of pages evicted from the cache. */
>     /*
>     * Number of recently evicted pages. A page is recently evicted if its
>     * last eviction was recent enough that its reentry to the cache would
>     * indicate that it is actively being used by the system, and that there
>     * is memory pressure on the system.
>     */
>     uint64 nr_recently_evicted;
>
>
> While here I also added some quick tweaks to suspend prefetching on
> memory pressure.

I may be missing some important bit behind this idea, but this does not
seem like a great idea to me. The comment added to FilePrefetch says this:

  /*
   * last time we visit this file (somewhere), nr_recently_evicted pages
   * of the range were just removed from vm cache, it's a sign a memory
   * pressure. so do not prefetch further.
   * it is hard to guess if it is always the right choice in absence of
   * more information like:
   *  - prefetching distance expected overall
   *  - access pattern/backend maybe
   */

Firstly, is this even a good way to detect memory pressure? It's clearly
limited to a single 1GB segment, so what's the chance we'll even see the
"pressure" on a big database with many files?

If we close/reopen the file (which on large databases we tend to do very
often) how does that affect the data reported for the file descriptor?
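
For readers who have not used the call, here is a minimal sketch of how cachestat() can be queried for a byte range of an open file. It is not taken from the patch. The names query_cachestat, query_cachestat_path, pc_range and pc_stats are hypothetical; the two structs mirror the kernel's struct cachestat_range and struct cachestat so the sketch compiles with older headers. The fallback syscall number 451 is an assumption that holds for x86-64 and the generic syscall table. The call goes through syscall() directly rather than relying on a libc wrapper being present.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 451 on x86-64 and the generic table; some arches differ. */
#ifndef __NR_cachestat
#define __NR_cachestat 451
#endif

/* Hypothetical local mirror of the kernel's struct cachestat_range. */
struct pc_range
{
    uint64_t off;   /* start offset in bytes */
    uint64_t len;   /* length in bytes; 0 means "until end of file" */
};

/* Hypothetical local mirror of the kernel's struct cachestat. */
struct pc_stats
{
    uint64_t nr_cache;
    uint64_t nr_dirty;
    uint64_t nr_writeback;
    uint64_t nr_evicted;
    uint64_t nr_recently_evicted;
};

/*
 * Ask the kernel for page cache statistics covering [off, off + len)
 * of the file behind fd. Returns 0 on success, -1 with errno set
 * otherwise (ENOSYS on kernels older than 6.5).
 */
static int
query_cachestat(int fd, uint64_t off, uint64_t len, struct pc_stats *out)
{
    struct pc_range range = { off, len };

    assert(out != NULL);
    return (int) syscall(__NR_cachestat, fd, &range, out, 0U);
}

/* Convenience wrapper: open a path read-only and query the whole file. */
static int
query_cachestat_path(const char *path, struct pc_stats *out)
{
    int fd = open(path, O_RDONLY);
    int rc;
    int saved_errno;

    if (fd < 0)
        return -1;

    rc = query_cachestat(fd, 0, 0, out);
    saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

A caller would check the return value first and treat ENOSYS as "not supported on this kernel" before looking at any of the counters. Note that the call takes a descriptor and a byte range within that one file, so each query covers at most a single relation segment.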

I'm not sure I even agree with the idea that we should stop prefetching
when there is memory pressure. IMHO it's perfectly fine to keep
prefetching stuff even if it triggers eviction of unnecessary pages from
page cache. That's kinda why the eviction exists.


> It's working but I have absolutely not checked the performance impact of
> my additions.
>

Well ... I'd argue at least some basic evaluation of performance is a
rather important / expected part of a proposal for a patch that aims to
improve a performance-focused feature. It's impossible to have any sort
of discussion about such a patch without that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



