Re: [PATCH] Prefetch index pages for B-Tree index scans
From | Claudio Freire |
---|---|
Subject | Re: [PATCH] Prefetch index pages for B-Tree index scans |
Date | |
Msg-id | CAGTBQpbu2M=-M7NUr6DWr0K8gUVmXVhwKohB-Cnj7kYS1AhH4A@mail.gmail.com |
In response to | Re: [PATCH] Prefetch index pages for B-Tree index scans (John Lumby <johnlumby@hotmail.com>) |
Responses | Re: [PATCH] Prefetch index pages for B-Tree index scans |
List | pgsql-hackers |
On Thu, Nov 1, 2012 at 1:37 PM, John Lumby <johnlumby@hotmail.com> wrote:
>
> Claudio wrote :
>>
>> Oops - forgot to effectively attach the patch.
>>
>
> I've read through your patch and the earlier posts by you and Cédric.
>
> This is very interesting. You chose to prefetch index btree (key-ptr) pages
> whereas I chose to prefetch the data pages pointed to by the key-ptr pages.
> Never mind why -- I think they should work very well together - as both have
> been demonstrated to produce improvements. I will see if I can combine them,
> git permitting (as of course their changed file lists overlap).

Check the latest patch, it contains heap page prefetching too.

> I was surprised by this design decision :
> /* start prefetch on next page, but not if we're reading sequentially already, as it's counterproductive in those cases*/
> Is it really? Are you assuming the it's redundant with posix_fadvise for this case?
> I think possibly when async_io is also in use by the postgresql prefetcher,
> this decision could change.

async_io may indeed make that logic obsolete. The trouble there is not
redundancy with posix_fadvise, though, but the fact that the kernel stops
doing read-ahead when a call to posix_fadvise comes in. I noticed the
performance hit and checked the kernel's code. It effectively changes the
prediction mode from sequential to fadvise, negating the kernel's (assumed)
prefetch logic.

> However I think in some environments the async-io has significant benefits over
> posix-fadvise, especially (of course!) where access is very non-sequential,
> but even also for sequential if there are many concurrent conflicting sets of sequential
> command streams from different backends
> (always assuming the RAID can manage them concurrently).

I've mused about the possibility of batching async_io requests and using
the scatter/gather API instead of sending tons of requests to the kernel.
I think doing so would enable a zero-copy path that could very possibly imply big speed improvements when memory bandwidth is the bottleneck.
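
To make the posix_fadvise point above concrete, here is a minimal sketch of
the "don't prefetch when already sequential" rule. It is not the code from
the patch: the names is_sequential, maybe_prefetch and BLOCK_SIZE are
hypothetical, and the 8192-byte block size is an assumption (PostgreSQL's
default). The idea is simply to leave sequential access to the kernel's own
read-ahead and to issue a POSIX_FADV_WILLNEED hint only when the next page
is somewhere else.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes posix_fadvise() */
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Assumption: 8 kB blocks, as in a default PostgreSQL build. */
#define BLOCK_SIZE 8192

/* Hypothetical helper: true when next_block directly follows cur_block. */
static bool
is_sequential(int64_t cur_block, int64_t next_block)
{
    return next_block == cur_block + 1;
}

/*
 * Hypothetical helper: hint the kernel about next_block unless the scan is
 * already sequential.
 *
 * Returns  0 if the hint was skipped (sequential access),
 *          1 if posix_fadvise() accepted the hint,
 *         -1 if posix_fadvise() reported an error.
 */
static int
maybe_prefetch(int fd, int64_t cur_block, int64_t next_block)
{
    int rc;

    /* Sequential: stay out of the way of the kernel's read-ahead. */
    if (is_sequential(cur_block, next_block))
        return 0;

    /* Non-sequential: ask the kernel to start reading the block now. */
    rc = posix_fadvise(fd,
                       (off_t) next_block * BLOCK_SIZE,
                       BLOCK_SIZE,
                       POSIX_FADV_WILLNEED);

    /* posix_fadvise() returns an error number directly, not via errno. */
    return rc == 0 ? 1 : -1;
}
```

A scan would call maybe_prefetch(fd, current, next) once it knows which page
comes next. With async_io in the picture the sequential check might no
longer be wanted, as discussed above.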