Re: ANALYZE sampling is too good
От | Heikki Linnakangas |
---|---|
Тема | Re: ANALYZE sampling is too good |
Дата | |
Msg-id | 52A64EDE.60105@vmware.com обсуждение исходный текст |
Ответ на | Re: ANALYZE sampling is too good (Claudio Freire <klaussfreire@gmail.com>) |
Ответы |
Re: ANALYZE sampling is too good
Re: ANALYZE sampling is too good |
Список | pgsql-hackers |
On 12/09/2013 11:56 PM, Claudio Freire wrote: > On Mon, Dec 9, 2013 at 6:47 PM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> On 12/09/2013 11:35 PM, Jim Nasby wrote: >>> >>> On 12/8/13 1:49 PM, Heikki Linnakangas wrote: >>>> >>>> On 12/08/2013 08:14 PM, Greg Stark wrote: >>>>> >>>>> The whole accounts table is 1.2GB and contains 10 million rows. As >>>>> expected with rows_per_block set to 1 it reads 240MB of that >>>>> containing nearly 2 million rows (and takes nearly 20s -- doing a full >>>>> table scan for select count(*) only takes about 5s): >>>> >>>> One simple thing we could do, without or in addition to changing the >>>> algorithm, is to issue posix_fadvise() calls for the blocks we're >>>> going to read. It should at least be possible to match the speed of a >>>> plain sequential scan that way. >>> >>> Hrm... maybe it wouldn't be very hard to use async IO here either? I'm >>> thinking it wouldn't be very hard to do the stage 2 work in the callback >>> routine... >> >> Yeah, other than the fact we have no infrastructure to do asynchronous I/O >> anywhere in the backend. If we had that, then we could easily use it here. I >> doubt it would be much better than posix_fadvising the blocks, though. > > Without patches to the kernel, it is much better. > > posix_fadvise interferes with read-ahead, so posix_fadvise on, say, > bitmap heap scans (or similarly sorted analyze block samples) run at 1 > IO / block, ie horrible, whereas aio can do read coalescence and > read-ahead when the kernel thinks it'll be profitable, significantly > increasing IOPS. I've seen everything from a 2x to 10x difference. How did you test that, given that we don't actually have an asynchronous I/O implementation? I don't recall any recent patches floating around either to do that. When Greg Stark investigated this back in 2007-2008 and implemented posix_fadvise() for bitmap heap scans, posix_fadvise certainly gave a significant speedup on the test data he used. What kind of a data distribution gives a slowdown like that? I took a stab at using posix_fadvise() in ANALYZE. It turned out to be very easy, patch attached. Your mileage may vary, but I'm seeing a nice gain from this on my laptop. Taking a 30000 page sample of a table with 717717 pages (ie. slightly larger than RAM), ANALYZE takes about 6 seconds without the patch, and less than a second with the patch, with effective_io_concurrency=10. If anyone with a good test data set loaded would like to test this and post some numbers, that would be great. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: