Re: ANALYZE sampling is too good
From | Martijn van Oosterhout |
---|---|
Subject | Re: ANALYZE sampling is too good |
Date | |
Msg-id | 20131211223904.GA13377@svana.org |
In response to | Re: ANALYZE sampling is too good (Gavin Flower <GavinFlower@archidevsys.co.nz>) |
List | pgsql-hackers |
On Thu, Dec 12, 2013 at 07:22:59AM +1300, Gavin Flower wrote:
> Surely we want to sample a 'constant fraction' (obviously, in
> practice you have to sample an integral number of rows in a page!)
> of rows per page? The simplest way, as Tom suggests, is to use all
> the rows in a page.
>
> However, if you wanted the same number of rows from a greater number
> of pages, you could (for example) select a quarter of the rows from
> each page. In which case, when this is a fractional number: take
> the integral number of rows, plus one extra row with a probability
> equal to the fraction (here 0.25).

In this discussion we've mostly used block = 1 PostgreSQL block of 8k.
But when reading from a disk, once you've read one block you can
read the following ones practically for free.

So I wonder if you could make your sampling always read 16 consecutive
blocks, but then use 25-50% of the tuples. That way you get many more
tuples for the same number of disk seeks.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer
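
To make the two ideas in this message concrete, here is a rough Python sketch. It is only a simulation under stated assumptions, not PostgreSQL's ANALYZE code: the function names (rows_to_take, sample_chunks), the fixed rows_per_block, and the alignment of chunks to multiples of 16 blocks are all hypothetical choices made for illustration. It combines Gavin's rule (integral part of the per-page quota, plus one extra row with probability equal to the fractional part) with reading runs of 16 consecutive blocks.

```python
import random


def rows_to_take(rows_on_page, fraction, rng):
    """Rows to sample from one page: the integral part of
    rows_on_page * fraction, plus one extra row with probability
    equal to the fractional part."""
    expected = rows_on_page * fraction
    whole = int(expected)
    extra = 1 if rng.random() < (expected - whole) else 0
    return min(rows_on_page, whole + extra)


def sample_chunks(total_blocks, n_chunks, rows_per_block, fraction,
                  chunk_blocks=16, seed=None):
    """Pick n_chunks random runs of chunk_blocks consecutive blocks
    (one seek per run), then keep only a fraction of the rows in each block.

    Assumptions: every block holds rows_per_block rows, and runs are
    aligned to multiples of chunk_blocks so they never overlap.
    Returns a list of (block_number, row_number) pairs.
    """
    rng = random.Random(seed)
    n_possible = (total_blocks + chunk_blocks - 1) // chunk_blocks
    chosen = sorted(rng.sample(range(n_possible), min(n_chunks, n_possible)))

    sample = []
    for chunk in chosen:
        first = chunk * chunk_blocks
        last = min(first + chunk_blocks, total_blocks)
        for block in range(first, last):  # sequential read after one seek
            k = rows_to_take(rows_per_block, fraction, rng)
            for row in sorted(rng.sample(range(rows_per_block), k)):
                sample.append((block, row))
    return sample


if __name__ == "__main__":
    picked = sample_chunks(total_blocks=1024, n_chunks=4,
                           rows_per_block=100, fraction=0.25, seed=1)
    seeks = len({block // 16 for block, _ in picked})
    print(f"{len(picked)} rows sampled using {seeks} seeks")
```

With these made-up numbers, 4 seeks cover 64 blocks and yield 25 rows from each, where sampling whole single blocks would need more seeks for the same row count. Whether rows taken from neighbouring blocks are independent enough for good statistics is a separate question that this sketch does not address.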