Re: Improving N-Distinct estimation by ANALYZE
From | Greg Stark |
---|---|
Subject | Re: Improving N-Distinct estimation by ANALYZE |
Date | |
Msg-id | 87ace429ua.fsf@stark.xeocode.com |
In response to | Re: Improving N-Distinct estimation by ANALYZE (Greg Stark <gsstark@mit.edu>) |
Responses | Re: Improving N-Distinct estimation by ANALYZE |
List | pgsql-hackers |
Greg Stark <gsstark@MIT.EDU> writes:

> > > These numbers don't make much sense to me. It seems like 5% is about as slow
> > > as reading the whole file which is even worse than I expected. I thought I was
> > > being a bit pessimistic to think reading 5% would be as slow as reading 20% of
> > > the table.
>
> I have a theory. My test program, like Postgres, is reading in 8k chunks.
> Perhaps that's fooling Linux into thinking it's a sequential read and reading
> in 32k chunks internally. That would effectively make a 25% scan a full table
> scan. And a 5% scan would be a 20% scan which is about where I would have
> expected the breakeven point to be.

Well my theory was sort of half right. It has nothing to do with fooling Linux
into thinking it's a sequential read. Apparently this filesystem was created
with 32k blocks. I don't remember if that was intentional or if ext2/3 did it
automatically based on the size of the filesystem.

So it doesn't have wide-ranging implications for Postgres's default 8k block
size. But it is a good lesson about the importance of not using a larger
filesystem block than Postgres's block size. The net effect is that if the
filesystem block is N*8k then your random_page_cost goes up by a factor of N.
That could be devastating for OLTP performance.

--
greg
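As a rough back-of-the-envelope check of the "5% scan would be a 20% scan" figure quoted above: if each 32k filesystem block holds four 8k Postgres pages, and pages are sampled roughly independently (an assumption for this sketch, not something stated in the mail), then a sample fraction f of the pages touches about 1 - (1 - f)^4 of the underlying filesystem blocks. A minimal Python sketch of that arithmetic:

    # Back-of-the-envelope arithmetic for the block-size amplification effect
    # described above: each 32k filesystem block holds N = 4 Postgres 8k pages,
    # so sampling a fraction f of the pages touches roughly 1 - (1 - f)**N of
    # the filesystem blocks.  Independent sampling of pages is an assumption
    # made for this sketch; it is only an approximation of what ANALYZE does.

    def fs_blocks_touched(sample_fraction: float, pages_per_fs_block: int) -> float:
        """Expected fraction of filesystem blocks containing at least one sampled page."""
        return 1.0 - (1.0 - sample_fraction) ** pages_per_fs_block

    if __name__ == "__main__":
        for f in (0.05, 0.10, 0.25):
            touched = fs_blocks_touched(f, pages_per_fs_block=4)
            print(f"sampling {f:.0%} of 8k pages -> ~{touched:.0%} of 32k filesystem blocks read")

For a 5% sample this comes out to roughly 19% of the filesystem blocks, in line with the estimate quoted above; in the limit where every wanted page is fetched by a separate random read, each 8k request transfers a full 32k block, which is the factor-of-N random_page_cost inflation described at the end of the mail.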