Optimizing box_penalty (Re: WIP: Fast GiST index build)
From | Heikki Linnakangas |
---|---|
Subject | Optimizing box_penalty (Re: WIP: Fast GiST index build) |
Date | |
Msg-id | 4E088690.5080706@enterprisedb.com |
In reply to | Re: WIP: Fast GiST index build (Alexander Korotkov <aekorotkov@gmail.com>) |
List | pgsql-hackers |
On 27.06.2011 13:45, Alexander Korotkov wrote:
> I've added information about testing on some real-life dataset to the wiki page.
> This dataset has a peculiarity: the data is ordered inside it. In this case the
> tradeoff was the inverse of what was expected from the "fast build"
> algorithm. The index build takes longer, but index quality is significantly better.
> I think the high speed of the regular index build is because sequential inserts go
> into nearby parts of the tree. That's why the number of actual page reads and writes is
> low. The difference in tree quality I can't convincingly explain now.
> I've also made tests with shuffled data from this dataset. In this case the
> results were similar to randomly generated data.

Hmm, I assume the CPU overhead is coming from the penalty calls in this
case too. There's some low-hanging optimization fruit in
gist_box_penalty(), see attached patch. I tested this with:

    CREATE TABLE points (a point);
    CREATE INDEX i_points ON points using gist (a);
    INSERT INTO points SELECT point(random(), random()) FROM generate_series(1,1000000);

and running "checkpoint; reindex index i_points;" a few times with and
without the patch. The patch reduced the runtime from about 17.5 s to
15.5 s. oprofile confirms that the time spent in gist_box_penalty() and
rt_box_union() is reduced significantly.

This is all without the fast GiST index build patch, so this is
worthwhile on its own. If the penalty function is called more often, then this
becomes even more significant.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
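
For readers without the attachment, here is a rough sketch of the kind of saving that is available in a box penalty function. It is not the attached patch. It assumes the penalty is the area of the union of the existing key and the new entry, minus the area of the existing key, and it shows one way to make that cheap: build the union in a stack variable instead of going through a separate union function that allocates its result. The Point and Box types and the names box_area() and box_penalty() are simplified stand-ins invented for this illustration, not PostgreSQL's own definitions.

```c
/* Simplified stand-ins for illustration only; not PostgreSQL's definitions. */
typedef struct { double x, y; } Point;
typedef struct { Point high, low; } Box;   /* high = upper right, low = lower left */

static double dmax(double a, double b) { return a > b ? a : b; }
static double dmin(double a, double b) { return a < b ? a : b; }

/* Area of a box; degenerate (empty or inverted) boxes count as zero. */
static double box_area(const Box *b)
{
    if (b->high.x <= b->low.x || b->high.y <= b->low.y)
        return 0.0;
    return (b->high.x - b->low.x) * (b->high.y - b->low.y);
}

/*
 * Penalty for adding "add" to a subtree whose bounding box is "orig":
 * how much the bounding box has to grow.  The union is computed in a
 * local variable, so no memory is allocated per call.
 */
static double box_penalty(const Box *orig, const Box *add)
{
    Box u;

    u.high.x = dmax(orig->high.x, add->high.x);
    u.high.y = dmax(orig->high.y, add->high.y);
    u.low.x = dmin(orig->low.x, add->low.x);
    u.low.y = dmin(orig->low.y, add->low.y);

    return box_area(&u) - box_area(orig);
}
```

For example, with orig covering (0,0)-(1,1) and add covering (1.5,1.5)-(2,2), the union is (0,0)-(2,2), so the penalty is 4 - 1 = 3. Since the penalty function is called once per candidate downlink on every insert, even a small per-call saving of this sort can show up in the total build time.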