Re: WIP: Fast GiST index build
From | Heikki Linnakangas |
---|---|
Subject | Re: WIP: Fast GiST index build |
Date | |
Msg-id | 4E5F5266.4010602@enterprisedb.com |
In response to | Re: WIP: Fast GiST index build (Alexander Korotkov <aekorotkov@gmail.com>) |
List | pgsql-hackers |
On 01.09.2011 12:23, Alexander Korotkov wrote:
> On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>
>> So I changed the test script to generate the table as:
>>
>> CREATE TABLE points AS SELECT random() as x, random() as y FROM
>> generate_series(1, $NROWS);
>>
>> The unordered results are in:
>>
>>           testname           |   nrows   |    duration     | accesses
>> -----------------------------+-----------+-----------------+----------
>>  points unordered buffered   | 250000000 | 05:56:58.575789 |  2241050
>>  points unordered auto       | 250000000 | 05:34:12.187479 |  2246420
>>  points unordered unbuffered | 250000000 | 04:38:48.663952 |  2244228
>>
>> Although the buffered build doesn't lose as badly as it did with more
>> overlap, it still doesn't look good :-(. Any ideas?
>
> But it's still a lot of overlap. It's about 220 accesses per small area
> request. It's about 10 - 20 times greater than should be without overlaps.

Hmm, those "accesses" numbers are actually quite bogus for this test. I
changed the creation of the table as you suggested, so that all x and y
values are in the range 0.0 - 1.0, but I didn't change the loop to
calculate those accesses, so it still queried for boxes in the range
0 - 100000.

That makes me wonder, why does it need 220 accesses on average to
satisfy queries most of which lie completely outside the range of actual
values in the index? I would expect such queries to just look at the
root node, conclude that there can't be any matching tuples, and return
immediately.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
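
The expectation in the last paragraph can be illustrated with a toy model of a bounding-box tree search that counts visited nodes. This is only a sketch under stated assumptions, not PostgreSQL's GiST code: the `Node`, `overlaps`, `search`, `leaf`, and `bbox` names, the two-level tree, and the sample points are all hypothetical. It shows that a query box far outside the indexed range 0.0 - 1.0 should touch only the root, provided the root's bounding boxes stay within the data range.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A box is (xmin, ymin, xmax, ymax); a point is a degenerate box.
Box = Tuple[float, float, float, float]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes intersect (edges included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    # Each entry pairs a bounding box with a child node,
    # or with None when the entry is a leaf tuple (an indexed point).
    entries: List[Tuple[Box, Optional["Node"]]] = field(default_factory=list)


def search(node: Node, query: Box) -> Tuple[List[Box], int]:
    """Return (matching leaf boxes, number of nodes accessed)."""
    accesses = 1  # visiting this node counts as one access
    matches: List[Box] = []
    for box, child in node.entries:
        if not overlaps(box, query):
            continue  # this entry cannot contain a match, skip it
        if child is None:
            matches.append(box)
        else:
            sub_matches, sub_accesses = search(child, query)
            matches.extend(sub_matches)
            accesses += sub_accesses
    return matches, accesses


def leaf(points: List[Tuple[float, float]]) -> Node:
    """Build a leaf node from (x, y) points."""
    return Node([((x, y, x, y), None) for x, y in points])


def bbox(node: Node) -> Box:
    """Smallest box covering every entry of the node."""
    boxes = [b for b, _ in node.entries]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


# Hypothetical data: all points lie in the unit square, as in the test table.
leaves = [
    leaf([(0.1, 0.1), (0.4, 0.3)]),
    leaf([(0.6, 0.2), (0.9, 0.4)]),
    leaf([(0.2, 0.7), (0.3, 0.9)]),
    leaf([(0.7, 0.6), (0.8, 0.95)]),
]
root = Node([(bbox(l), l) for l in leaves])

inside = search(root, (0.0, 0.0, 0.5, 0.5))
outside = search(root, (50000.0, 50000.0, 50010.0, 50010.0))
print("inside the data range: ", len(inside[0]), "matches,", inside[1], "accesses")
print("outside the data range:", len(outside[0]), "matches,", outside[1], "accesses")
```

In this model the out-of-range query stops at the root. An average of about 220 accesses for such queries would therefore mean that the real index, or the way accesses were counted, differs from this simple picture.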