Re: Hash index todo list item
| From | Heikki Linnakangas |
|---|---|
| Subject | Re: Hash index todo list item |
| Date | |
| Msg-id | 4730492A.7090609@enterprisedb.com |
| In reply to | Re: Hash index todo list item (Shreya Bhargava <shreya_bhargav@yahoo.com>) |
| List | pgsql-hackers |
Shreya Bhargava wrote:
> 1. Populate the table with 80 million tuples.
> 2. Create HASH index on the table.
> 3. Clear both the Linux cache & psql buffers.
>    (exiting psql and restarting it cleared the psql buffers;
>    to clear the Linux cache, we used the drop_cache command)
> 4. Start psql.
> 5. Select on an integer in the range of values in the table.
>    (all test numbers were big ones, like 98934599)
> 6. Record the time.
> 7. Exit psql.
> 8. Drop caches (as described above).
> 9. Repeat 4-8 for different numbers.
> 10. Drop the HASH index.
> 11. Create a B-tree index and repeat 3-9.

It seems you're mostly measuring the overhead of starting a backend, populating the relcache, etc. Restarting psql doesn't clear the Postgres shared buffer cache. Or did you mean that you restarted postgres?

Anyway, I don't think it's interesting to test with cleared caches. Surely the metapage and the first 1-2 levels of the b-tree would stay cached all the time in real life.

> From the results obtained, the average of all the hash probes is 141.8 ms, the average for btree is 168.5 ms, a difference of about 27. The standard deviations are about 23, so this is a statistically significant difference.

I don't trust those numbers much, but in any case I don't think that edge is big enough to justify the existence of hash indexes.

If you're looking for a use case where a hash index is faster, I'd suggest using a data type with an expensive comparison function, like long multi-byte strings in UTF-8 encoding.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
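As a side note on the significance claim above: with per-sample standard deviations around 23 ms, whether a ~27 ms difference in means is statistically significant depends on the number of runs, which the thread does not state. A minimal sketch of that check, under the hypothetical assumption of 10 runs per index:

```python
import math

# Figures quoted in the thread: mean lookup times (ms) and a rough
# per-sample standard deviation. The sample size is NOT stated in the
# thread; n = 10 runs per index is a hypothetical assumption.
mean_hash, mean_btree = 141.8, 168.5
sd = 23.0
n = 10

diff = mean_btree - mean_hash                 # ~26.7 ms
se = math.sqrt(sd**2 / n + sd**2 / n)         # standard error of the difference
t = diff / se

# Approximate two-tailed 5% critical value for df = 2n - 2 = 18 is ~2.10.
significant = t > 2.10
print(f"t = {t:.2f}, significant at 5%: {significant}")
```

Under that assumed n the difference clears the 5% threshold, but with fewer runs it would not, which is one reason to be cautious about the numbers.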
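To illustrate why an expensive comparison function favors hashing, here is a hypothetical sketch (not PostgreSQL internals) that counts key comparisons: a binary search over a sorted array stands in for a b-tree descent with O(log n) comparisons, while a hash probe hashes the key once and then typically does a single equality check.

```python
import bisect

# Hypothetical key type that counts how often it is compared.
# For a type where each comparison is expensive (e.g. long multi-byte
# UTF-8 strings under a collation), every saved comparison matters.
class CountingKey:
    comparisons = 0

    def __init__(self, s):
        self.s = s

    def __lt__(self, other):
        CountingKey.comparisons += 1
        return self.s < other.s

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return self.s == other.s

    def __hash__(self):
        return hash(self.s)

keys = sorted(CountingKey(f"key-{i:08d}") for i in range(100_000))
probe = CountingKey("key-00060000")

CountingKey.comparisons = 0
bisect.bisect_left(keys, probe)
print("binary search comparisons:", CountingKey.comparisons)  # ~17 for n = 100,000

table = set(keys)
CountingKey.comparisons = 0
probe in table
print("hash probe comparisons:", CountingKey.comparisons)     # typically 1
```

The log-n comparisons of the ordered structure versus the single comparison of the hash probe is what makes expensive comparison functions (long UTF-8 strings, say) the more promising use case for hash indexes.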