Re: Hash index todo list item
От | Mark Mielke |
---|---|
Тема | Re: Hash index todo list item |
Дата | |
Msg-id | 46E32897.8080106@mark.mielke.cc обсуждение исходный текст |
Ответ на | Re: Hash index todo list item (Kenneth Marshall <ktm@rice.edu>) |
Ответы |
Re: Hash index todo list item
Re: Hash index todo list item |
Список | pgsql-hackers |
Kenneth Marshall wrote:<br /><blockquote cite="mid:20070908214900.GB5679@it.is.rice.edu" type="cite"><pre wrap="">The valueof storing the actual value, possibly as an option, is that the key check can be done in the index without requiring a heap lookup to check the actual value which would be a benefit for a unique index. I agree that supporting variable length data would complicate the index and reduce performance. I am not willing to assume that ~1 entry per hash bucket is necessarily what we want, at least without some testing.</pre></blockquote> I think that if the case of >1 entry per hash becomescommon enough to be significant, and the key is stored in the hash, that a btree will perform equal or better, andthere is no point in pursuing such a hash index model. This is where we are today.<br /><br /><blockquote cite="mid:20070908214900.GB5679@it.is.rice.edu"type="cite"><pre wrap="">It seems reasonable that with the performance differences between L1/L2/L3 cache, main memory, and the disk subsystem a higher load factor would result in better overall system performance by reducing cache line misses and improving hardware pre-fetch efficiency.</pre></blockquote> If the key is stored, all of these factors likely favor the btree formatover the hash format. Again, this is where we are today.<br /><br /><blockquote cite="mid:20070908214900.GB5679@it.is.rice.edu"type="cite"><pre wrap="">Along with the hypothetical performance wins, the hash index space efficiency would be improved by a similar factor. Obviously, all of these ideas would need to be tested in various workload environments. In the large index arena, 10^6 to 10^9 keys and more, space efficiency will help keep the index manageable in todays system memories. </pre></blockquote> Space efficiency is provided by not storing the key, nor the header data required(length prefix?).<br /><blockquote cite="mid:20070908214900.GB5679@it.is.rice.edu" type="cite"><pre wrap="">Pleasekeep the ideas and comments coming. I am certain that a synthesis of them will provide an implementation with the performance characteristics that we are seeking. </pre></blockquote> The subject interests me. :-)<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature"cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
В списке pgsql-hackers по дате отправления: