Re: WIP: dynahash replacement for buffer table
From: Ryan Johnson
Subject: Re: WIP: dynahash replacement for buffer table
Date:
Msg-id: 543EB0B3.7080608@cs.utoronto.ca
In reply to: Re: WIP: dynahash replacement for buffer table (Ants Aasma <ants@cybertec.at>)
List: pgsql-hackers
On 15/10/2014 10:32 AM, Ants Aasma wrote:
> On Tue, Oct 14, 2014 at 6:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> With regard for using a hash table for the buffer mapping lock I'm
>>> doubtful that any form of separate chaining is the right one. We
>>> currently have a quite noticeable problem with the number of cache
>>> misses in the buffer mapping hash (and also the heavyweight lock mgr) -
>>> if we stay with hashes that's only going to be fundamentally lower than
>>> dynahash if we change the type of hashing. I've had good, *preliminary*,
>>> results using an open addressing + linear probing approach.
>> I'm very skeptical of open addressing + linear probing. Such hash
>> tables tend to degrade over time, because you have to leave tombstones
>> behind to ensure that future scans advance far enough. There's really
>> no way to recover without rebuilding the whole hash table, and
>> eventually it degrades to linear search. If we're spending too much
>> time walking hash chains, I think the solution is to increase the
>> number of buckets so that the chains get shorter.
> What about cuckoo hashing? There was a recent paper on how to do fine
> grained locking with cuckoo hashes. [1]
>
> I'm imagining a bucketized cuckoo hash with 5 item buckets (5-way
> associativity). This allows us to fit the bucket onto 2 regular sized
> cache lines and have 8 bytes left over. Buckets would be protected by
> seqlocks stored in the extra space. On the read side we would only
> need 2 read barriers (basically free on x86), and we are guaranteed to
> have an answer by fetching two pairs of cache lines. We can trade
> memory bandwidth for latency by issuing prefetches (once we add
> primitives for this). Alternatively we can trade insert speed for
> lookup speed by using asymmetrically sized tables.
>
> [1] https://www.cs.princeton.edu/~mfreed/docs/cuckoo-eurosys14.pdf

Actually, I'd go with second-chance hashing [2], same number of hash
functions but it's more stable (no infinite loops, for example). Most
probably the techniques from [1] would apply equally well.

[2] http://www.eecs.harvard.edu/~michaelm/postscripts/infocom_hardware_submit.pdf

Ryan
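P.S. To make the tombstone objection concrete, here is a minimal,
single-threaded sketch of open addressing + linear probing. It is
purely illustrative (identity hash, reserved key values, made-up
names), not PostgreSQL code:

#include <stdint.h>
#include <stdbool.h>

#define TABLE_SIZE 1024   /* power of two, for cheap masking */
#define SLOT_EMPTY 0      /* reserved: never a real key */
#define SLOT_TOMB  1      /* reserved: deleted slot */

static uint32_t table[TABLE_SIZE];

static bool
lp_insert(uint32_t key)
{
    uint32_t i;

    /* (a real implementation would first check for a duplicate) */
    for (i = 0; i < TABLE_SIZE; i++)
    {
        uint32_t slot = (key + i) & (TABLE_SIZE - 1);

        if (table[slot] == SLOT_EMPTY || table[slot] == SLOT_TOMB)
        {
            table[slot] = key;
            return true;
        }
    }
    return false;   /* table full */
}

static bool
lp_lookup(uint32_t key)
{
    uint32_t i;

    for (i = 0; i < TABLE_SIZE; i++)
    {
        uint32_t slot = (key + i) & (TABLE_SIZE - 1);

        if (table[slot] == key)
            return true;
        if (table[slot] == SLOT_EMPTY)
            return false;   /* a truly empty slot ends the probe */
        /* a tombstone does not: we must keep scanning */
    }
    return false;
}

static void
lp_delete(uint32_t key)
{
    uint32_t i;

    for (i = 0; i < TABLE_SIZE; i++)
    {
        uint32_t slot = (key + i) & (TABLE_SIZE - 1);

        if (table[slot] == key)
        {
            /*
             * Cannot simply mark the slot empty: that would cut the
             * probe chains of keys inserted after this one.  The
             * tombstone stays until the whole table is rebuilt.
             */
            table[slot] = SLOT_TOMB;
            return;
        }
        if (table[slot] == SLOT_EMPTY)
            return;   /* key was never present */
    }
}

Under insert/delete churn the tombstone population only grows, so probe
lengths keep growing even at a constant load factor, which is exactly
the degradation described above.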
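Likewise, a rough sketch of the read side of the bucketized cuckoo
scheme, in PostgreSQL terms. pg_read_barrier() and BufferTag are the
real primitives (storage/barrier.h and buf_internals.h); the struct
and function names are hypothetical. Five 24-byte (BufferTag, buffer
id) entries fill 120 of the bucket's 128 bytes, which is where the 8
spare bytes for the seqlock come from:

#include "postgres.h"
#include "storage/barrier.h"
#include "storage/buf_internals.h"

typedef struct CuckooEntry
{
    BufferTag   tag;        /* 20 bytes */
    int         buf_id;     /* 4 bytes; -1 means the slot is empty */
} CuckooEntry;              /* 24 bytes */

typedef struct CuckooBucket
{
    uint64      seq;        /* seqlock: odd while a writer is active */
    CuckooEntry entry[5];   /* 5-way associativity, 120 bytes */
} CuckooBucket;             /* 128 bytes = two 64-byte cache lines */

/*
 * Optimistic lookup within one bucket: two read barriers per attempt,
 * one ordering the seqlock read before the entry reads, one ordering
 * the entry reads before the re-check.  Returns the buffer id, or -1
 * if the tag is not here.  The caller probes one bucket per hash
 * function, hence "two pairs of cache lines" per lookup.
 */
static int
cuckoo_bucket_lookup(CuckooBucket *bucket, BufferTag tag)
{
    for (;;)
    {
        uint64      seq = bucket->seq;
        int         result = -1;
        int         i;

        if (seq & 1)
            continue;       /* writer in progress; retry */
        pg_read_barrier();  /* barrier #1 */

        for (i = 0; i < 5; i++)
        {
            if (BUFFERTAGS_EQUAL(bucket->entry[i].tag, tag))
            {
                result = bucket->entry[i].buf_id;
                break;
            }
        }

        pg_read_barrier();  /* barrier #2 */
        if (bucket->seq == seq)
            return result;  /* snapshot was consistent */
    }
}

A writer would bump seq to an odd value, modify the entries with a
write barrier on either side, then bump seq back to even; any reader
that raced it simply retries.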