Re: [HACKERS] A design for amcheck heapam verification
От | Peter Geoghegan |
---|---|
Тема | Re: [HACKERS] A design for amcheck heapam verification |
Дата | |
Msg-id | CAH2-WznhneJxZdkbGbqaxeb9i0HRMCiaBAL=xvE6wkCiVpHeGw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] A design for amcheck heapam verification (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: [HACKERS] A design for amcheck heapam verification
|
Список | pgsql-hackers |
On Mon, Feb 5, 2018 at 12:55 PM, Peter Geoghegan <pg@bowt.ie> wrote: > Anyway, parallel CREATE INDEX added a new "scan" argument to > IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum, > an additional NULL argument should be passed by amcheck. However, I > have a better idea. > > ISTM that verify_nbtree.c should manage the heap scan itself, it the > style of parallel CREATE INDEX. It can acquire its own MVCC snapshot > for bt_index_check() (which pretends to be a CREATE INDEX > CONCURRENTLY). There can be an MVCC snapshot acquired per index > verified, a snapshot that is under the direct control of amcheck. The > snapshot would be acquired at the start of verification on an index > (when "heapallindex = true"), before the verification of the index > structure even begins, and released at the very end of verification. Attached patch fixes the parallel index build bitrot in this way. This is version 6 of the patch. This approach resulted in a nice reduction in complexity: bt_index_check() and bt_index_parent_check() heapallindexed verification operations both work in exactly the same way now, except that bt_index_check() imitates a CREATE INDEX CONCURRENTLY (to match the heavyweight relation locks acquired). This doesn't really need to be explained as a special case anymore; bt_index_parent_check() is like an ordinary CREATE INDEX, without any additional "TransactionXmin heap tuple xmin recheck" complication. A further benefit is that this makes running bt_index_check() checks against many indexes more thorough, and easier to reason about. Users won't have to worry about TransactionXmin becoming very stale when many indexes are verified within a single command. I made the following additional, unrelated changes based on various feedback: * Faster modulo operations. Andrey Borodin suggested that I make k_hashes() do fewer modulo operations. While I don't want to change the algorithm to make this happen, the overhead has been reduced. Modulo operations are now performed through bitwise AND operations, which is possible only because the bitset size is always a power of two. Apparently this is a fairly common optimization for Bloom filters that use (enhanced) double-hashing; we might as well do it this way. I've really just transcribed the enhanced double hashing pseudo-code from the Georgia Tech/Dillinger & Manolios paper into C code, so no real change there; bloomfilter.c's k_hashes() is still closely based on "5.2 Enhanced Double Hashing" from that same paper. Experience suggests that we ought to be very conservative about developing novel hashing techniques. Paranoid, even. * New reference to the modulo bias effect. Michael Paquier wondered why the Bloom filter was always a power-of-two, which this addresses. (Of course, the "modulo bitwise AND" optimization I just mentioned is another reason to limit ourselves to power-of-two bitset sizes, albeit a new one.) * Removed sdbmhash(). Michael also wanted to know more about sdbmhash(), due to some general concern about its quality. I realized that it is best to avoid adding a new general-purpose hash function, whose sole purpose is to be different to hash_any(), when I could instead use hash_uint32_extended() to get two 32-bit values all at once. Robert suggested this approach at one point, actually, but for some reason I didn't follow up until now. -- Peter Geoghegan
Вложения
В списке pgsql-hackers по дате отправления: