Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
От | Peter Geoghegan |
---|---|
Тема | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Дата | |
Msg-id | CAH2-Wzkj9AgE1YJ=dXKajwiOM7_AqWpDs=9bDwowoQ=ycFLPRA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>) |
Список | pgsql-hackers |
On Mon, Sep 16, 2019 at 11:58 AM Peter Geoghegan <pg@bowt.ie> wrote: > I think that the problem here is that you didn't copy this old code > from _bt_split() over to _bt_dedup_one_page(): > > /* > * Copy the original page's LSN into leftpage, which will become the > * updated version of the page. We need this because XLogInsert will > * examine the LSN and possibly dump it in a page image. > */ > PageSetLSN(leftpage, PageGetLSN(origpage)); > isleaf = P_ISLEAF(oopaque); I can confirm that this is what the problem was. Attached are two patches: * A version of your v14 from today with a couple of tiny changes to make it work against the current master branch -- I had to rebase the patch, but the changes made while rebasing were totally trivial. (I like to keep CFTester green.) * The second patch actually fixes the PageSetLSN() thing, setting the temp page buffer's LSN to match the original page before any real work is done, and before XLogInsert() is called. Just like _bt_split(). The test case now shows exactly what you reported for "FPWs off" when FPWs are turned on, at least on my machine and with my checkpoint settings. That is, there are *zero* FPIs/FPWs, so the final nbtree volume is 2128 MB. This means that the volume of additional WAL required over what the master branch requires for the same test case is very small (2128 MB compares well with master's 2011 MB of WAL). Maybe we could do better than 2128 MB with more work, but this is definitely already low enough overhead to be acceptable. This also passes "make check-world" testing. However, my usual wal_consistency_checking smoke test fails pretty quickly with the two patches applied: 3634/2019-09-16 13:53:22 PDT FATAL: inconsistent page found, rel 1663/16385/2673, forknum 0, blkno 13 3634/2019-09-16 13:53:22 PDT CONTEXT: WAL redo at 0/3202370 for Btree/DEDUPLICATE: items were deduplicated to 12 items 3633/2019-09-16 13:53:22 PDT LOG: startup process (PID 3634) exited with exit code 1 Maybe the lack of the PageSetLSN() thing masked a bug in WAL replay, since without that we effectively always just replay FPIs, never truly relying on redo. (I didn't try wal_consistency_checking without the second patch, but I assume that you did, and found no problems for this reason.) Can you produce a new version that integrates the PageSetLSN() thing, and fixes this bug? Thanks -- Peter Geoghegan
Вложения
В списке pgsql-hackers по дате отправления: