Microvacuum support for Hash Index
| From | Ashutosh Sharma |
|---|---|
| Subject | Microvacuum support for Hash Index |
| Date | |
| Msg-id | CAE9k0PkRSyzx8dOnokEpUi2A-RFZK72WN0h9DEMv_ut9q6bPRw@mail.gmail.com |
| Replies | Re: Microvacuum support for Hash Index |
| List | pgsql-hackers |
Hi All,

I have added microvacuum support for the hash index access method; the v1 patch is attached. The patch does the following:

1. It changes the marking of dead tuples during a hash index scan from tuple-at-a-time to page-at-a-time. While scanning, we accumulate the heap TID and offset of every hash index tuple flagged by kill_prior_tuple, and then mark all of the accumulated tuples LP_DEAD when the scan steps from one page to another (in either the forward or the backward direction), at the end of the hash index scan, or on rescan.

2. When inserting a tuple into a hash index, if the current page does not have enough free space, we first remove any dead tuples found on that page before moving to the next page in the bucket chain or going for a bucket split. This increases page reusability and reduces the number of page splits, which in turn reduces the overall size of the hash index.

I have compared the hash index size with and without my patch (microvacuum_hash_index_v1.patch, attached to this mail) on a high-end machine at various scale factors; the results are shown below. For this test I created a hash index (pgbench_accounts_aid) on the aid column of the 'pgbench_accounts' table in place of the primary key, and the results in the table below are for that index. The patch with these changes (pgbench.patch) is also attached. In addition, I use my own script file (file_hash_kill_prior_tuple), also attached, to update the indexed column with the pgbench read-write command.
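To make points 1 and 2 above concrete, here is a minimal toy model in Python. It is only a sketch, not the patch's C code: the names (IndexTuple, Page, Scan, insert_tuple, PAGE_CAPACITY) and the tiny page capacity are invented for illustration, and the TID recheck in flush() is an assumption about why both the heap TID and the offset are remembered.

```python
from dataclasses import dataclass, field

PAGE_CAPACITY = 4  # hypothetical; real index pages hold far more tuples


@dataclass
class IndexTuple:
    heap_tid: tuple     # (block, offset) of the heap row this entry points to
    dead: bool = False  # stands in for the LP_DEAD line-pointer flag


@dataclass
class Page:
    tuples: list = field(default_factory=list)

    def has_dead(self):
        return any(t.dead for t in self.tuples)

    def remove_dead(self):
        """Microvacuum: drop tuples already marked dead; return how many."""
        before = len(self.tuples)
        self.tuples = [t for t in self.tuples if not t.dead]
        return before - len(self.tuples)


class Scan:
    """Remembers kill candidates and marks them once per page (point 1)."""

    def __init__(self):
        self.killed = []  # (heap_tid, offset) pairs for the current page

    def note_dead(self, page, offset):
        # Called when the executor reports the previous tuple as dead.
        self.killed.append((page.tuples[offset].heap_tid, offset))

    def flush(self, page):
        # Called when leaving the page, at end of scan, or on rescan.
        for heap_tid, offset in self.killed:
            # Assumed safety check: only mark the slot if it still holds
            # the tuple we saw during the scan.
            if offset < len(page.tuples) and page.tuples[offset].heap_tid == heap_tid:
                page.tuples[offset].dead = True
        self.killed.clear()


def insert_tuple(bucket, heap_tid):
    """Insert into a bucket chain (a list of Pages), cleaning first (point 2)."""
    for page in bucket:
        if len(page.tuples) >= PAGE_CAPACITY and page.has_dead():
            page.remove_dead()  # reclaim space before moving on
        if len(page.tuples) < PAGE_CAPACITY:
            page.tuples.append(IndexTuple(heap_tid))
            return "existing page"
    bucket.append(Page([IndexTuple(heap_tid)]))
    return "new overflow page"


# Demo: a full page where a scan found two dead tuples.
page = Page([IndexTuple((0, i)) for i in range(PAGE_CAPACITY)])
bucket = [page]
scan = Scan()
scan.note_dead(page, 1)
scan.note_dead(page, 3)
scan.flush(page)
result = insert_tuple(bucket, (9, 9))
print(result, len(bucket), len(page.tuples))
```

In the demo the page is full, but two of its tuples were marked dead by the scan, so the insert reuses the page instead of extending the bucket chain.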
Here are some initial test results showing the benefit of this patch.

postgresql.conf and pgbench settings:

    autovacuum = off
    client count = 64
    run duration = 15 mins

    ./pgbench -c $threads -j $threads -T 900 postgres -f ~/file_hash_kill_prior_tuple

| Scale factor | Hash index size at start | HEAD | HEAD + patch |
|---|---|---|---|
| 10 | 32 MB | 579 MB | 158 MB |
| 50 | 128 MB | 630 MB | 350 MB |
| 100 | 256 MB | 1255 MB | 635 MB |
| 300 | 1024 MB | 2233 MB | 1093 MB |

As the results show, at scale factor 10 the hash index is almost 4 times smaller with my patch, whereas at scale factors 50, 100, and 300 it is reduced by roughly half. This shows that the patch can reduce hash index size considerably.

System specifications:
---------------------------------

    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                128
    On-line CPU(s) list:   0-127
    Thread(s) per core:    2
    Core(s) per socket:    8
    Socket(s):             8
    NUMA node(s):          8
    Vendor ID:             GenuineIntel

Note: The patch (microvacuum_hash_index_v1.patch) is prepared on top of concurrent_hash_index_v8.patch [1] and wal_hash_index_v5.1.patch [2] for hash index.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BX%3D8sUd1UCZDZnE3D9CGi9kw%2Bkjxp2Tnw7SX5w8pLBNw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1KE%3D%2BkkowyYD0vmch%3Dph4ND3H1tViAB%2B0cWTHqjZDDfqg%40mail.gmail.com
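The size reductions quoted in the results above can be checked with a few lines of Python. The sizes are copied from the results table; the helper name reduction_factor is made up for illustration.

```python
# Index sizes in MB from the results table:
# scale factor -> (HEAD, HEAD + patch)
sizes_mb = {
    10: (579, 158),
    50: (630, 350),
    100: (1255, 635),
    300: (2233, 1093),
}


def reduction_factor(head_mb, patched_mb):
    """How many times smaller the patched index is than the unpatched one."""
    return head_mb / patched_mb


factors = {sf: round(reduction_factor(h, p), 2) for sf, (h, p) in sizes_mb.items()}

for sf, factor in factors.items():
    print(f"scale factor {sf}: {factor}x smaller with the patch")
```

This gives about 3.66x at scale factor 10 and between 1.8x and 2.04x at the larger scale factors.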