Re: PATCH: optimized DROP of multiple tables within a transaction
От | Tomas Vondra |
---|---|
Тема | Re: PATCH: optimized DROP of multiple tables within a transaction |
Дата | |
Msg-id | 50CE59B4.5030605@fuzzy.cz обсуждение исходный текст |
Ответ на | Re: PATCH: optimized DROP of multiple tables within a transaction (Tomas Vondra <tv@fuzzy.cz>) |
Ответы |
Re: PATCH: optimized DROP of multiple tables within a transaction
|
Список | pgsql-hackers |
Hi, I've updated the patch to include the optimization described in the previous post, i.e. if the number of relations is below a certain threshold, use a simple for loop, for large numbers of relations use bsearch calls. This is done by a new constant BSEARCH_LIMIT, which is set to 10 in the patch. Then I've modified the 'drop-test' script to take yet another argument - number of indexes for each table. So for example this ./drop-test.py 10000 100 3 'dbname=test' means "create 10000 tables, 3 indexes for each of them, and then drop them in batches of 100 tables." Then I've run the test with 0, 1, 2, ... 11 indexes for each table for (a) unpatched HEAD (b) patch v3.1 (without the optimization) (c) patch v3.3 (with BSEARCH_LIMIT=10) and I got these results: 1) dropping one-by-one ---------------------- This is the really interesting part - the issue with the v3.1 is that for a single table, it's ~2x slower than unpatched PostgreSQL. 0 1 2 3 4 5 6 7 8 9 10 11 -------------------------------------------------------------- unpatched 16 28 40 52 63 75 87 99 110 121 135 147 v3.1 33 43 46 56 58 60 63 72 75 76 79 80 v3.3 16 20 23 25 29 33 36 40 44 47 79 82 The values are durations in seconds, rounded to integer values. I've run the test repeatedly and there's very small variance in the numbers. The v3.3 improves that and it's actually even faster than unpatched PostgreSQL. How can this happen? I believe it's because the code is rewritten from for each relation (r) in the drop list DropRelFileNodeAllBuffers (r) for each shared buffer check and invalidate to copy the relations from drop list to array (a) DropRelFileNodeAllBuffers(a) for each shared buffer for each relation (r) in the array check and invalidate At least that's the only explanation I was able to come up with. Yet another interesting observation is that even the v3.1 is about as fast as the unpatched code once there are 3 or more indexes (or TOAST tables). So my opinion is that the optimizated patch works as expected, and that even without the optimization the performance would be acceptable for most real-world cases. 2) dropping in transaction -------------------------- This is mostly to verify that the code did not break anything, because the optimization should not kick-in in this case at all. And that seems to be the case: 0 1 2 3 4 5 6 7 8 9 10 11 -------------------------------------------------------------- unpatched 13 24 35 46 58 69 81 92 103 115 127 139 v3.1 3 5 7 8 10 12 14 15 16 18 20 21 v3.3 3 4 6 7 8 10 11 13 14 15 18 20 The differences between v3.1 and v3.3 are mostly due to rounding etc. Attached is the v3.3 patch and the testing script I've been using for the tests above. Feel free to run the tests on your hardware, with your hardware, shared buffers size etc. I've run that on a 4-core i5-2500 CPU with 2GB shared buffers. Tomas
Вложения
В списке pgsql-hackers по дате отправления: