Re: Improving count(*)
От | Greg Stark |
---|---|
Тема | Re: Improving count(*) |
Дата | |
Msg-id | 878xvmr3t8.fsf@stark.xeocode.com обсуждение исходный текст |
Ответ на | Improving count(*) (Simon Riggs <simon@2ndquadrant.com>) |
Список | pgsql-hackers |
I think the important thing to keep track of is a single bit: Which the following apply? a) all of the tuples in this block are visible b) at least one tuple in this block is in-doubt and may need to be vacuumed That isn't enough to calculate count(*) on its own but it means you could scan an index like any other database and avoid checking every single tuple. If the tuple lies in a block that is known not to contain any in-doubt records then the tuple can be counted immediately. This has the advantage that it helps with a lot more cases than a simple "select count(*) from tab". As Tom pointed out that case can be tackled more directly with a O(1) solution anyways. More complex cases are where fast index scans are really important. So you could do "SELECT count(*) FROM tab WHERE a > ?" and have it scan an index on <a> without having to check the visibility of every single tuple. It only has to check the visibility of tuples that lie on blocks that contain at least one in-doubt tuple. You could even imagine using the new bitmap index scan machinery to combine these bits of information too. And this is exactly the same information that vacuum needs. Once vacuum has run and cleaned out a block it knows whether there are any records that are still in-doubt or whether every record it left is universally visible. It can note that and allow future vacuums to skip that block if no deletes or inserts have changed that bit since. -- greg
В списке pgsql-hackers по дате отправления: