Re: I: About "Our CLUSTER implementation is pessimal" patch
From | Josh Kupershmidt |
---|---|
Subject | Re: I: About "Our CLUSTER implementation is pessimal" patch |
Date | |
Msg-id | AANLkTimQxxis81hh1weqVtwx_HU7exE_Mr8Ki0zPp3Bf@mail.gmail.com |
In response to | Re: I: About "Our CLUSTER implementation is pessimal" patch (Itagaki Takahiro <itagaki.takahiro@gmail.com>) |
Responses |
Re: I: About "Our CLUSTER implementation is pessimal" patch
Re: I: About "Our CLUSTER implementation is pessimal" patch |
List | pgsql-hackers |
On Mon, Sep 27, 2010 at 10:05 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote:
> I re-ordered some description in the doc. Does it look better?
> Comments and suggestions welcome.

I thought this paragraph was a little confusing:

! In the second case, a full table scan is followed by a sort operation.
! The method is faster than the first one when the table is highly fragmented.
! You need twice disk space of the sum in the case. In addition to the free
! space needed by the previous case, this approach may also need a temporary
! disk sort file which can be as big as the original table.

I think the worst-case disk space could be made a little more clear here, and maybe some general wordsmithing as well. I wasn't sure what "twice disk space of the sum" was in this description -- sum of what (table and all indexes?). And does "twice disk space" include the temporary disk sort file?

Here's an idea of how I think this paragraph could be cleaned up a bit, if my understanding of the disk space required is about right:

! In the second case, a full table scan is followed by a sort operation.
! This method is faster than when the table is highly fragmented.
! However, <command>CLUSTER</command> may require available disk space of
! up to twice the sum of the size of the table and its indexes, if it uses a temporary
! disk sort file, which can be as big as the original table.

Also, AIUI, this second clustering method is similar to the older idiom of

CREATE TABLE new AS SELECT * FROM old ORDER BY col;

Since the paragraph describing this older idiom is being removed, perhaps a brief mention in the documentation could be made of this similarity.

Some more wordsmithing: change

! The planner tries to choose a faster method in them base on the information

to:

! The planner tries to choose the fastest method based on the information

I started looking at the performance impact of this patch based on Leonardo's SQL file. On the 2 million row table, I see a consistent ~10% advantage for the sequential scan clusters. I'm going to try to run the bigger tests a few times and post results from there when I get a chance.

Josh
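
For reference, the worst-case disk space arithmetic discussed above can be sketched as follows. This is only an illustration of one reading of the quoted doc text, not PostgreSQL code: it assumes the rewrite needs free space equal to the table plus its indexes, and that the temporary sort file is at most as large as the table. The function names and the example sizes are hypothetical.

```python
def cluster_free_space_estimate(table_bytes, index_bytes, uses_sort_file=True):
    """Estimate the free disk space a CLUSTER run may need.

    Assumed model (not taken from PostgreSQL source):
      - the rewritten table and rebuilt indexes need table + indexes bytes
      - the seq-scan-and-sort method may add a sort file of up to table bytes
    """
    if table_bytes < 0 or any(size < 0 for size in index_bytes):
        raise ValueError("sizes must be non-negative")
    # Space for the new copy of the table plus its rebuilt indexes.
    rewrite_bytes = table_bytes + sum(index_bytes)
    # Temporary sort file, assumed to be no larger than the table itself.
    sort_bytes = table_bytes if uses_sort_file else 0
    return rewrite_bytes + sort_bytes


def proposed_upper_bound(table_bytes, index_bytes):
    """The 'up to twice the sum of the table and its indexes' bound."""
    return 2 * (table_bytes + sum(index_bytes))


if __name__ == "__main__":
    gib = 1024 ** 3
    table = 10 * gib              # hypothetical table size
    indexes = [2 * gib, 1 * gib]  # hypothetical index sizes
    estimate = cluster_free_space_estimate(table, indexes)
    bound = proposed_upper_bound(table, indexes)
    print(f"estimate: {estimate / gib:.0f} GiB, bound: {bound / gib:.0f} GiB")
```

Under these assumptions the estimate never exceeds the "twice the sum" figure, and the two coincide only when the table has no indexes.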