Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT
From | Jon Nelson |
---|---|
Subject | Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT |
Date | |
Msg-id | CAKuK5J07k3rEWq6QT0_i7pTT3OSBK9ReQwQfi5LXNp8dmeokEQ@mail.gmail.com |
In response to | Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT |
List | pgsql-hackers |
On Wed, Jan 22, 2014 at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeremy Harris <jgh@wizmail.org> writes:
>> On 22/01/14 03:53, Tom Lane wrote:
>>> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>>>> - in createplan.c, eliding duplicate tuples is enabled if we are
>>>> creating a unique plan which involves sorting first
>
>>> [ raised eyebrow ... ] And what happens if the planner drops the
>>> unique step and then the sort doesn't actually go to disk?
>
>> I don't think Jon was suggesting that the planner drop the unique step.
>
> Hm, OK, maybe I misread what he said there. Still, if we've told
> tuplesort to remove duplicates, why shouldn't we expect it to have
> done the job? Passing the data through a useless Unique step is
> not especially cheap.

That's correct - I do not propose to drop the unique step. Duplicates
are only dropped if it's convenient to do so. In one case, it's a
zero-cost drop (no extra comparison is made). In most other cases, an
extra comparison is made, typically right before writing a tuple to
tape. If it compares as identical to the previously-written tuple,
it's thrown out instead of being written.

The output of the modified code is still sorted, still *might* (and in
most cases, probably will) contain duplicates, but will (probably)
contain fewer duplicates.

--
Jon
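
For readers following the thread, the behaviour described above can be illustrated with a small Python sketch. This is not the patch (which changes PostgreSQL's C sort code) and the names `write_run`, `external_sort` and `unique` are hypothetical. It assumes a simplified external sort that sorts fixed-size chunks, writes each as a run to a "tape" (a list here), and merges the runs. Only the run-writing step compares against the previously written tuple, so duplicates that land in different runs survive and a final unique pass is still required.

```python
import heapq


def write_run(sorted_tuples, elide=True):
    """Write one sorted run to a 'tape' (a plain list in this sketch).

    With elide=True, a tuple that compares equal to the tuple written
    immediately before it is thrown away instead of being written.
    """
    tape = []
    for tup in sorted_tuples:
        if elide and tape and tup == tape[-1]:
            continue  # the one extra comparison, done right before the write
        tape.append(tup)
    return tape


def external_sort(rows, work_mem_rows, elide=True):
    """Simplified external sort: sort chunks, write runs, merge the runs.

    Assumption: the merge here does no elision at all, so duplicates
    spread across different runs remain in the output.
    """
    runs = []
    for start in range(0, len(rows), work_mem_rows):
        chunk = sorted(rows[start:start + work_mem_rows])
        runs.append(write_run(chunk, elide))
    return list(heapq.merge(*runs))


def unique(sorted_rows):
    """Stand-in for the unique step that still runs after the sort."""
    out = []
    for row in sorted_rows:
        if not out or row != out[-1]:
            out.append(row)
    return out


if __name__ == "__main__":
    rows = [(3,), (1,), (3,), (2,), (1,), (1,), (2,), (3,)]
    partly_deduplicated = external_sort(rows, work_mem_rows=4)
    print(partly_deduplicated)          # sorted, fewer duplicates than the input
    print(unique(partly_deduplicated))  # fully distinct
```

With eight input rows and room for four rows per run, each run loses its internal duplicates, so six rows reach the unique step instead of eight. The output is still sorted and still contains duplicates, which is why the unique step cannot be dropped in this scheme.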