Re: GSOC 2018 Project - A New Sorting Routine
From | Peter Geoghegan |
---|---|
Subject | Re: GSOC 2018 Project - A New Sorting Routine |
Date | |
Msg-id | CAH2-Wzmj2XstMK58tJ0yEr+0MwpqMU8rfUB0j7GVe=p+yW5rTg@mail.gmail.com |
In response to | Fwd: GSOC 2018 Project - A New Sorting Routine (Kefan Yang <starordust@gmail.com>) |
Responses | Re: GSOC 2018 Project - A New Sorting Routine |
List | pgsql-hackers |
On Fri, Jul 13, 2018 at 3:04 PM, Kefan Yang <starordust@gmail.com> wrote:
> 1. Slow on CREATE INDEX cases.
>
> I am still trying to figure out where the bottleneck is. Is the data pattern
> in index creation very different from other cases? Also, pg_qsort has
> 10%-20% advantage at creating index even on sorted data (faster CPU, N =
> 1000000). This is very strange to me since the two sorting routines execute
> exactly the same code when the input data is sorted.

Yes. CREATE INDEX uses heap TID as a tie-breaker, so it's impossible
for any two index tuples to compare as equal within tuplesort.c, even
though they may be equal in other contexts. This is likely to defeat
things like the Bentley-McIlroy optimization where equal keys are
swapped, which is very effective in the event of many equal keys.

(Could also be parallelism, though I suppose you probably accounted for that.)

--
Peter Geoghegan
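
To illustrate the tie-breaker point, here is a minimal Python sketch (not tuplesort.c code). It assumes a hypothetical tuple layout of `(key, (block, offset))`, where the second element stands in for the heap TID, and the helper names `compare_keys`, `compare_with_tid`, and `count_equal_results` are invented for this example. It counts how often the comparator reports equality during a sort, which is the case an equal-key optimization relies on.

```python
from functools import cmp_to_key


def compare_keys(a, b):
    """Compare on the indexed key only; duplicate keys compare as 0."""
    return (a[0] > b[0]) - (a[0] < b[0])


def compare_with_tid(a, b):
    """Compare on the key, then fall back to the TID as a tie-breaker."""
    c = compare_keys(a, b)
    if c != 0:
        return c
    return (a[1] > b[1]) - (a[1] < b[1])


def count_equal_results(items, cmp):
    """Sort a copy of items and count how often cmp returned 0."""
    equal = 0

    def counting(a, b):
        nonlocal equal
        result = cmp(a, b)
        if result == 0:
            equal += 1
        return result

    ordered = sorted(items, key=cmp_to_key(counting))
    return ordered, equal


# Hypothetical data: 1000 tuples, only 4 distinct keys, every TID unique.
tuples = [(i % 4, (i // 100, i % 100)) for i in range(1000)]

_, eq_plain = count_equal_results(tuples, compare_keys)
_, eq_tid = count_equal_results(tuples, compare_with_tid)

print("equal results, key only:", eq_plain)
print("equal results, key + TID tie-breaker:", eq_tid)
```

With the tie-breaker in place the comparator never returns 0 for two distinct tuples, so a sort routine gets no chance to group equal keys, even though the input is dominated by duplicates. Python's built-in sort is used here only to drive the comparators; it is not a model of pg_qsort.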