Optimizing DISTINCT with LIMIT
От | tmp |
---|---|
Тема | Optimizing DISTINCT with LIMIT |
Дата | |
Msg-id | gh8m5v$7oj$1@news.hub.org обсуждение исходный текст |
Ответы |
Re: Optimizing DISTINCT with LIMIT
|
Список | pgsql-hackers |
As far as I have understood the following query SELECT DISTINCT foo FROM bar LIMIT baz is done by first sorting the input and then traversing the sorted data, ensuring uniqueness of output and stopping when the LIMIT threshold is reached. Furthermore, a part of the sort procedure is to traverse input at least one time. Now, if the input is large but the LIMIT threshold is small, this sorting step may increase the query time unnecessarily so here is a suggestion for optimization: If the input is "sufficiently" large and the LIMIT threshold "sufficiently" small, maintain the DISTINCT output by hashning while traversing the input and stop when the LIMIT threshold is reached. No sorting required and *at* *most* one read of input. Use case: Websites that needs to present small samples of huge queries fast.
В списке pgsql-hackers по дате отправления: