Re: TABLESAMPLE patch

From Peter Eisentraut
Subject Re: TABLESAMPLE patch
Date
Msg-id 5526D369.1070905@gmx.net
In reply to Re: TABLESAMPLE patch  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: TABLESAMPLE patch  (Simon Riggs <simon@2ndQuadrant.com>)
Re: TABLESAMPLE patch  (Petr Jelinek <petr@2ndquadrant.com>)
List pgsql-hackers
On 4/9/15 5:02 AM, Michael Paquier wrote:
> Just to be clear, the example above being misleading... Doing table
> sampling using SYSTEM at physical level makes sense. In this case I
> think that we should properly error out when trying to use this method
> on something not present at physical level. But I am not sure that
> this restriction applies to BERNOUILLI: you may want to apply it on
> other things than physical relations, like views or results of WITH
> clauses. Also, based on the fact that we support custom sampling
> methods, I think that it should be up to the sampling method to define
> on what kind of objects it supports sampling, and where it supports
> sampling fetching, be it page-level fetching or analysis from an
> existing set of tuples. Looking at the patch, TABLESAMPLE is just
> allowed on tables and matviews, this limitation is too restrictive
> IMO.

In the SQL standard, the TABLESAMPLE clause is attached to a table
expression (<table primary>), which includes table functions,
subqueries, CTEs, etc.  In the proposed patch, it is attached to a table
name, allowing only an ONLY clause.  So this is a significant deviation.

Obviously, doing block sampling on a physical table is a significant use
case, but we should be clear about which restrictions and tradeoffs we
are making now and in the future, especially if we are going to present
extension interfaces.  The fact that physical tables are interchangeable
with other relation types, at least in data-reading contexts, is a
feature worth preserving.
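
To illustrate the distinction, here is a minimal sketch in Python (not the
patch's code; the function names and the list-of-pages representation are
assumptions made up for this example). A Bernoulli-style method only needs a
stream of rows, so it could in principle sit on top of a view, subquery, or
CTE, whereas a block-level method needs something that exposes pages.

```python
import random
from typing import Iterable, Iterator, Sequence, TypeVar

Row = TypeVar("Row")


def bernoulli_sample(rows: Iterable[Row], percent: float,
                     rng: random.Random) -> Iterator[Row]:
    """Keep each row independently with probability percent/100.

    Works on any iterable of rows; no physical layout is required.
    """
    p = percent / 100.0
    for row in rows:
        if rng.random() < p:
            yield row


def block_sample(pages: Sequence[Sequence[Row]], percent: float,
                 rng: random.Random) -> Iterator[Row]:
    """Keep whole pages with probability percent/100.

    Requires the input to be organized into pages (hypothetical layout),
    which a subquery or CTE result does not naturally provide.
    """
    p = percent / 100.0
    for page in pages:
        if rng.random() < p:
            yield from page
```

The first function accepts a generator just as well as a list, which is the
property that makes it a candidate for non-physical relations.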

It may be worth thinking about some examples of other sampling methods,
in order to get a better feeling for whether the interfaces are appropriate.

Earlier in the thread, someone asked about support for specifying a
number of rows instead of a percentage.  While not essential, that seems
pretty useful, but I wonder how that could be implemented later on if we
take the approach that the argument to the sampling method can be an
arbitrary quantity that is interpreted only by the method.
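
To make the row-count case concrete, here is a minimal sketch, assuming
reservoir sampling (Algorithm R) as one possible way to return a fixed number
of rows. This is not something the patch contains, and the function name is
hypothetical. Unlike a percentage, where each row or page can be accepted or
rejected on its own, a row-count method has to carry state across the whole
scan, so the argument means something quite different to it.

```python
import random
from typing import Iterable, List, TypeVar

Row = TypeVar("Row")


def reservoir_sample(rows: Iterable[Row], n: int,
                     rng: random.Random) -> List[Row]:
    """Return up to n rows chosen uniformly at random from rows.

    Algorithm R: fill the reservoir with the first n rows, then replace
    a random slot with row i (0-based) with probability n / (i + 1).
    """
    reservoir: List[Row] = []
    for i, row in enumerate(rows):
        if i < n:
            reservoir.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = row
    return reservoir
```

Note that the result is only known once the input is exhausted, which is a
different execution pattern from a per-row or per-page decision.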



