Re: COPY FROM WHEN condition
От | Corey Huinker |
---|---|
Тема | Re: COPY FROM WHEN condition |
Дата | |
Msg-id | CADkLM=d94dynBhz0WXpg-8Wc2=aDCzO9J4AoikgXJ9d0uJyqAg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: COPY FROM WHEN condition (David Fetter <david@fetter.org>) |
Ответы |
Re: COPY FROM WHEN condition
Re: COPY FROM WHEN condition Re: COPY FROM WHEN condition |
Список | pgsql-hackers |
> Are you thinking something like having a COPY command that provides
> results in such a way that they could be referenced in a FROM clause
> (perhaps a COPY that defines a cursor…)?
That would also be nice, but what I was thinking of was that some
highly restricted subset of cases of SQL in general could lend
themselves to levels of optimization that would be impractical in
other contexts.
If COPY (or a syntactical equivalent) can return a result set, then the whole of SQL is available to filter and aggregate the results and we don't have to invent new syntax, or endure confusion whenCOPY-WHEN syntax behaves subtly different from a similar FROM-WHERE.
Also, what would we be saving computationally? The whole file (or program output) has to be consumed no matter what, the columns have to be parsed no matter what. At least some of the columns have to be converted to their assigned datatypes enough to know whether or not to filter the row, but we might be able push that logic inside a copy. I'm thinking of something like this:
SELECT x.a, sum(x.b)FROM ( COPY INLINE '/path/to/foo.txt' FORMAT CSV ) as x( a integer, b numeric, c text, d date, e json) )WHERE x.d >= '2018-11-01'
In this case, there is the opportunity to see the following optimizations:
- columns c and e are never referenced, and need never be turned into a datum (though we might do so just to confirm that they conform to the data type)
- if column d is converted first, we can filter on it and avoid converting columns a,b
- whatever optimizations we can infer from knowing that the two surviving columns will go directly into an aggregate
If we go this route, we can train the planner to notice other optimizations and add those mechanisms at that time, and then existing code gets faster.
If we go the COPY-WHEN route, then we have to make up new syntax for every possible future optimization.
If we go the COPY-WHEN route, then we have to make up new syntax for every possible future optimization.
В списке pgsql-hackers по дате отправления: