Re: Make COPY format extendable: Extract COPY TO format implementations

From Masahiko Sawada
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Msg-id CAD21AoCZv3cVU+NxR2s9J_dWvjrS350GFFr2vMgCH8wWxQ5hTQ@mail.gmail.com
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Sutou Kouhei <kou@clear-code.com>)
Responses Re: Make COPY format extendable: Extract COPY TO format implementations  (Sutou Kouhei <kou@clear-code.com>)
List pgsql-hackers
On Thu, Dec 14, 2023 at 6:44 PM Sutou Kouhei <kou@clear-code.com> wrote:
>
> Hi,
>
> In <CAD21AoCvjGserrtEU=UcA3Mfyfe6ftf9OXPHv9fiJ9DmXMJ2nQ@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 11 Dec 2023 10:57:15 +0900,
>   Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > IIUC we cannot create two same name functions with the same arguments
> > but a different return value type in the first place. It seems to me
> > to be an overkill to change such a design.
>
> Oh, sorry. I didn't notice it.
>
> > Another idea is to encapsulate copy_to/from_handler by a super class
> > like copy_handler. The handler function is called with an argument,
> > say copyto, and returns copy_handler encapsulating either
> > copy_to/from_handler depending on the argument.
>
> It's for using "${copy_format_name}" such as "json" and
> "parquet" as a function name, right?

Right.
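
To make the idea concrete, here is a rough standalone sketch of a "super class" handler that wraps either a COPY TO or a COPY FROM routine depending on its argument. All type and function names (CopyToRoutine, CopyFromRoutine, CopyHandler, json_copy_handler) are hypothetical and chosen only for illustration; this is not the patch's API and it does not use PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-direction callback tables (illustrative only). */
typedef struct CopyToRoutine
{
    const char *(*describe) (void);
} CopyToRoutine;

typedef struct CopyFromRoutine
{
    const char *(*describe) (void);
} CopyFromRoutine;

/* Hypothetical "super class": records which routine it wraps. */
typedef struct CopyHandler
{
    bool        is_to;          /* true: COPY TO, false: COPY FROM */
    union
    {
        const CopyToRoutine *to;
        const CopyFromRoutine *from;
    }           routine;
} CopyHandler;

static const char *json_to_describe(void) { return "json: COPY TO"; }
static const char *json_from_describe(void) { return "json: COPY FROM"; }

static const CopyToRoutine json_to_routine = {json_to_describe};
static const CopyFromRoutine json_from_routine = {json_from_describe};

/* One handler function per format; the argument selects the direction. */
CopyHandler
json_copy_handler(bool is_to)
{
    CopyHandler h;

    h.is_to = is_to;
    if (is_to)
        h.routine.to = &json_to_routine;
    else
        h.routine.from = &json_from_routine;
    return h;
}

/* Dispatch through whichever routine the handler wraps. */
const char *
copy_handler_describe(const CopyHandler *h)
{
    assert(h != NULL);
    return h->is_to ? h->routine.to->describe()
                    : h->routine.from->describe();
}
```

With this shape, a single function name per format is enough for both directions: the caller passes the direction and dispatches through the returned wrapper.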

> If we use the
> "${copy_format_name}" approach, we can't use function names
> that are already used by tablesample method handler such as
> "system" and "bernoulli" for COPY FORMAT name. Because both
> of tablesample method handler function and COPY FORMAT
> handler function use "(internal)" as arguments.
>
> I think that tablesample method names and COPY FORMAT names
> will not be conflicted but the limitation (using the same
> namespace for tablesample method and COPY FORMAT) is
> unnecessary limitation.

Presumably, such function name collisions are not limited to
tablesample and COPY, but apply to all functions that take a single
"internal" argument. To avoid collisions, extensions can be created in
a schema other than public. Note also that the built-in COPY formats
don't need to declare handler functions.
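
As a rough model of why the collision happens and why a separate schema avoids it: a function is identified by its schema, name, and input argument types, and the return type does not take part. The standalone sketch below encodes just that rule. FuncSig and func_sig_collides are hypothetical names for illustration, and "copy_handler" is the return type proposed in this thread, not an existing type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of a function declaration. */
typedef struct FuncSig
{
    const char *schema;
    const char *name;
    const char *argtypes;       /* e.g. "internal" */
    const char *rettype;        /* deliberately ignored for identity */
} FuncSig;

/*
 * Two declarations collide when schema, name, and argument types all
 * match. A different return type alone does not make them distinct.
 */
bool
func_sig_collides(const FuncSig *a, const FuncSig *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a->schema, b->schema) == 0 &&
           strcmp(a->name, b->name) == 0 &&
           strcmp(a->argtypes, b->argtypes) == 0;
}
```

Under this model, two "arrow(internal)" functions in the same schema collide even if one returns tsm_handler and the other a copy handler type, while placing one of them in another schema removes the conflict.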

>
> How about using prefix ("copy_to_${copy_format_name}" or
> something) or suffix ("${copy_format_name}_copy_to" or
> something) for function names? For example,
> "copy_to_json"/"copy_from_json" for "json" COPY FORMAT.
>
> ("copy_${copy_format_name}" that returns copy_handler
> encapsulating either copy_to/from_handler depending on the
> argument may be an option.)

While there is a way to avoid collisions as I mentioned above, I can
see the point that we might want to avoid using generic function
names such as "arrow" and "parquet" for custom copy handler functions.
Adding a prefix or suffix would be one option, but to give extensions
more flexibility, another option would be to support a 'custom' format
and add a "handler" option to specify the name of the copy handler
function to call. For example, COPY ... FROM ... WITH (FORMAT 'custom',
HANDLER 'arrow_copy_handler').
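
A minimal standalone sketch of how such option handling could resolve the handler, assuming a simple name-to-function table stands in for the catalog lookup. The names resolve_copy_handler, handler_registry, and CopyHandlerFn are hypothetical, and the "handler" option itself is only a proposal here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler function type (illustrative only). */
typedef const char *(*CopyHandlerFn) (void);

typedef struct HandlerEntry
{
    const char *name;
    CopyHandlerFn fn;
} HandlerEntry;

static const char *arrow_copy_handler(void) { return "arrow"; }

/* Stand-in for a catalog lookup of handler functions by name. */
static const HandlerEntry handler_registry[] = {
    {"arrow_copy_handler", arrow_copy_handler},
};

/*
 * Resolve the FORMAT and HANDLER options. Returns NULL when no custom
 * handler applies: the format is not 'custom', the HANDLER option is
 * missing, or no function with that name is registered.
 */
CopyHandlerFn
resolve_copy_handler(const char *format, const char *handler)
{
    size_t      n = sizeof(handler_registry) / sizeof(handler_registry[0]);

    assert(format != NULL);
    if (strcmp(format, "custom") != 0)
        return NULL;            /* built-in formats need no handler */
    if (handler == NULL)
        return NULL;            /* 'custom' requires a handler name */
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(handler_registry[i].name, handler) == 0)
            return handler_registry[i].fn;
    }
    return NULL;
}
```

The point of this design is that the format name stays fixed ('custom') and the extension is free to name its handler function however it likes, so generic names such as "arrow" never need to be claimed as function names.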

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


