Re: Make COPY format extendable: Extract COPY TO format implementations

Поиск
Список
Период
Сортировка
От Sutou Kouhei
Тема Re: Make COPY format extendable: Extract COPY TO format implementations
Дата
Msg-id 20231207.140458.425537343057608813.kou@clear-code.com
обсуждение исходный текст
Ответ на Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
Ответы Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
Список pgsql-hackers
Hi,

In <CAEG8a3LSRhK601Bn50u71BgfNWm4q3kv-o-KEq=hrbyLbY_EsA@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 6 Dec 2023 22:07:51 +0800,
  Junwang Zhao <zhjwpku@gmail.com> wrote:

> Should we extract both *copy to* and *copy from* for the first step, in that
> case we can add the pg_copy_handler catalog smoothly later.

I don't object it (mixing TO/FROM changes to one patch) but
it may make review difficult. Is it acceptable?

FYI: I planed that I implement TO part, and then FROM part,
and then unify TO/FROM parts if needed. [1]

> Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> please take a look.

Thanks. Here are my comments:

> +        /*
> +            * Error is relevant to a particular line.
> +            *
> +            * If line_buf still contains the correct line, print it.
> +            */
> +        if (cstate->line_buf_valid)

We need to fix the indentation.

> +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
> +{
> +    FmgrInfo   *in_functions;
> +    Oid           *typioparams;
> +    Oid            in_func_oid;
> +    AttrNumber    num_phys_attrs;
> +
> +    /*
> +     * Pick up the required catalog information for each attribute in the
> +     * relation, including the input function, the element type (to pass to
> +     * the input function), and info about defaults and constraints. (Which
> +     * input function we use depends on text/binary format choice.)
> +     */
> +    num_phys_attrs = tupDesc->natts;
> +    in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> +    typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));

We need to update the comment because defaults and
constraints aren't picked up here.

> +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc)
...
> +    /*
> +     * Pick up the required catalog information for each attribute in the
> +     * relation, including the input function, the element type (to pass to
> +     * the input function), and info about defaults and constraints. (Which
> +     * input function we use depends on text/binary format choice.)
> +     */
> +    in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> +    typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));

ditto.


> @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate,
>          ReceiveCopyBinaryHeader(cstate);
>      }

I think that this block should be moved to
CopyFromFormatBinaryStart() too. But we need to run it after
we setup inputs such as data_source_cb, pipe and filename...

+/* Routines for a COPY HANDLER implementation. */
+typedef struct CopyHandlerOps
+{
+    /* Called when COPY TO is started. This will send a header. */
+    void        (*copy_to_start) (CopyToState cstate, TupleDesc tupDesc);
+
+    /* Copy one row for COPY TO. */
+    void        (*copy_to_one_row) (CopyToState cstate, TupleTableSlot *slot);
+
+    /* Called when COPY TO is ended. This will send a trailer. */
+    void        (*copy_to_end) (CopyToState cstate);
+
+    void        (*copy_from_start) (CopyFromState cstate, TupleDesc tupDesc);
+    bool        (*copy_from_next) (CopyFromState cstate, ExprContext *econtext,
+                                    Datum *values, bool *nulls);
+    void        (*copy_from_error_callback) (CopyFromState cstate);
+    void        (*copy_from_end) (CopyFromState cstate);
+} CopyHandlerOps;

It seems that "copy_" prefix is redundant. Should we use
"to_start" instead of "copy_to_start" and so on?

BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2]
We may need to care about NULL copy_from_* cases.


> I added a hook *copy_from_end* but this might be removed later if not used.

It may be useful to clean up resources for COPY FROM but the
patch doesn't call the copy_from_end. How about removing it
for now? We can add it and call it from EndCopyFrom() later?
Because it's not needed for now.

I think that we should focus on refactoring instead of
adding a new feature in this patch.


[1]: https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com
[2]: https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com


Thanks,
-- 
kou



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2