Re: Make COPY format extendable: Extract COPY TO format implementations

Поиск
Список
Период
Сортировка
От Hannu Krosing
Тема Re: Make COPY format extendable: Extract COPY TO format implementations
Дата
Msg-id CAMT0RQRqVo4fGDWHqOn+wr_eoiXQVfyC=8-c=H=y6VcNxi6BvQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
Ответы Re: Make COPY format extendable: Extract COPY TO format implementations  (Sutou Kouhei <kou@clear-code.com>)
Re: Make COPY format extendable: Extract COPY TO format implementations  (Junwang Zhao <zhjwpku@gmail.com>)
Список pgsql-hackers
Hi Junwang

Please also see my presentation slides from last years PostgreSQL
Conference in Berlin (attached)

The main Idea is to make not just "format", but also "transport" and
"stream processing" extendable via virtual function tables.

Btw, will any of you here be in Prague next week ?
Would be a good opportunity to discuss this in person.


Best Regards
Hannu

On Sat, Dec 9, 2023 at 9:39 AM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Sat, Dec 9, 2023 at 10:43 AM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Junagn, Sutou-san,
> >
> > Basically I agree your point - improving a extendibility is good.
> > (I remember that this theme was talked at Japan PostgreSQL conference)
> > Below are my comments for your patch.
> >
> > 01. General
> >
> > Just to confirm - is it OK to partially implement APIs? E.g., only COPY TO is
> > available. Currently it seems not to consider a case which is not implemented.
> >
> For partially implements, we can leave the hook as NULL, and check the NULL
> at *ProcessCopyOptions* and report error if not supported.
>
> > 02. General
> >
> > It might be trivial, but could you please clarify how users can extend? Is it OK
> > to do below steps?
> >
> > 1. Create a handler function, via CREATE FUNCTION,
> > 2. Register a handler, via new SQL (CREATE COPY HANDLER),
> > 3. Specify the added handler as COPY ... FORMAT clause.
> >
> My original thought was option 2, but as Michael point, option 1 is
> the right way
> to go.
>
> > 03. General
> >
> > Could you please add document-related tasks to your TODO? I imagined like
> > fdwhandler.sgml.
> >
> > 04. General - copyright
> >
> > For newly added files, the below copyright seems sufficient. See applyparallelworker.c.
> >
> > ```
> >  * Copyright (c) 2023, PostgreSQL Global Development Group
> > ```
> >
> > 05. src/include/catalog/* files
> >
> > IIUC, 8000 or higher OIDs should be used while developing a patch. src/include/catalog/unused_oids
> > would suggest a candidate which you can use.
>
> Yeah, I will run renumber_oids.pl at last.
>
> >
> > 06. copy.c
> >
> > I felt that we can create files per copying methods, like copy_{text|csv|binary}.c,
> > like indexes.
> > How do other think?
>
> Not sure about this, it seems others have put a lot of effort into
> splitting TO and From.
> Also like to hear from others.
>
> >
> > 07. fmt_to_name()
> >
> > I'm not sure the function is really needed. Can we follow like get_foreign_data_wrapper_oid()
> > and remove the funciton?
>
> I have referenced some code from greenplum, will remove this.
>
> >
> > 08. GetCopyRoutineByName()
> >
> > Should we use syscache for searching a catalog?
> >
> > 09. CopyToFormatTextSendEndOfRow(), CopyToFormatBinaryStart()
> >
> > Comments still refer CopyHandlerOps, whereas it was renamed.
> >
> > 10. copy.h
> >
> > Per foreign.h and fdwapi.h, should we add a new header file and move some APIs?
> >
> > 11. copy.h
> >
> > ```
> > -/* These are private in commands/copy[from|to].c */
> > -typedef struct CopyFromStateData *CopyFromState;
> > -typedef struct CopyToStateData *CopyToState;
> > ```
> >
> > Are above changes really needed?
> >
> > 12. CopyFormatOptions
> >
> > Can we remove `bool binary` in future?
> >
> > 13. external functions
> >
> > ```
> > +extern void CopyToFormatTextStart(CopyToState cstate, TupleDesc tupDesc);
> > +extern void CopyToFormatTextOneRow(CopyToState cstate, TupleTableSlot *slot);
> > +extern void CopyToFormatTextEnd(CopyToState cstate);
> > +extern void CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc);
> > +extern bool CopyFromFormatTextNext(CopyFromState cstate, ExprContext *econtext,
> > +
> > Datum *values, bool *nulls);
> > +extern void CopyFromFormatTextErrorCallback(CopyFromState cstate);
> > +
> > +extern void CopyToFormatBinaryStart(CopyToState cstate, TupleDesc tupDesc);
> > +extern void CopyToFormatBinaryOneRow(CopyToState cstate, TupleTableSlot *slot);
> > +extern void CopyToFormatBinaryEnd(CopyToState cstate);
> > +extern void CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc);
> > +extern bool CopyFromFormatBinaryNext(CopyFromState cstate,
> > ExprContext *econtext,
> > +
> >   Datum *values, bool *nulls);
> > +extern void CopyFromFormatBinaryErrorCallback(CopyFromState cstate);
> > ```
> >
> > FYI - If you add files for {text|csv|binary}, these declarations can be removed.
> >
> > Best Regards,
> > Hayato Kuroda
> > FUJITSU LIMITED
> >
>
> Thanks for all the valuable suggestions.
>
> --
> Regards
> Junwang Zhao
>
>

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hannu Krosing
Дата:
Сообщение: Why are wal_keep_size, max_slot_wal_keep_size requiring server restart?
Следующее
От: John Naylor
Дата:
Сообщение: Re: Change GUC hashtable to use simplehash?