Re: Make COPY extendable in order to support Parquet and other formats
From | Aleksander Alekseev |
---|---|
Subject | Re: Make COPY extendable in order to support Parquet and other formats |
Date | |
Msg-id | CAJ7c6TNFD84KK62xrGP-PDwPM7OESM8=TTv8TjsZpbOuNMnwGA@mail.gmail.com |
In reply to | Re: Make COPY extendable in order to support Parquet and other formats (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
List | pgsql-hackers |
Hi Ashutosh,

> IIUC, you want extensibility in FORMAT argument to COPY command
> https://www.postgresql.org/docs/current/sql-copy.html. Where the
> format is pluggable. That seems useful.

> Another option is to dump the data in csv format but use external
> utility to convert csv to parquet or whatever other format is. I
> understand that that's not going to be as efficient as dumping
> directly in the desired format.

Exactly. However, to clarify, I suspect this may be a bit more involved than simply extending the FORMAT argument.

That change per se will not be extremely useful. Currently nothing prevents an extension author from iterating over a table using the heap_open(), heap_getnext(), etc. APIs and dumping its content in any format. The user will have to write "dump_table(foo, filename)" instead of "COPY ..." but that's not a big deal.

The problem is that every new extension has to re-invent things like figuring out the schema, validating the data, etc. If we could do this in the core, so that an extension author has to implement only a minimal, format-dependent list of callbacks, that would be really great. In order to make the interface practical, though, one will have to implement a practical extension as well, for instance, a Parquet one.

That being said, if it turns out that for some reason this is not realistic to deliver, ending up with simply extending this part of the syntax a bit should be fine too.

--
Best regards,
Aleksander Alekseev
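
The "minimal format-dependent list of callbacks" idea from the message can be sketched roughly as below. This is only an illustration under stated assumptions: `CopyRow`, `CopyFormatRoutine`, `find_format()` and `copy_to()` are hypothetical names invented for this sketch and are not PostgreSQL APIs. A toy comma-separated writer stands in for a real format such as Parquet. The point is the split of responsibilities: the generic driver (the "core") looks up the format, checks the row shape and iterates, while the format supplies only three callbacks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: one row as the core would hand it to a format. */
typedef struct CopyRow {
    int          ncols;
    const char **values;          /* text values; NULL means SQL NULL */
} CopyRow;

/* Hypothetical: the only things a format author would implement. */
typedef struct CopyFormatRoutine {
    const char *name;
    void (*begin)(FILE *out, int ncols, const char **colnames);
    void (*write_row)(FILE *out, const CopyRow *row);
    void (*end)(FILE *out);
} CopyFormatRoutine;

/* Toy format: header line, then comma-separated values (no quoting). */
static void toy_begin(FILE *out, int ncols, const char **colnames)
{
    for (int i = 0; i < ncols; i++)
        fprintf(out, "%s%s", i ? "," : "", colnames[i]);
    fputc('\n', out);
}

static void toy_write_row(FILE *out, const CopyRow *row)
{
    for (int i = 0; i < row->ncols; i++)
        fprintf(out, "%s%s", i ? "," : "", row->values[i] ? row->values[i] : "");
    fputc('\n', out);
}

static void toy_end(FILE *out)
{
    fflush(out);
}

static const CopyFormatRoutine toy_format = {
    "toy", toy_begin, toy_write_row, toy_end
};

/* Stand-in for a registry that extensions would add their formats to. */
static const CopyFormatRoutine *const registry[] = { &toy_format };

/* Resolve a FORMAT name to its callbacks; NULL if unknown. */
const CopyFormatRoutine *find_format(const char *name)
{
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}

/* Stand-in for the core: shape checks and iteration live here, once. */
int copy_to(FILE *out, const CopyFormatRoutine *fmt, int ncols,
            const char **colnames, const CopyRow *rows, int nrows)
{
    assert(fmt != NULL && fmt->write_row != NULL);
    if (fmt->begin)
        fmt->begin(out, ncols, colnames);
    for (int r = 0; r < nrows; r++) {
        assert(rows[r].ncols == ncols);   /* every row must match the schema */
        fmt->write_row(out, &rows[r]);
    }
    if (fmt->end)
        fmt->end(out);
    return nrows;                          /* number of rows written */
}
```

A caller would resolve the format by name, for example `find_format("toy")`, and pass the result to `copy_to()` together with the column names and rows. Adding another format in this sketch means adding one more `CopyFormatRoutine` to the registry, without touching the driver.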