Re: Emitting JSON to file using COPY TO

Поиск
Список
Период
Сортировка
От Sehrope Sarkuni
Тема Re: Emitting JSON to file using COPY TO
Дата
Msg-id CAH7T-aoWF9oHuUC2PJ303YVRFurkLMPayHawZayqdhtT2fpEHg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Emitting JSON to file using COPY TO  (Andrew Dunstan <andrew@dunslane.net>)
Ответы Re: Emitting JSON to file using COPY TO  (Joe Conway <mail@joeconway.com>)
Список pgsql-hackers
Big +1 to this overall feature.

This is something I've wanted for a long time as well. While it's possible to use a COPY with text output for a trivial case, the double escaping falls apart quickly for arbitrary data. It's really only usable when you know exactly what you are querying and know it will not be a problem.

Regarding the defaults for the output, I think JSON lines (rather than a JSON array of objects) would be preferred. It's more natural to combine them and generate that type of data on the fly rather than forcing aggregation into a single object.

Couple more features / use cases come to mind as well. Even if they're not part of a first round of this feature I think it'd be helpful to document them now as it might give some ideas for what does make that first cut:

1. Outputting a top level JSON object without the additional column keys. IIUC, the top level keys are always the column names. A common use case would be a single json/jsonb column that is already formatted exactly as the user would like for output. Rather than enveloping it in an object with a dedicated key, it would be nice to be able to output it directly. This would allow non-object results to be outputted as well (e.g., lines of JSON arrays, numbers, or strings). Due to how JSON is structured, I think this would play nice with the JSON lines v.s. array concept.

COPY (SELECT json_build_object('foo', x) AS i_am_ignored FROM generate_series(1, 3) x) TO STDOUT WITH (FORMAT JSON, SOME_OPTION_TO_NOT_ENVELOPE)
{"foo":1}
{"foo":2}
{"foo":3}

2. An option to ignore null fields so they are excluded from the output. This would not be a default but would allow shrinking the total size of the output data in many situations. This would be recursive to allow nested objects to be shrunk down (not just the top level). This might be worthwhile as a standalone JSON function though handling it during output would be more efficient as it'd only be read once.

COPY (SELECT json_build_object('foo', CASE WHEN x > 1 THEN x END) FROM generate_series(1, 3) x) TO STDOUT WITH (FORMAT JSON, SOME_OPTION_TO_NOT_ENVELOPE, JSON_SKIP_NULLS)
{}
{"foo":2}
{"foo":3}

3. Reverse of #2 when copying data in to allow defaulting missing fields to NULL.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Remove MSVC scripts from the tree
Следующее
От: "Tristan Partin"
Дата:
Сообщение: Re: Clean up some signal usage mainly related to Windows