Re: Proposal to use JSON for Postgres Parser format

Поиск
Список
Период
Сортировка
От Matthias van de Meent
Тема Re: Proposal to use JSON for Postgres Parser format
Дата
Msg-id CAEze2WgeznFH0yQVcLa+-oV6Uiq4MMGsrZ6vmr-Q0nyZg3dn-w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Proposal to use JSON for Postgres Parser format  (Michel Pelletier <pelletier.michel@gmail.com>)
Список pgsql-hackers
On Mon, 31 Oct 2022 at 15:56, Michel Pelletier
<pelletier.michel@gmail.com> wrote:
> On Mon, Oct 31, 2022 at 6:15 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>> On Mon, 31 Oct 2022 at 13:46, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>> On Fri, Oct 28, 2022 at 4:27 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>>> On 2022-10-27 Th 19:38, Andres Freund wrote:
>>>>> Hi,
>>>>>
>>>>> On 2022-09-19 22:29:15 -0400, Tom Lane wrote:
>>>>>> Maybe a compromise could be found whereby we provide a conversion function
>>>>>> that converts whatever the catalog storage format is to some JSON
>>>>>> equivalent.  That would address the needs of external code that doesn't want
>>>>>> to write a custom parser, while not tying us directly to JSON.
>>>>> +1
>>>>
>>>> Agreed.
>>>
>>> +1
>>>
>>> Michel, it seems that you now have a green light to implement node to
>>> json function.
>>
>> I think that Tom's proposal that we +1 is on a pg_node_tree to json
>> SQL function / cast; which is tangentially related to the "nodeToJson
>> / changing the storage format of pg_node_tree to json" proposal, but
>> not the same.
>
>
> I agree.
>
>>
>> I will add my +1 to Tom's proposal for that function/cast, but I'm not
>> sure on changing the storage format of pg_node_tree to json.
>
>
> I'm going to spike on this function and will get back to the thread with any updates.

Michel, did you get a result from this spike?

I'm asking, because as I spiked most of my ideas on updating the node
text format, and am working on wrapping it up into a patch (or
patchset) later this week. The ideas for this are:

1. Don't write fields with default values for their types, such as
NULL for Node* fields;
2. Reset location fields before transforming the node tree to text
when we don't have a copy of the original query, which removes
location fields from serialization with step 1;
3. Add default() node labels to struct fields that do not share the
field type's default, allowing more fields to be omitted with step 1;
4. Add special default_ref() pg_node_attr for node fields that default
to other node field's values, used in Var's varnosyn/varattnosyn as
refering to varno/varattno; and
5. Truncate trailing 0s in Const' outDatum notation of by-ref types,
so that e.g. Consts with `name` data don't waste so much space with 0s

Currently, it reduces the pg_total_relation_size metric of pg_rewrite
after TOAST compression by 35% vs pg16, down to 483328 bytes / 59
pages, from 753664 bytes / 92 pages. The raw size of the ev_action
column's data (that is, before compression) is reduced by 55% to
1.18MB (from 2.80MB), and the largest default shipped row (the
information_schema.columns view) in that table is reduced to 'only'
78kB raw, from 193kB.

RW performance hasn't been tested yet, so that is still to be determined...

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: jian he
Дата:
Сообщение: Re: PATCH: Add REINDEX tag to event triggers