Thread: [PATCH] Introduce unified support for composite GUC options

[PATCH] Introduce unified support for composite GUC options

From
Чумак Антон
Date:

Hello hackers,

This patch adds a unified mechanism for declaring and using composite configuration options in GUC, eliminating the need to write a custom parser for each new complex data type.  New syntax for end user is json-like.
Currently, adding a new composite configuration option requires a significant amount of boilerplate code and imposes a cost on two groups:

  1. For DBAs: Learning a new syntax for each composite option.
  2. For developers: Implementing a new parser from scratch for each composite type in GUC.

This patch solves these problems by providing a declarative system for defining composite types and their structure.

Major changes:

- guc_tables.h: Added new type config_composite for all composite configuration options.

- guc_composite.c: Contains all functions related to composite options: calculating alignments, defining field types, managing memory, and serialization. These functions are used in guc.c and guc_composite_gram.y.

- guc.c: New code here describes the behavior of the system in the PGC_COMPOSITE case.

- guc_composite_scan.l, guc_composite_gram.y: The lexer and parser for values of composite data types.

 

Usage features:

The mapping between the user-facing representation and the internal variables is driven by the signature that the programmer declares for the composite type. For core options, the declaration data is in the UserDefinedConfigureTypes array. For extensions, composite types are declared using the DefineCustomCompositeType function.

All declarations must be arranged topologically. That is, if type A contains a field of type B, then type B must be declared first, and only after that type A.
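The ordering rule can be illustrated with a small check. This is only a sketch under stated assumptions: TypeDecl, builtin_types and both functions are hypothetical names invented here, not part of the patch, and the list of pre-registered scalar type names is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8

/* Hypothetical declaration record: a type name plus the types of its fields. */
typedef struct TypeDecl
{
    const char *name;
    const char *field_types[MAX_FIELDS];    /* unused slots are NULL */
} TypeDecl;

/* Scalar names assumed to be pre-registered; the real list may differ. */
static const char *const builtin_types[] = {"bool", "int", "real", "string"};

/* Is "type" a built-in, or declared among the first "upto" entries? */
static bool
type_is_known(const TypeDecl *decls, size_t upto, const char *type)
{
    size_t i;

    for (i = 0; i < sizeof(builtin_types) / sizeof(builtin_types[0]); i++)
        if (strcmp(builtin_types[i], type) == 0)
            return true;
    for (i = 0; i < upto; i++)
        if (strcmp(decls[i].name, type) == 0)
            return true;
    return false;
}

/* True if every field type is declared before the type that uses it. */
static bool
decls_are_topologically_ordered(const TypeDecl *decls, size_t n)
{
    size_t i, f;

    assert(decls != NULL || n == 0);
    for (i = 0; i < n; i++)
        for (f = 0; f < MAX_FIELDS && decls[i].field_types[f] != NULL; f++)
            if (!type_is_known(decls, i, decls[i].field_types[f]))
                return false;
    return true;
}
```

With this check, declaring B and then A (where A has a field of type B) passes, while the reverse order is rejected, as is a type that refers to itself.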

The main fields in the type definition are the type name and its signature.

The type signature has the following syntax:

"field_type field_name; field_type field_name; ...; field_type field_name"

where field_type is an already registered type and field_name is the field name.

There are also data types that do not need to be declared - these are arrays. So, if there is a registered type A, then the following data types automatically become available: A[n] is a static array of length n and A[] is a dynamic array. 

Note that the declared type signature must exactly match the signature of the structure in the C code, since it will then be used to calculate the alignment of fields according to the rules of the C language.

Dynamic arrays are always mapped to a structure like:

struct DynArr {
    void *data;  /* pointer to the data */
    int   size;  /* length of the array */
};
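To see why the signature has to mirror the C struct exactly, here is a minimal sketch of computing field offsets with the usual C alignment rule (round the running offset up to the field's alignment). NodeConfig, its imagined signature, the scalar type names in it, and the helpers align_up, place_field and PLACE are all hypothetical; the patch's actual code in guc_composite.c may work differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Layout described in the message for dynamic arrays. */
typedef struct DynArr
{
    void *data;     /* pointer to the data */
    int   size;     /* length of the array */
} DynArr;

/*
 * Hypothetical option struct. Its imagined signature would be
 * "bool enabled; int port; int[] peers".
 */
typedef struct NodeConfig
{
    bool   enabled;
    int    port;
    DynArr peers;
} NodeConfig;

/* Round offset up to the next multiple of align (a power of two). */
static size_t
align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Place one field: return its offset and advance *cursor past it. */
static size_t
place_field(size_t *cursor, size_t size, size_t align)
{
    size_t offset = align_up(*cursor, align);

    *cursor = offset + size;
    return offset;
}

/* Convenience wrapper taking a C type (uses C11 _Alignof). */
#define PLACE(cursor, type) place_field((cursor), sizeof(type), _Alignof(type))
```

Walking the fields in signature order with PLACE reproduces what offsetof reports for NodeConfig on common ABIs. If the signature listed the fields in a different order or with different types, the computed offsets would point at the wrong bytes of the struct.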

After declaring the type, you can declare a configuration option of that composite type. Core options are declared in the guc_parameters.dat file. They must specify type => 'composite' and give, in the type_name field, the name of the composite type that was declared earlier. In the boot_val field, write a pointer to a global variable that will store this value. Options from extensions are declared using DefineCustomCompositeVariable.

 

Now you can use the following syntax to work with the new options both in the configuration file and in psql:

Access to a struct field: option_name->field_name

Access to an array element: option_name[index]

You can combine these access methods.

Dynamic arrays always have implicit fields data and size. data is the data of the array, size is its length.

Values of composite types have the following syntax:

Structures: {field: value, ..., field: value}

Static arrays: [index: value, index: value]

As mentioned earlier, dynamic arrays have implicit fields, so there are two syntaxes for setting their values: compact (same as for static arrays) and extended:

{data: [index: value, ..., index: value], size: value}

It is not necessary to write indexes in array values. If you write without indexes, it is assumed that indexing starts from 0 with an increment of 1. Within one array, either all elements must carry indexes or none may.
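The indexing rule can be sketched as a post-parse normalization step. ArrayEntry and normalize_array_indexes are hypothetical names used only for illustration, and integer element values are assumed for brevity; this is not the patch's parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parsed array element: "index: value" or just "value". */
typedef struct ArrayEntry
{
    bool has_index;     /* was an explicit index written? */
    int  index;
    int  value;
} ArrayEntry;

/*
 * Reject arrays that mix indexed and unindexed elements. For a fully
 * unindexed array, assign implicit indexes 0, 1, 2, ...
 */
static bool
normalize_array_indexes(ArrayEntry *entries, size_t n)
{
    size_t i;

    assert(entries != NULL || n == 0);
    for (i = 1; i < n; i++)
        if (entries[i].has_index != entries[0].has_index)
            return false;

    if (n > 0 && !entries[0].has_index)
        for (i = 0; i < n; i++)
        {
            entries[i].index = (int) i;
            entries[i].has_index = true;
        }
    return true;
}
```

Under these assumptions, [10, 20, 30] becomes [0: 10, 1: 20, 2: 30], explicitly indexed input such as [3: 7, 1: 9] is kept as written, and a mixed form such as [2: 10, 20] is rejected.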

When using the SHOW command, the display of a dynamic array depends on the extended_guc_arrays option. If this flag is true, the extended form is used; otherwise the compact form is used.

String values within composite types also support escape sequences.

All the functionality available to scalar options is also supported, such as: …

The system uses incremental semantics. This means that when a value is set in a .conf file or with the SET command, only the specified fields of the structure are changed; the remaining fields are left untouched. These semantics also apply to ALTER SYSTEM: the current value, with the fields named in the command changed, is written to the .auto.conf file, while the current value of the option itself does not change.

The patch applies cleanly to the master (454c046094ab3431c2ce0c540c46e623bc05bd1a).

In the additional patch (guc_composite_types_tests.patch), I added several composite options so that the new functionality could be tested using their example. Regression and TAP tests were written for them in the same patch.

I would appreciate any feedback and review.

 

Best regards,
Anton Chumak


Re: [PATCH] Introduce unified support for composite GUC options

From
Чумак Антон
Date:

Hello hackers,

The new version of the patch adds support for multi-line writing of composite type values in the postgresql.conf file. Hidden fields have also been added. Such fields may be required to protect the private part of the state of a composite option from an external user. In order for the field to be hidden, the composite type signature must describe only the field type without the field name. 

Please note that all allocated resources used within hidden fields should use only guc_malloc. This is necessary to automatically release resources.

The patch applies cleanly to the master (9fc7f6ab7226d7c9dbe4ff333130c82f92749f69)

Best regards,
Anton Chumak
 


Re: [PATCH] Introduce unified support for composite GUC options

From
Pavel Stehule
Date:
Hi

what is use case for this?

Regards

Pavel


On Mon, 22 Sep 2025 at 8:55, Чумак Антон <a.chumak@postgrespro.ru> wrote:

> Hello hackers,
>
> The new version of the patch adds support for multi-line writing of composite type values in the postgresql.conf file. Hidden fields have also been added. Such fields may be required to protect the private part of the state of a composite option from an external user. In order for the field to be hidden, the composite type signature must describe only the field type without the field name.
>
> Please note that all allocated resources used within hidden fields should use only guc_malloc. This is necessary to automatically release resources.
>
> The patch applies cleanly to the master (9fc7f6ab7226d7c9dbe4ff333130c82f92749f69)
>
> Best regards,
> Anton Chumak
 

Re: [PATCH] Introduce unified support for composite GUC options

From
Tom Lane
Date:
Чумак Антон <a.chumak@postgrespro.ru> writes:
> This patch adds a unified mechanism for declaring and using composite configuration options in GUC, eliminating the need to write a custom parser for each new complex data type.  New syntax for end user is json-like.

TBH, I think this is a bad idea altogether.  GUCs that would need
this are probably poorly designed in the first place; we should not
encourage inventing more.  I also don't love adding thousands of
lines of code without any use-case at hand.

            regards, tom lane



Re: [PATCH] Introduce unified support for composite GUC options

From
"David G. Johnston"
Date:
On Monday, September 22, 2025, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Чумак Антон <a.chumak@postgrespro.ru> writes:
>> This patch adds a unified mechanism for declaring and using composite configuration options in GUC, eliminating the need to write a custom parser for each new complex data type.  New syntax for end user is json-like.
>
> TBH, I think this is a bad idea altogether.  GUCs that would need
> this are probably poorly designed in the first place; we should not
> encourage inventing more.  I also don't love adding thousands of
> lines of code without any use-case at hand.


Yeah, there is a decent height bar for me too.  The main functional benefit we’d get is that since both (multiple) settings are being given values simultaneously the check option code can enforce that only valid combinations are ever specified instead of generally needing runtime checks.

Beyond that, just use separate options with a naming scheme.

I can maybe see this for session variables masquerading as GUCs since we lack the former.  Something like wanting to store a JWT as-is in a GUC then referencing its components.

David J.

Re: [PATCH] Introduce unified support for composite GUC options

From
Чумак Антон
Date:

Sorry, I replied to the email without the hackers tag, so some of our correspondence was not saved on hackers. Therefore, I will quote my answer and Pavel's questions and remarks below.


>>Thank you for your question!
>>Composite parameters in a configuration system are needed to describe complex objects that have many interrelated parameters. Such examples already exist in PostgreSQL: synchronous_standby_names or primary_conninfo. And with these parameters, there are some difficulties for both developers and DBMS administrators.
>Do we really need this?
>synchronous_standby_names is a simple list and primary_conninfo is just a string - consistent with any other postgresql connection string.
 

synchronous_standby_names is somewhat more complicated than a regular list. Its first field is the mode, the second is the number of required replicas, and only then is the list. Note its check hook. A parser is called there, whose code length exceeds the rest of the logic associated with this parameter. This is exactly the kind of problem the patch solves.


>If you need to store more complex values, why you don't use integrated json parser? 
>
>I don't like you introduce new independent language just for GUC and this is not really short (and it is partially redundant to json). Currently working with GUC is simple, because supported operations and formats are simple.
 

I looked at the json value parsing function with the ability to use custom semantic actions, and it might be a really great idea to use it instead of a self-written parser. Then the composite values will have the standard json syntax, and the patch will probably decrease in size.


>>For administrators:
>> 1. The value of such parameters can only be written in full as a string and there is no way to access individual fields or substructure.
>> 2. Each such parameter has its own syntax (compare the syntax description of synchronous_standby_names and primary_conninfo) 
>>For developers:
>>1. For each composite parameter, you need to write your own parser that will parse the string value, instead of just describing the logic.
>>Personally, I needed to describe the cluster configuration. A cluster consists of nodes interconnected by some logic. And it turns out that in the current system, I need to write 1 more parser for this parameter, and the user will have to learn 1 more syntax.
>>This patch creates a unified approach to creating composite options, provides a unified syntax for values of composite types, adds the ability to work with fields and substructures, and eliminates the need for developers to write their own parsers for each composite parameter
>looks like overengineering for me - when you have complex configuration - isn't better to use table? Or json value - if you need to store all to one GUC.
 

Tables are not suitable for storing configuration, because we need GUC capabilities such as analyzing the source of a new value, working at the time of postmaster startup, SET LOCAL support, etc.


>Another issue is using symbols -> for dereferencing directly from the scanner. It can break applications that use the same symbols as a custom operator.
 

I made the dereference operator look like -> because the dot is already used to separate the class of names from options. It is possible to use a dot, but then we need to agree that composite parameters and extensions must not have the same names in order to avoid collisions.

Best regards

Anton Chumak


 

Re: [PATCH] Introduce unified support for composite GUC options

From
Pavel Stehule
Date:
Hi

On Mon, 22 Sep 2025 at 23:31, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Monday, September 22, 2025, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Чумак Антон <a.chumak@postgrespro.ru> writes:
>>> This patch adds a unified mechanism for declaring and using composite configuration options in GUC, eliminating the need to write a custom parser for each new complex data type.  New syntax for end user is json-like.
>>
>> TBH, I think this is a bad idea altogether.  GUCs that would need
>> this are probably poorly designed in the first place; we should not
>> encourage inventing more.  I also don't love adding thousands of
>> lines of code without any use-case at hand.
>
> Yeah, there is a decent height bar for me too.  The main functional benefit we’d get is that since both (multiple) settings are being given values simultaneously the check option code can enforce that only valid combinations are ever specified instead of generally needing runtime checks.
>
> Beyond that, just use separate options with a naming scheme.
>
> I can maybe see this for session variables masquerading as GUCs since we lack the former.  Something like wanting to store a JWT as-is in a GUC then referencing its components.

Using GUC as session variables is a workaround because there is nothing better. But it is not good solution

- it doesn't use a native format for data; everything has to be text
- composites are not compatible with native composites, arrays are not compatible with native composites
- GUCs cannot be used simply in queries or expressions, and cannot be set from the result of a query or expression
- GUCs are not stable, so there can be some unwanted artefacts inside queries

This patch is about 40% of the size of my session variables patches, with significantly less user documentation and significantly fewer regression tests, and it does something else.

The basic question is if variables should be typed or typeless - like plpgsql or psql variables. Typeless variables are simpler to implement, but a) can be too primitive to store some more complex structures, or b) are not simple to implement after all, and then the complexity is very similar to that of typed variables. When variables are typed, it is necessary to modify plan cache invalidation (which is not necessary for typeless variables). With its own catalog, enhancing plan cache invalidation is simple and clean (because plan cache invalidation is based on watching catalog changes); without a catalog, plan cache invalidation is much dirtier work. Another question is the possibility of setting different defaults for a GUC with the ALTER command. It can be an interesting, but very dangerous feature for session variables.

GUCs are great for configuration and for holding possibly different defaults per role, database or system. PostgreSQL configuration is large, but it uses a very simple format, and I think it is working well.

Regards

Pavel

Re: [PATCH] Introduce unified support for composite GUC options

From
Pavel Stehule
Date:


On Tue, 23 Sep 2025 at 5:38, Чумак Антон <a.chumak@postgrespro.ru> wrote:

> Sorry, I replied to the email without the hackers tag, so some of our correspondence was not saved on hackers. Therefore, I will quote my answer and Pavel's questions and remarks below.
>
>>> Thank you for your question!
>>> Composite parameters in a configuration system are needed to describe complex objects that have many interrelated parameters. Such examples already exist in PostgreSQL: synchronous_standby_names or primary_conninfo. And with these parameters, there are some difficulties for both developers and DBMS administrators.
>> Do we really need this?
>> synchronous_standby_names is a simple list and primary_conninfo is just a string - consistent with any other postgresql connection string.
>
> synchronous_standby_names is somewhat more complicated than a regular list. Its first field is the mode, the second is the number of required replicas, and only then is the list. Note its check hook. A parser is called there, whose code length exceeds the rest of the logic associated with this parameter. This is exactly the kind of problem the patch solves.
>
>> If you need to store more complex values, why you don't use integrated json parser?
>>
>> I don't like you introduce new independent language just for GUC and this is not really short (and it is partially redundant to json). Currently working with GUC is simple, because supported operations and formats are simple.
>
> I looked at the json value parsing function with the ability to use custom semantic actions, and it might be a really great idea to use it instead of a self-written parser. Then the composite values will have the standard json syntax, and the patch will probably decrease in size.


when you use json, then what is the benefit from your patch?
 
It is not too big difference if I set value by SET command or by SELECT set_config()

For a complex configuration, I don't think direct modification of a complex value is the best way. You still need some helper functions, and these functions can hide all the complexity. Moreover, performance is not important there, so plpgsql can be used well, and working with json in plpgsql is almost comfortable.



Re: [PATCH] Introduce unified support for composite GUC options

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Using GUC as session variables is a workaround because there is nothing
> better. But it is not good solution

Agreed, but we don't yet have a better one ...

> The basic question is if variables should be typed or typeless - like
> plpgsql or psql variables.

I think it is absolutely critical that GUCs *not* depend on the
SQL type system in any way.  That would be a fundamental layering
violation, because we need to be able to read postgresql.conf
before we can read catalogs --- not to mention that relevant type
definitions might be different in different databases.

I'm not sure that this point means much to the feature proposed in
this thread, since IIUC it's proposing "use JSON no matter what".
But it is a big problem for trying to use GUCs as session variables
with non-built-in types.

            regards, tom lane



Re: [PATCH] Introduce unified support for composite GUC options

From
Pavel Stehule
Date:


On Tue, 23 Sep 2025 at 5:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Using GUC as session variables is a workaround because there is nothing
>> better. But it is not good solution
>
> Agreed, but we don't yet have a better one ...

But better GUCs will not make good session variables.
 

>> The basic question is if variables should be typed or typeless - like
>> plpgsql or psql variables.
>
> I think it is absolutely critical that GUCs *not* depend on the
> SQL type system in any way.  That would be a fundamental layering
> violation, because we need to be able to read postgresql.conf
> before we can read catalogs --- not to mention that relevant type
> definitions might be different in different databases.
>
> I'm not sure that this point means much to the feature proposed in
> this thread, since IIUC it's proposing "use JSON no matter what".
> But it is a big problem for trying to use GUCs as session variables
> with non-built-in types.

This is a very important note, and it clearly describes the advantages (and the purpose) of GUC.

GUC is good enough for working with text types, and our json is based on a text type.
I can serialize and deserialize any array or composite to a GUC, but it has no "nice" output in the SHOW command.

I can imagine the SET command being able to evaluate simple expressions (like the arguments of CALL statements).
But that is not possible without a compatibility break. So the right side of the SET command will always be reduced to a value or a list, and
I don't see a strong benefit in enhancing it gently.



 


Re: [PATCH] Introduce unified support for composite GUC options

From
Чумак Антон
Date:

>when you use json, then what is the benefit from your patch?

json is just a syntax. This is only part of the patch. The main feature is that we can directly, in a standard way, without the efforts of developers, translate composite values from user interfaces like psql or postgresql.conf into structures in C code. With this patch, the configuration system gains the ability to correctly manage the state of composite objects. This is important when you need to change 2 out of 5 fields at the same time so that the structure remains consistent. In addition, the new configuration module takes over the management of resources within the framework, which can be important for strings and dynamic arrays. There are other auxiliary features like hidden fields.


>It is not too big difference if I set value by SET command or by SELECT set_config()

Working with parameters is not limited to working within a session; otherwise the PGC_INTERNAL, PGC_POSTMASTER, and PGC_SIGHUP contexts would not be needed. My patch provides unified support for composite types in such contexts as well. Example: you have a composite boot value, and in the postgresql.conf file you need to change only 2 fields, and you need to do this at the same time to maintain the consistency of the structure. Today you would have to describe all the fields in one big line; with the patch you can describe only the changed fields.


Best regards

Anton Chumak

 

Re: [PATCH] Introduce unified support for composite GUC options

From
Pavel Stehule
Date:


On Tue, 23 Sep 2025 at 6:33, Чумак Антон <a.chumak@postgrespro.ru> wrote:

>> when you use json, then what is the benefit from your patch?
>
> json is just a syntax. This is only part of the patch. The main feature is that we can directly, in a standard way, without the efforts of developers, translate composite values from user interfaces like psql or postgresql.conf into structures in C code. With this patch, the configuration system gains the ability to correctly manage the state of composite objects. This is important when you need to change 2 out of 5 fields at the same time so that the structure remains consistent. In addition, the new configuration module takes over the management of resources within the framework, which can be important for strings and dynamic arrays. There are other auxiliary features like hidden fields.

How common are composites in configuration? It goes against the simplicity of configuration.

And if you really need it - you can use plpgsql code and set_config function.


 


>> It is not too big difference if I set value by SET command or by SELECT set_config()
>
> Working with parameters is not limited to working within a session; otherwise the PGC_INTERNAL, PGC_POSTMASTER, and PGC_SIGHUP contexts would not be needed. My patch provides unified support for composite types in such contexts as well. Example: you have a composite boot value, and in the postgresql.conf file you need to change only 2 fields, and you need to do this at the same time to maintain the consistency of the structure. Today you would have to describe all the fields in one big line; with the patch you can describe only the changed fields.

Your patch does just the parsing. In the end, you still need to validate the values.
 



 

Re: [PATCH] Introduce unified support for composite GUC options

From
"David G. Johnston"
Date:
On Mon, Sep 22, 2025 at 10:33 PM Чумак Антон <a.chumak@postgrespro.ru> wrote:

> Working with parameters is not limited to working within a session; otherwise the PGC_INTERNAL, PGC_POSTMASTER, and PGC_SIGHUP contexts would not be needed. My patch provides unified support for composite types in such contexts as well. Example: you have a composite boot value, and in the postgresql.conf file you need to change only 2 fields, and you need to do this at the same time to maintain the consistency of the structure. Today you would have to describe all the fields in one big line; with the patch you can describe only the changed fields.

You might wish to try an approach where people who do think such a thing might be useful can voice their support for it rather than trying to convince people who don't presently see any immediate use cases that there are some (without saying what those use cases are...).

In short - post to -general.

As you note - moving runtime checks to "SET" time has value and this patch brings that value.  But it is not evident there is enough value to take on the added complexity.  There are few to no requests asking for this ability.

David J.

Re: [PATCH] Introduce unified support for composite GUC options

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> As you note - moving runtime checks to "SET" time has value and this patch
> brings that value.  But it is not evident there is enough value to take on
> the added complexity.  There are few to no requests asking for this ability.

If anything, I'd say we have decades of experience showing that early
checking of GUC values creates more problems than it solves.  There
are too many cases where necessary context is not available at the
time of setting the value.  Particularly, CREATE FUNCTION ... SET
and ALTER DATABASE/USER ... SET are problematic for this.

            regards, tom lane