Обсуждение: Portal->commandTag as an enum

Поиск
Список
Период
Сортировка

Portal->commandTag as an enum

От
Mark Dilger
Дата:
Hackers,

I have implemented $subject, attached.


While reviewing the "New SQL counter statistics view (pg_stat_sql)” thread [1], I came across Andres’ comment

> That's not really something in this patch, but a lot of this would be
> better if we didn't internally have command tags as strings, but as an
> enum.  We should really only convert to a string with needed.  That
> we're doing string comparisons on Portal->commandTag is just plain bad.
>
>
>
> If so, we could also make this whole patch a lot cheaper - have a fixed
> size array that has an entry for every possible tag (possibly even in
> shared memory, and then use atomics there).



I put the CommandTag enum in src/common because there wasn’t any reason not to do so.  It seems plausible that frontend
testframeworks might want access to this enum.  I don’t have any frontend code using it yet, nor any concrete plans for
that. I’m indifferent about this, and will move it into src/backend if folks think that’s better. 


In commands/event_trigger.c, I changed the separation between EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED and
EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED. It used to claim not to recognize command tags that are indeed recognized
elsewherein the system, but simply not expected here.  It now returns “not supported” for them, and only returns “not
recognized”for special enum values COMMANDTAG_NULL and COMMANDTAG_UNKNOWN, as well as values outside the recognized
rangeof the enum.  I’m happy to change my implementation to preserve the old behavior if necessary.  Is there a
backwardcompatibility issue here?  It does not impact regression test output for me to change this, but that’s not
definitive….

I have extended the event_trigger.sql regression test, with new expected output, and when applying that change to
master,the test fails due to the “not supported” vs. “not recognized” distinction.  I have kept this regression test
changein its own patch file, 0002.  The differences when applied to master look like: 

>  create event trigger regress_event_trigger_ALTER_SYSTEM on ddl_command_start
>     when tag in ('ALTER SYSTEM')
>     execute procedure test_event_trigger2();
> -ERROR:  event triggers are not supported for ALTER SYSTEM
> +ERROR:  filter value "ALTER SYSTEM" not recognized for filter variable "tag"



PreventCommandIfReadOnly and PreventCommandIfParallelMode sometimes take a commandTag, but in commands/sequence.c they
takestrings “nextval()” and “setval()”.  Likewise, PreventCommandDuringRecovery takes "txid_current()” in adt/txid.c.
Ihad to work around this a little, which was not hard to do, but it made me wonder if command tags and these sorts of
functionsshouldn’t be unified a bit more.  They don’t really look consistent with all the other values in the
CommandTagenum, so I left them out.  I’m open to opinions about this. 


There was some confusion in the code between a commandTag and a completionTag, with the commandTag getting overwritten
withthe completionTag over the course of execution.  I’ve split that out into two distinctly separate concepts, which I
thinkmakes the code easier to grok.  I’ve added a portal->completionTag field that is a fixed size buffer rather than a
palloc’dstring, to match how completionTag works elsewhere.  But the old code that was overwriting the commandTag (a
palloc’dstring) with a completionTag (a char[] buffer) was using pstrdup for that purpose.  I’m now using strlcpy.  I
don’tcare much which way to go here (buffer vs. palloc’d string).  Let me know if using a fixed sized buffer as I’ve
donebothers anybody. 


There were some instances of things like:

  strcpy(completionTag, portal->commandTag);

which should have more properly been

  strlcpy(completionTag, portal->commandTag, COMPLETION_TAG_BUFSIZE);

I don’t know if any of these were live bugs, but they seemed like traps for the future, should any new commandTag
lengthoverflow the buffer size.  I think this patch fixes all of those cases. 


Generating CommandTag enum values from user queries and then converting those back to string for printing or use in
set_ps_displayresults in normalization of the commandTag, by which I mean that it becomes all uppercase.  I don’t know
ofany situations where this would matter, but I can’t say for sure that it doesn’t.  Anybody have thoughts on that? 


[1] https://www.postgresql.org/message-id/flat/CAJrrPGeY4xujjoR=z=KoyRMHEK_pSjjp=7VBhOAHq9rfgpV7QQ@mail.gmail.com



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Вложения

Re: Portal->commandTag as an enum

От
Tom Lane
Дата:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
> I put the CommandTag enum in src/common because there wasn’t any reason
> not to do so.  It seems plausible that frontend test frameworks might want
> access to this enum.

Au contraire, that's an absolutely fundamental mistake.  There is
zero chance of this enum holding still across PG versions, so if
we expose it to frontend code, we're going to have big problems
with cross-version compatibility.  See our historical problems with
assuming the enum for character set encodings was the same between
frontend and backend ... and that set is orders of magnitude more
stable than this one.

As I recall, the hardest problem with de-string-ifying this is the fact
that for certain tags we include a rowcount in the string.  I'd like to
see that undone --- we have to keep it like that on-the-wire to avoid a
protocol break, but it'd be best if noplace inside the backend did it that
way, and we converted at the last moment before sending a CommandComplete
to the client.  Your references to "completion tags" make it sound like
you've only half done the conversion (though I've not read the patch
in enough detail to verify).

BTW, the size of the patch is rather depressing, especially if it's
only half done.  Unlike Andres, I'm not even a little bit convinced
that this is worth the amount of code churn it'll cause.  Have you
done any code-size or speed comparisons?

            regards, tom lane



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 2, 2020, at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Mark Dilger <mark.dilger@enterprisedb.com> writes:
>> I put the CommandTag enum in src/common because there wasn’t any reason
>> not to do so.  It seems plausible that frontend test frameworks might want
>> access to this enum.
>

Thanks for looking!

> Au contraire, that's an absolutely fundamental mistake.  There is
> zero chance of this enum holding still across PG versions, so if
> we expose it to frontend code, we're going to have big problems
> with cross-version compatibility.  See our historical problems with
> assuming the enum for character set encodings was the same between
> frontend and backend ... and that set is orders of magnitude more
> stable than this one.

I completely agree that this enum cannot be expected to remain stable across versions.

For the purposes of this patch, which has nothing to do with frontend tools, this issue doesn’t matter to me.  I’m
happyto move this into src/backend. 

Is there no place to put code which would be useful for frontend tools without implying stability?  Sure, psql and
friendscan’t use it, because they need to be able to connect to servers of other versions.  But why couldn’t a test
frameworktool use something like this?  Could we have someplace like src/common/volatile for this sort of thing? 

>
> As I recall, the hardest problem with de-string-ifying this is the fact
> that for certain tags we include a rowcount in the string.  I'd like to
> see that undone --- we have to keep it like that on-the-wire to avoid a
> protocol break, but it'd be best if noplace inside the backend did it that
> way, and we converted at the last moment before sending a CommandComplete
> to the client.  Your references to "completion tags" make it sound like
> you've only half done the conversion (though I've not read the patch
> in enough detail to verify).

In v1, I stayed closer to the existing code structure than you are requesting.  I like the direction you’re suggesting
thatI go, and I’ve begun that transition in anticipation of posting a v2 patch set soon. 

> BTW, the size of the patch is rather depressing, especially if it's
> only half done.  Unlike Andres, I'm not even a little bit convinced
> that this is worth the amount of code churn it'll cause.  Have you
> done any code-size or speed comparisons?

A fair amount of the code churn is replacing strings with their enum equivalent, creating the enum itself, and creating
adata table mapping enums to strings.  The churn doesn’t look too bad to me when viewing the original vs new code diff
side-by-side.

The second file (v1-0002…) is entirely an extension of the regression tests.  Applying v1-0001… doesn’t entail needing
toapply v1-0002… as the code being tested exists before and after the patch.  If you don’t want to apply that
regressiontest change, that’s fine.  It just provides more extensive coverage of event_triggers over different command
tags.

There will be a bit more churn in v2, since I’m changing the code flow a bit more to avoid generating the strings until
theyare about to get sent to the client, per your comments above.  That has the advantage that multiple places in the
oldcode where the completionTag was parsed to get the nprocessed count back out now doesn’t need any parsing. 

I’ll include stats about code-size and speed when I post v2.

Thanks again for reviewing my patch idea!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 3, 2020, at 9:41 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> In v1, I stayed closer to the existing code structure than you are requesting.  I like the direction you’re
suggestingthat I go, and I’ve begun that transition in anticipation of posting a v2 patch set soon. 

Ok, here is v2, attached.

In master, a number of functions pass a char *completionTag argument (really a char
completionTag[COMPLETION_TAG_BUFSIZE])which gets filled in with the string to return to the client from EndCommand.  I
haveremoved that kind of logic: 

-               /* save the rowcount if we're given a completionTag to fill */
-               if (completionTag)
-                       snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
-                                        "SELECT " UINT64_FORMAT,
-                                        queryDesc->estate->es_processed);

In the patch, this is replaced with a new struct, QueryCompletionData.  That bit of code above is replaced with:

+               /* save the rowcount if we're given a qc to fill */
+               if (qc)
+                       SetQC(qc, COMMANDTAG_SELECT, queryDesc->estate->es_processed, DISPLAYFORMAT_NPROCESSED);

For wire protocol compatibility, we have to track the display format.  When this gets to EndCommand, the display format
allowsthe string to be written exactly as the client will expect.  If we ever get to the point where we can break with
thatcompatibility, the third member of this struct, display_format, can be removed. 

Where string parsing is being done in master to get the count back out, it changes to look like this:

-                                       if (strncmp(completionTag, "SELECT ", 7) == 0)
-                                               _SPI_current->processed =
-                                                       pg_strtouint64(completionTag + 7, NULL, 10);
+                                       if (qcdata.commandTag == COMMANDTAG_SELECT)
+                                               _SPI_current->processed = qcdata.nprocessed;

One of the advantages to the patch is that the commandTag for a portal is not overwritten by the commandTag in the
QueryCompletionData,meaning for example that if an EXECUTE command returns the string “UPDATE 0”, the
portal->commandTagremains COMMANDTAG_EXECUTE while the qcdata.commandTag becomes COMMANDTAG_UPDATE.  This could be
helpfulto code trying to track how many operations of a given type have run. 

In event_trigger.c, in master there are ad-hoc comparisons against c-strings:

-       /*
-        * Handle some idiosyncratic special cases.
-        */
-       if (pg_strcasecmp(tag, "CREATE TABLE AS") == 0 ||
-               pg_strcasecmp(tag, "SELECT INTO") == 0 ||
-               pg_strcasecmp(tag, "REFRESH MATERIALIZED VIEW") == 0 ||
-               pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
-               pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 0 ||
-               pg_strcasecmp(tag, "COMMENT") == 0 ||
-               pg_strcasecmp(tag, "GRANT") == 0 ||
-               pg_strcasecmp(tag, "REVOKE") == 0 ||
-               pg_strcasecmp(tag, "DROP OWNED") == 0 ||
-               pg_strcasecmp(tag, "IMPORT FOREIGN SCHEMA") == 0 ||
-               pg_strcasecmp(tag, "SECURITY LABEL") == 0)

These are replaced by switch() case statements over the possible commandTags:

+       switch (commandTag)
+       {
+                       /*
+                        * Supported idiosyncratic special cases.
+                        */
+               case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
+               case COMMANDTAG_ALTER_LARGE_OBJECT:
+               case COMMANDTAG_COMMENT:
+               case COMMANDTAG_CREATE_TABLE_AS:
+               case COMMANDTAG_DROP_OWNED:
+               case COMMANDTAG_GRANT:
+               case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
+               case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
+               case COMMANDTAG_REVOKE:
+               case COMMANDTAG_SECURITY_LABEL:
+               case COMMANDTAG_SELECT_INTO:

I think this is easier to read, verify, and maintain.  The compiler can help if you leave a command tag out of the
list,which the compiler cannot help discover in master as it is currently written.  But I also think all those
pg_strcasecmpcalls are likely more expensive at runtime. 

In master, EventTriggerCacheItem tracks a sorted array of palloc’d cstrings.  In the patch, that becomes a Bitmapset
overthe enum: 

typedef struct
 {
        Oid                     fnoid;                  /* function to be called */
        char            enabled;                /* as SESSION_REPLICATION_ROLE_* */
-       int                     ntags;                  /* number of command tags */
-       char      **tag;                        /* command tags in SORTED order */
+       Bitmapset  *tagset;                     /* command tags, or NULL if empty */
 } EventTriggerCacheItem;

The code in evtcache.c is shorter and, in my opinion, easier to read.  In filter_event_trigger, rather than running
bsearchthrough a sorted array of strings, it just runs bms_is_member. 

I’ve kept this change to the event trigger code in its own separate patch file, to make the change easier to review in
isolation.

> I’ll include stats about code-size and speed when I post v2.

The benchmarks are from tpc-b_96.sql.  I think I’ll need to adjust the benchmark to put more emphasis on the particular
codethat I’m changing, but I have run this standard benchmark for this email: 

For master (1fd687a035):

    postgresql % find src -type f -name "*.c" -or -name "*.h" | xargs cat | wc
     1482117 5690660 45256959

    postgresql % find src -type f -name "*.o" | xargs cat | wc
       38283  476264 18999164

Averages for test set 1 by scale:
set    scale    tps    avg_latency    90%<    max_latency
1    1    3741    1.734    3.162    133.718
1    9    6124    0.904    1.05    230.547
1    81    5921    0.931    1.015    67.023

Averages for test set 1 by clients:
set    clients    tps    avg_latency    90%<    max_latency
1    1    2163    0.461    0.514    24.414
1    4    5968    0.675    0.791    40.354
1    16    7655    2.433    3.922    366.519


For command tag patch (branched from 1fd687a035):

    postgresql % find src -type f -name "*.c" -or -name "*.h" | xargs cat | wc
     1482969 5691908 45281399

    postgresql % find src -type f -name "*.o" | xargs cat | wc
       38209  476243 18999752


Averages for test set 1 by scale:
set    scale    tps    avg_latency    90%<    max_latency
1    1    3877    1.645    3.066    24.973
1    9    6383    0.859    1.032    64.566
1    81    5945    0.925    1.023    162.9

Averages for test set 1 by clients:
set    clients    tps    avg_latency    90%<    max_latency
1    1    2141    0.466    0.522    11.531
1    4    5967    0.673    0.783    136.882
1    16    8096    2.292    3.817    104.026





—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Вложения

Re: Portal->commandTag as an enum

От
Andres Freund
Дата:
Hi,

On 2020-02-04 18:18:52 -0800, Mark Dilger wrote:
> In master, a number of functions pass a char *completionTag argument (really a char
completionTag[COMPLETION_TAG_BUFSIZE])which gets filled in with the string to return to the client from EndCommand.  I
haveremoved that kind of logic:
 
>
> -               /* save the rowcount if we're given a completionTag to fill */
> -               if (completionTag)
> -                       snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
> -                                        "SELECT " UINT64_FORMAT,
> -                                        queryDesc->estate->es_processed);
>
> In the patch, this is replaced with a new struct, QueryCompletionData.  That bit of code above is replaced with:
>
> +               /* save the rowcount if we're given a qc to fill */
> +               if (qc)
> +                       SetQC(qc, COMMANDTAG_SELECT, queryDesc->estate->es_processed, DISPLAYFORMAT_NPROCESSED);
>
> For wire protocol compatibility, we have to track the display format.
> When this gets to EndCommand, the display format allows the string to
> be written exactly as the client will expect.  If we ever get to the
> point where we can break with that compatibility, the third member of
> this struct, display_format, can be removed.

Hm. While I like not having this as strings a lot, I wish we could get
rid of this displayformat stuff.



> These are replaced by switch() case statements over the possible commandTags:
>
> +       switch (commandTag)
> +       {
> +                       /*
> +                        * Supported idiosyncratic special cases.
> +                        */
> +               case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
> +               case COMMANDTAG_ALTER_LARGE_OBJECT:
> +               case COMMANDTAG_COMMENT:
> +               case COMMANDTAG_CREATE_TABLE_AS:
> +               case COMMANDTAG_DROP_OWNED:
> +               case COMMANDTAG_GRANT:
> +               case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
> +               case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
> +               case COMMANDTAG_REVOKE:
> +               case COMMANDTAG_SECURITY_LABEL:
> +               case COMMANDTAG_SELECT_INTO:

The number of these makes me wonder if we should just have a metadata
table in one place, instead of needing to edit multiple
locations. Something like

const ... CommandTagBehaviour[] = {
    [COMMANDTAG_INSERT] = {
        .display_processed = true, .display_oid = true, ...},
    [COMMANDTAG_CREATE_TABLE_AS] = {
        .event_trigger = true, ...},

with the zero initialized defaults being the common cases.

Not sure if it's worth going there. But it's maybe worth thinking about
for a minute?


> Averages for test set 1 by scale:
> set    scale    tps    avg_latency    90%<    max_latency
> 1    1    3741    1.734    3.162    133.718
> 1    9    6124    0.904    1.05    230.547
> 1    81    5921    0.931    1.015    67.023
>
> Averages for test set 1 by clients:
> set    clients    tps    avg_latency    90%<    max_latency
> 1    1    2163    0.461    0.514    24.414
> 1    4    5968    0.675    0.791    40.354
> 1    16    7655    2.433    3.922    366.519
>
>
> For command tag patch (branched from 1fd687a035):
>
>     postgresql % find src -type f -name "*.c" -or -name "*.h" | xargs cat | wc
>      1482969 5691908 45281399
>
>     postgresql % find src -type f -name "*.o" | xargs cat | wc
>        38209  476243 18999752
>
>
> Averages for test set 1 by scale:
> set    scale    tps    avg_latency    90%<    max_latency
> 1    1    3877    1.645    3.066    24.973
> 1    9    6383    0.859    1.032    64.566
> 1    81    5945    0.925    1.023    162.9
>
> Averages for test set 1 by clients:
> set    clients    tps    avg_latency    90%<    max_latency
> 1    1    2141    0.466    0.522    11.531
> 1    4    5967    0.673    0.783    136.882
> 1    16    8096    2.292    3.817    104.026

Not bad.


> diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
> index 9aa2b61600..5322c14ce4 100644
> --- a/src/backend/commands/async.c
> +++ b/src/backend/commands/async.c
> @@ -594,7 +594,7 @@ pg_notify(PG_FUNCTION_ARGS)
>          payload = text_to_cstring(PG_GETARG_TEXT_PP(1));
>
>      /* For NOTIFY as a statement, this is checked in ProcessUtility */
> -    PreventCommandDuringRecovery("NOTIFY");
> +    PreventCommandDuringRecovery(COMMANDTAG_NOTIFY);
>
>      Async_Notify(channel, payload);
>
> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
> index 40a8ec1abd..4828e75bd5 100644
> --- a/src/backend/commands/copy.c
> +++ b/src/backend/commands/copy.c
> @@ -1063,7 +1063,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
>
>          /* check read-only transaction and parallel mode */
>          if (XactReadOnly && !rel->rd_islocaltemp)
> -            PreventCommandIfReadOnly("COPY FROM");
> +            PreventCommandIfReadOnly(COMMANDTAG_COPY_FROM);
>
>          cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
>                                 NULL, stmt->attlist, stmt->options);

I'm not sure this really ought to be part of this change - seems like a
somewhat independent change to me. With less obvious benefits.



>  static event_trigger_command_tag_check_result
> -check_ddl_tag(const char *tag)
> +check_ddl_tag(CommandTag commandTag)
>  {
> -    const char *obtypename;
> -    const event_trigger_support_data *etsd;
> +    switch (commandTag)
> +    {
> +            /*
> +             * Supported idiosyncratic special cases.
> +             */
> +        case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
> +        case COMMANDTAG_ALTER_LARGE_OBJECT:
> +        case COMMANDTAG_COMMENT:
> +        case COMMANDTAG_CREATE_TABLE_AS:
> +        case COMMANDTAG_DROP_OWNED:
> +        case COMMANDTAG_GRANT:
> +        case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
> +        case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
> +        case COMMANDTAG_REVOKE:
> +        case COMMANDTAG_SECURITY_LABEL:
> +        case COMMANDTAG_SELECT_INTO:
>
> -    /*
> -     * Handle some idiosyncratic special cases.
> -     */
> -    if (pg_strcasecmp(tag, "CREATE TABLE AS") == 0 ||
> -        pg_strcasecmp(tag, "SELECT INTO") == 0 ||
> -        pg_strcasecmp(tag, "REFRESH MATERIALIZED VIEW") == 0 ||
> -        pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
> -        pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 0 ||
> -        pg_strcasecmp(tag, "COMMENT") == 0 ||
> -        pg_strcasecmp(tag, "GRANT") == 0 ||
> -        pg_strcasecmp(tag, "REVOKE") == 0 ||
> -        pg_strcasecmp(tag, "DROP OWNED") == 0 ||
> -        pg_strcasecmp(tag, "IMPORT FOREIGN SCHEMA") == 0 ||
> -        pg_strcasecmp(tag, "SECURITY LABEL") == 0)
> -        return EVENT_TRIGGER_COMMAND_TAG_OK;
> +            /*
> +             * Supported CREATE commands
> +             */
> +        case COMMANDTAG_CREATE_ACCESS_METHOD:
> +        case COMMANDTAG_CREATE_AGGREGATE:
> +        case COMMANDTAG_CREATE_CAST:
> +        case COMMANDTAG_CREATE_COLLATION:
> +        case COMMANDTAG_CREATE_CONSTRAINT:
> +        case COMMANDTAG_CREATE_CONVERSION:
> +        case COMMANDTAG_CREATE_DOMAIN:
> +        case COMMANDTAG_CREATE_EXTENSION:
> +        case COMMANDTAG_CREATE_FOREIGN_DATA_WRAPPER:
> +        case COMMANDTAG_CREATE_FOREIGN_TABLE:
> +        case COMMANDTAG_CREATE_FUNCTION:
> +        case COMMANDTAG_CREATE_INDEX:
> +        case COMMANDTAG_CREATE_LANGUAGE:
> +        case COMMANDTAG_CREATE_MATERIALIZED_VIEW:
> +        case COMMANDTAG_CREATE_OPERATOR:
> +        case COMMANDTAG_CREATE_OPERATOR_CLASS:
> +        case COMMANDTAG_CREATE_OPERATOR_FAMILY:
> +        case COMMANDTAG_CREATE_POLICY:
> +        case COMMANDTAG_CREATE_PROCEDURE:
> +        case COMMANDTAG_CREATE_PUBLICATION:
> +        case COMMANDTAG_CREATE_ROUTINE:
> +        case COMMANDTAG_CREATE_RULE:
> +        case COMMANDTAG_CREATE_SCHEMA:
> +        case COMMANDTAG_CREATE_SEQUENCE:
> +        case COMMANDTAG_CREATE_SERVER:
> +        case COMMANDTAG_CREATE_STATISTICS:
> +        case COMMANDTAG_CREATE_SUBSCRIPTION:
> +        case COMMANDTAG_CREATE_TABLE:
> +        case COMMANDTAG_CREATE_TEXT_SEARCH_CONFIGURATION:
> +        case COMMANDTAG_CREATE_TEXT_SEARCH_DICTIONARY:
> +        case COMMANDTAG_CREATE_TEXT_SEARCH_PARSER:
> +        case COMMANDTAG_CREATE_TEXT_SEARCH_TEMPLATE:
> +        case COMMANDTAG_CREATE_TRANSFORM:
> +        case COMMANDTAG_CREATE_TRIGGER:
> +        case COMMANDTAG_CREATE_TYPE:
> +        case COMMANDTAG_CREATE_USER_MAPPING:
> +        case COMMANDTAG_CREATE_VIEW:
>
> -    /*
> -     * Otherwise, command should be CREATE, ALTER, or DROP.
> -     */
> -    if (pg_strncasecmp(tag, "CREATE ", 7) == 0)
> -        obtypename = tag + 7;
> -    else if (pg_strncasecmp(tag, "ALTER ", 6) == 0)
> -        obtypename = tag + 6;
> -    else if (pg_strncasecmp(tag, "DROP ", 5) == 0)
> -        obtypename = tag + 5;
> -    else
> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
> +            /*
> +             * Supported ALTER commands
> +             */
> +        case COMMANDTAG_ALTER_ACCESS_METHOD:
> +        case COMMANDTAG_ALTER_AGGREGATE:
> +        case COMMANDTAG_ALTER_CAST:
> +        case COMMANDTAG_ALTER_COLLATION:
> +        case COMMANDTAG_ALTER_CONSTRAINT:
> +        case COMMANDTAG_ALTER_CONVERSION:
> +        case COMMANDTAG_ALTER_DOMAIN:
> +        case COMMANDTAG_ALTER_EXTENSION:
> +        case COMMANDTAG_ALTER_FOREIGN_DATA_WRAPPER:
> +        case COMMANDTAG_ALTER_FOREIGN_TABLE:
> +        case COMMANDTAG_ALTER_FUNCTION:
> +        case COMMANDTAG_ALTER_INDEX:
> +        case COMMANDTAG_ALTER_LANGUAGE:
> +        case COMMANDTAG_ALTER_MATERIALIZED_VIEW:
> +        case COMMANDTAG_ALTER_OPERATOR:
> +        case COMMANDTAG_ALTER_OPERATOR_CLASS:
> +        case COMMANDTAG_ALTER_OPERATOR_FAMILY:
> +        case COMMANDTAG_ALTER_POLICY:
> +        case COMMANDTAG_ALTER_PROCEDURE:
> +        case COMMANDTAG_ALTER_PUBLICATION:
> +        case COMMANDTAG_ALTER_ROUTINE:
> +        case COMMANDTAG_ALTER_RULE:
> +        case COMMANDTAG_ALTER_SCHEMA:
> +        case COMMANDTAG_ALTER_SEQUENCE:
> +        case COMMANDTAG_ALTER_SERVER:
> +        case COMMANDTAG_ALTER_STATISTICS:
> +        case COMMANDTAG_ALTER_SUBSCRIPTION:
> +        case COMMANDTAG_ALTER_TABLE:
> +        case COMMANDTAG_ALTER_TEXT_SEARCH_CONFIGURATION:
> +        case COMMANDTAG_ALTER_TEXT_SEARCH_DICTIONARY:
> +        case COMMANDTAG_ALTER_TEXT_SEARCH_PARSER:
> +        case COMMANDTAG_ALTER_TEXT_SEARCH_TEMPLATE:
> +        case COMMANDTAG_ALTER_TRANSFORM:
> +        case COMMANDTAG_ALTER_TRIGGER:
> +        case COMMANDTAG_ALTER_TYPE:
> +        case COMMANDTAG_ALTER_USER_MAPPING:
> +        case COMMANDTAG_ALTER_VIEW:
>
> -    /*
> -     * ...and the object type should be something recognizable.
> -     */
> -    for (etsd = event_trigger_support; etsd->obtypename != NULL; etsd++)
> -        if (pg_strcasecmp(etsd->obtypename, obtypename) == 0)
> +            /*
> +             * Supported DROP commands
> +             */
> +        case COMMANDTAG_DROP_ACCESS_METHOD:
> +        case COMMANDTAG_DROP_AGGREGATE:
> +        case COMMANDTAG_DROP_CAST:
> +        case COMMANDTAG_DROP_COLLATION:
> +        case COMMANDTAG_DROP_CONSTRAINT:
> +        case COMMANDTAG_DROP_CONVERSION:
> +        case COMMANDTAG_DROP_DOMAIN:
> +        case COMMANDTAG_DROP_EXTENSION:
> +        case COMMANDTAG_DROP_FOREIGN_DATA_WRAPPER:
> +        case COMMANDTAG_DROP_FOREIGN_TABLE:
> +        case COMMANDTAG_DROP_FUNCTION:
> +        case COMMANDTAG_DROP_INDEX:
> +        case COMMANDTAG_DROP_LANGUAGE:
> +        case COMMANDTAG_DROP_MATERIALIZED_VIEW:
> +        case COMMANDTAG_DROP_OPERATOR:
> +        case COMMANDTAG_DROP_OPERATOR_CLASS:
> +        case COMMANDTAG_DROP_OPERATOR_FAMILY:
> +        case COMMANDTAG_DROP_POLICY:
> +        case COMMANDTAG_DROP_PROCEDURE:
> +        case COMMANDTAG_DROP_PUBLICATION:
> +        case COMMANDTAG_DROP_ROUTINE:
> +        case COMMANDTAG_DROP_RULE:
> +        case COMMANDTAG_DROP_SCHEMA:
> +        case COMMANDTAG_DROP_SEQUENCE:
> +        case COMMANDTAG_DROP_SERVER:
> +        case COMMANDTAG_DROP_STATISTICS:
> +        case COMMANDTAG_DROP_SUBSCRIPTION:
> +        case COMMANDTAG_DROP_TABLE:
> +        case COMMANDTAG_DROP_TEXT_SEARCH_CONFIGURATION:
> +        case COMMANDTAG_DROP_TEXT_SEARCH_DICTIONARY:
> +        case COMMANDTAG_DROP_TEXT_SEARCH_PARSER:
> +        case COMMANDTAG_DROP_TEXT_SEARCH_TEMPLATE:
> +        case COMMANDTAG_DROP_TRANSFORM:
> +        case COMMANDTAG_DROP_TRIGGER:
> +        case COMMANDTAG_DROP_TYPE:
> +        case COMMANDTAG_DROP_USER_MAPPING:
> +        case COMMANDTAG_DROP_VIEW:
> +            return EVENT_TRIGGER_COMMAND_TAG_OK;
> +
> +            /*
> +             * Unsupported CREATE commands
> +             */
> +        case COMMANDTAG_CREATE_DATABASE:
> +        case COMMANDTAG_CREATE_EVENT_TRIGGER:
> +        case COMMANDTAG_CREATE_ROLE:
> +        case COMMANDTAG_CREATE_TABLESPACE:
> +
> +            /*
> +             * Unsupported ALTER commands
> +             */
> +        case COMMANDTAG_ALTER_DATABASE:
> +        case COMMANDTAG_ALTER_EVENT_TRIGGER:
> +        case COMMANDTAG_ALTER_ROLE:
> +        case COMMANDTAG_ALTER_TABLESPACE:
> +
> +            /*
> +             * Unsupported DROP commands
> +             */
> +        case COMMANDTAG_DROP_DATABASE:
> +        case COMMANDTAG_DROP_EVENT_TRIGGER:
> +        case COMMANDTAG_DROP_ROLE:
> +        case COMMANDTAG_DROP_TABLESPACE:
> +
> +            /*
> +             * Other unsupported commands.  These used to return
> +             * EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED prior to the
> +             * conversion of commandTag from string to enum.
> +             */
> +        case COMMANDTAG_ALTER_SYSTEM:
> +        case COMMANDTAG_ANALYZE:
> +        case COMMANDTAG_BEGIN:
> +        case COMMANDTAG_CALL:
> +        case COMMANDTAG_CHECKPOINT:
> +        case COMMANDTAG_CLOSE:
> +        case COMMANDTAG_CLOSE_CURSOR:
> +        case COMMANDTAG_CLOSE_CURSOR_ALL:
> +        case COMMANDTAG_CLUSTER:
> +        case COMMANDTAG_COMMIT:
> +        case COMMANDTAG_COMMIT_PREPARED:
> +        case COMMANDTAG_COPY:
> +        case COMMANDTAG_COPY_FROM:
> +        case COMMANDTAG_DEALLOCATE:
> +        case COMMANDTAG_DEALLOCATE_ALL:
> +        case COMMANDTAG_DECLARE_CURSOR:
> +        case COMMANDTAG_DELETE:
> +        case COMMANDTAG_DISCARD:
> +        case COMMANDTAG_DISCARD_ALL:
> +        case COMMANDTAG_DISCARD_PLANS:
> +        case COMMANDTAG_DISCARD_SEQUENCES:
> +        case COMMANDTAG_DISCARD_TEMP:
> +        case COMMANDTAG_DO:
> +        case COMMANDTAG_DROP_REPLICATION_SLOT:
> +        case COMMANDTAG_EXECUTE:
> +        case COMMANDTAG_EXPLAIN:
> +        case COMMANDTAG_FETCH:
> +        case COMMANDTAG_GRANT_ROLE:
> +        case COMMANDTAG_INSERT:
> +        case COMMANDTAG_LISTEN:
> +        case COMMANDTAG_LOAD:
> +        case COMMANDTAG_LOCK_TABLE:
> +        case COMMANDTAG_MOVE:
> +        case COMMANDTAG_NOTIFY:
> +        case COMMANDTAG_PREPARE:
> +        case COMMANDTAG_PREPARE_TRANSACTION:
> +        case COMMANDTAG_REASSIGN_OWNED:
> +        case COMMANDTAG_REINDEX:
> +        case COMMANDTAG_RELEASE:
> +        case COMMANDTAG_RESET:
> +        case COMMANDTAG_REVOKE_ROLE:
> +        case COMMANDTAG_ROLLBACK:
> +        case COMMANDTAG_ROLLBACK_PREPARED:
> +        case COMMANDTAG_SAVEPOINT:
> +        case COMMANDTAG_SELECT:
> +        case COMMANDTAG_SELECT_FOR_KEY_SHARE:
> +        case COMMANDTAG_SELECT_FOR_NO_KEY_UPDATE:
> +        case COMMANDTAG_SELECT_FOR_SHARE:
> +        case COMMANDTAG_SELECT_FOR_UPDATE:
> +        case COMMANDTAG_SET:
> +        case COMMANDTAG_SET_CONSTRAINTS:
> +        case COMMANDTAG_SHOW:
> +        case COMMANDTAG_START_TRANSACTION:
> +        case COMMANDTAG_TRUNCATE_TABLE:
> +        case COMMANDTAG_UNLISTEN:
> +        case COMMANDTAG_UPDATE:
> +        case COMMANDTAG_VACUUM:
> +            return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
> +        case COMMANDTAG_UNKNOWN:
>              break;
> -    if (etsd->obtypename == NULL)
> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
> -    if (!etsd->supported)
> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
> -    return EVENT_TRIGGER_COMMAND_TAG_OK;
> +    }
> +    return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>  }

This is pretty painful.


> @@ -745,7 +902,7 @@ EventTriggerCommonSetup(Node *parsetree,
>          return NIL;
>
>      /* Get the command tag. */
> -    tag = CreateCommandTag(parsetree);
> +    tag = GetCommandTagName(CreateCommandTag(parsetree));
>
>      /*
>       * Filter list of event triggers by command tag, and copy them into our
> @@ -2136,7 +2293,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>                      /* objsubid */
>                      values[i++] = Int32GetDatum(addr.objectSubId);
>                      /* command tag */
> -                    values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
> +                    values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>                      /* object_type */
>                      values[i++] = CStringGetTextDatum(type);
>                      /* schema */
> @@ -2161,7 +2318,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>                  /* objsubid */
>                  nulls[i++] = true;
>                  /* command tag */
> -                values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
> +                values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>                  /* object_type */
>                  values[i++] = CStringGetTextDatum(stringify_adefprivs_objtype(cmd->d.defprivs.objtype));
>                  /* schema */

So GetCommandTagName we commonly do twice for some reason? Once in
EventTriggerCommonSetup() and then again in
pg_event_trigger_ddl_commands()? Why is EventTriggerData.tag still the
string?

>      Assert(list_length(plan->plancache_list) == 1);
> @@ -1469,7 +1469,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
>                          (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  /* translator: %s is a SQL statement name */
>                           errmsg("%s is not allowed in a non-volatile function",
> -                                CreateCommandTag((Node *) pstmt))));
> +                                GetCommandTagName(CreateCommandTag((Node *) pstmt)))));

Probably worth having a wrapper for this - these lines are pretty long,
and there quite a number of cases like it in the patch.

> @@ -172,11 +175,38 @@ EndCommand(const char *commandTag, CommandDest dest)
>          case DestRemoteSimple:
>
>              /*
> -             * We assume the commandTag is plain ASCII and therefore requires
> -             * no encoding conversion.
> +             * We assume the tagname is plain ASCII and therefore
> +             * requires no encoding conversion.
>               */
> -            pq_putmessage('C', commandTag, strlen(commandTag) + 1);
> -            break;
> +            tagname = GetCommandTagName(qc->commandTag);
> +            switch (qc->display_format)
> +            {
> +                case DISPLAYFORMAT_PLAIN:
> +                    pq_putmessage('C', tagname, strlen(tagname) + 1);
> +                    break;
> +                case DISPLAYFORMAT_LAST_OID:
> +                    /*
> +                     * We no longer display LastOid, but to preserve the wire protocol,
> +                     * we write InvalidOid where the LastOid used to be written.  For
> +                     * efficiency in the snprintf(), hard-code InvalidOid as zero.
> +                     */
> +                    Assert(InvalidOid == 0);
> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
> +                                "%s 0 " UINT64_FORMAT,
> +                                tagname,
> +                                qc->nprocessed);
> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
> +                    break;
> +                case DISPLAYFORMAT_NPROCESSED:
> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
> +                            "%s " UINT64_FORMAT,
> +                            tagname,
> +                            qc->nprocessed);
> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
> +                    break;
> +                default:
> +                    elog(ERROR, "Invalid display_format in EndCommand");
> +            }

Imo there should only be one pq_putmessage().  Also think this type of
default: is a bad idea, just prevents the compiler from warning if we
were to ever introduce a new variant of DISPLAYFORMAT_*, without
providing any meaningful additional security.

> @@ -855,7 +889,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>
>          case T_DiscardStmt:
>              /* should we allow DISCARD PLANS? */
> -            CheckRestrictedOperation("DISCARD");
> +            CheckRestrictedOperation(COMMANDTAG_DISCARD);
>              DiscardCommand((DiscardStmt *) parsetree, isTopLevel);
>              break;
>
> @@ -974,7 +1008,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objtype))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecuteGrantStmt(stmt);
>              }
> @@ -987,7 +1021,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->removeType))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecDropStmt(stmt, isTopLevel);
>              }
> @@ -1000,7 +1034,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->renameType))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecRenameStmt(stmt);
>              }
> @@ -1013,7 +1047,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objectType))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecAlterObjectDependsStmt(stmt, NULL);
>              }
> @@ -1026,7 +1060,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objectType))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecAlterObjectSchemaStmt(stmt, NULL);
>              }
> @@ -1039,7 +1073,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objectType))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      ExecAlterOwnerStmt(stmt);
>              }
> @@ -1052,7 +1086,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objtype))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);
>                  else
>                      CommentObject(stmt);
>                  break;
> @@ -1065,7 +1099,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>                  if (EventTriggerSupportsObjectType(stmt->objtype))
>                      ProcessUtilitySlow(pstate, pstmt, queryString,
>                                         context, params, queryEnv,
> -                                       dest, completionTag);
> +                                       dest, qc);

Not this patch's fault or task. But I hate this type of code - needing
to touch a dozen places for new type of statement is just
insane. utility.c should long have been rewritten to just have one
metadata table for nearly all of this. Perhaps with a few callbacks for
special cases.


> +static const char * tag_names[] = {
> +    "???",
> +    "ALTER ACCESS METHOD",
> +    "ALTER AGGREGATE",
> +    "ALTER CAST",

This seems problematic to maintain, because the order needs to match
between this and something defined in a header - and there's no
guarantee a misordering is immediately noticeable. We should either go
for my metadata table idea, or at least rewrite this, even if more
verbose, to something like

static const char * tag_names[] = {
    [COMMAND_TAG_ALTER_ACCESS_METHOD] = "ALTER ACCESS METHOD",
    ...

I think the fact that this would show up in a grep for
COMMAND_TAG_ALTER_ACCESS_METHOD is good too.



> +/*
> + * Search CommandTag by name
> + *
> + * Returns CommandTag, or COMMANDTAG_UNKNOWN if not recognized
> + */
> +CommandTag
> +GetCommandTagEnum(const char *commandname)
> +{
> +    const char **base, **last, **position;
> +    int         result;
> +
> +    OPTIONALLY_CHECK_COMMAND_TAGS();
> +    if (commandname == NULL || *commandname == '\0')
> +        return COMMANDTAG_UNKNOWN;
> +
> +    base = tag_names;
> +    last = tag_names + tag_name_length - 1;
> +    while (last >= base)
> +    {
> +        position = base + ((last - base) >> 1);
> +        result = pg_strcasecmp(commandname, *position);
> +        if (result == 0)
> +            return (CommandTag) (position - tag_names);
> +        else if (result < 0)
> +            last = position - 1;
> +        else
> +            base = position + 1;
> +    }
> +    return COMMANDTAG_UNKNOWN;
> +}

This seems pretty grotty - but you get rid of it later. See my comments there.



> +#ifdef COMMANDTAG_CHECKING
> +bool
> +CheckCommandTagEnum()
> +{
> +    CommandTag    i, j;
> +
> +    if (FIRST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < FIRST_COMMAND_TAG)
> +    {
> +        elog(ERROR, "FIRST_COMMAND_TAG (%u), LAST_COMMAND_TAG (%u) not reasonable",
> +             (unsigned int) FIRST_COMMAND_TAG, (unsigned int) LAST_COMMAND_TAG);
> +        return false;
> +    }
> +    if (FIRST_COMMAND_TAG != (CommandTag)0)
> +    {
> +        elog(ERROR, "FIRST_COMMAND_TAG (%u) != 0", (unsigned int) FIRST_COMMAND_TAG);
> +        return false;
> +    }
> +    if (LAST_COMMAND_TAG != (CommandTag)(tag_name_length - 1))
> +    {
> +        elog(ERROR, "LAST_COMMAND_TAG (%u) != tag_name_length (%u)",
> +             (unsigned int) LAST_COMMAND_TAG, (unsigned int) tag_name_length);
> +        return false;
> +    }

These all seem to want to be static asserts.


> +    for (i = FIRST_COMMAND_TAG; i < LAST_COMMAND_TAG; i++)
> +    {
> +        for (j = i+1; j < LAST_COMMAND_TAG; j++)
> +        {
> +            int cmp = strcmp(tag_names[i], tag_names[j]);
> +            if (cmp == 0)
> +            {
> +                elog(ERROR, "Found duplicate tag_name: \"%s\"",
> +                    tag_names[i]);
> +                return false;
> +            }
> +            if (cmp > 0)
> +            {
> +                elog(ERROR, "Found commandnames out of order: \"%s\" before \"%s\"",
> +                    tag_names[i], tag_names[j]);
> +                return false;
> +            }
> +        }
> +    }
> +    return true;
> +}

And I think we could get rid of this with my earlier suggestions?


> +/*
> + * BEWARE: These are in sorted order, but ordered by their printed
> + *         values in the tag_name list (see common/commandtag.c).
> + *         In particular it matters because the sort ordering changes
> + *         when you replace a space with an underscore.  To wit:
> + *
> + *             "CREATE TABLE"
> + *             "CREATE TABLE AS"
> + *             "CREATE TABLESPACE"
> + *
> + *         but...
> + *
> + *             CREATE_TABLE
> + *             CREATE_TABLESPACE
> + *             CREATE_TABLE_AS
> + *
> + *         It also matters that COMMANDTAG_UNKNOWN is written "???".
> + *
> + * If you add a value here, add it in common/commandtag.c also, and
> + * be careful to get the ordering right.  You can build with
> + * COMMANDTAG_CHECKING to have this automatically checked
> + * at runtime, but that adds considerable overhead, so do so sparingly.
> + */
> +typedef enum CommandTag
> +{

This seems pretty darn nightmarish.


> +#define FIRST_COMMAND_TAG COMMANDTAG_UNKNOWN
> +    COMMANDTAG_UNKNOWN,
> +    COMMANDTAG_ALTER_ACCESS_METHOD,
> +    COMMANDTAG_ALTER_AGGREGATE,
> +    COMMANDTAG_ALTER_CAST,
> +    COMMANDTAG_ALTER_COLLATION,
> +    COMMANDTAG_ALTER_CONSTRAINT,
> +    COMMANDTAG_ALTER_CONVERSION,
> +    COMMANDTAG_ALTER_DATABASE,
> +    COMMANDTAG_ALTER_DEFAULT_PRIVILEGES,
> +    COMMANDTAG_ALTER_DOMAIN,
> [...]

I'm a bit worried that this basically duplicates a good portion of NodeTag, without having otherwise much of a point?


> From a70b0cadc1142e92b2354a0ca3cd47aaeb0c148e Mon Sep 17 00:00:00 2001
> From: Mark Dilger <mark.dilger@enterprisedb.com>
> Date: Tue, 4 Feb 2020 12:25:05 -0800
> Subject: [PATCH v2 2/3] Using a Bitmapset of tags rather than a string array.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> EventTriggerCacheItem no longer holds an array of palloc’d tag strings
> in sorted order, but rather just a Bitmapset over the CommandTags.  This
> makes the code a little simpler and easier to read, in my opinion.  In
> filter_event_trigger, rather than running bsearch through a sorted array
> of strings, it just runs bms_is_member.
> ---

It seems weird to add the bsearch just to remove it immediately again a
patch later. This probably should just go first?




> diff --git a/src/test/regress/sql/event_trigger.sql b/src/test/regress/sql/event_trigger.sql
> index 346168673d..cad02212ad 100644
> --- a/src/test/regress/sql/event_trigger.sql
> +++ b/src/test/regress/sql/event_trigger.sql
> @@ -10,6 +10,13 @@ BEGIN
>  END
>  $$ language plpgsql;
>
> +-- OK
> +create function test_event_trigger2() returns event_trigger as $$
> +BEGIN
> +    RAISE NOTICE 'test_event_trigger2: % %', tg_event, tg_tag;
> +END
> +$$ LANGUAGE plpgsql;
> +
>  -- should fail, event triggers cannot have declared arguments
>  create function test_event_trigger_arg(name text)
>  returns event_trigger as $$ BEGIN RETURN 1; END $$ language plpgsql;
> @@ -82,6 +89,783 @@ create event trigger regress_event_trigger2 on ddl_command_start
>  -- OK
>  comment on event trigger regress_event_trigger is 'test comment';
>
> +-- These are all unsupported
> +create event trigger regress_event_triger_NULL on ddl_command_start
> +   when tag in ('')
> +   execute procedure test_event_trigger2();
> +
> +create event trigger regress_event_triger_UNKNOWN on ddl_command_start
> +   when tag in ('???')
> +   execute procedure test_event_trigger2();
> +
> +create event trigger regress_event_trigger_ALTER_DATABASE on ddl_command_start
> +   when tag in ('ALTER DATABASE')
> +   execute procedure test_event_trigger2();
[...]

There got to be a more maintainable way to write this.

Greetings,

Andres Freund



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 4, 2020, at 7:34 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,

Thanks for reviewing!  I am pretty much in agreement with your comments, below.

> On 2020-02-04 18:18:52 -0800, Mark Dilger wrote:
>> In master, a number of functions pass a char *completionTag argument (really a char
completionTag[COMPLETION_TAG_BUFSIZE])which gets filled in with the string to return to the client from EndCommand.  I
haveremoved that kind of logic: 
>>
>> -               /* save the rowcount if we're given a completionTag to fill */
>> -               if (completionTag)
>> -                       snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>> -                                        "SELECT " UINT64_FORMAT,
>> -                                        queryDesc->estate->es_processed);
>>
>> In the patch, this is replaced with a new struct, QueryCompletionData.  That bit of code above is replaced with:
>>
>> +               /* save the rowcount if we're given a qc to fill */
>> +               if (qc)
>> +                       SetQC(qc, COMMANDTAG_SELECT, queryDesc->estate->es_processed, DISPLAYFORMAT_NPROCESSED);
>>
>> For wire protocol compatibility, we have to track the display format.
>> When this gets to EndCommand, the display format allows the string to
>> be written exactly as the client will expect.  If we ever get to the
>> point where we can break with that compatibility, the third member of
>> this struct, display_format, can be removed.
>
> Hm. While I like not having this as strings a lot, I wish we could get
> rid of this displayformat stuff.

Agreed, but I don’t know how.

>> These are replaced by switch() case statements over the possible commandTags:
>>
>> +       switch (commandTag)
>> +       {
>> +                       /*
>> +                        * Supported idiosyncratic special cases.
>> +                        */
>> +               case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
>> +               case COMMANDTAG_ALTER_LARGE_OBJECT:
>> +               case COMMANDTAG_COMMENT:
>> +               case COMMANDTAG_CREATE_TABLE_AS:
>> +               case COMMANDTAG_DROP_OWNED:
>> +               case COMMANDTAG_GRANT:
>> +               case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
>> +               case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
>> +               case COMMANDTAG_REVOKE:
>> +               case COMMANDTAG_SECURITY_LABEL:
>> +               case COMMANDTAG_SELECT_INTO:
>
> The number of these makes me wonder if we should just have a metadata
> table in one place, instead of needing to edit multiple
> locations. Something like
>
> const ... CommandTagBehaviour[] = {
>    [COMMANDTAG_INSERT] = {
>        .display_processed = true, .display_oid = true, ...},
>    [COMMANDTAG_CREATE_TABLE_AS] = {
>        .event_trigger = true, ...},
>
> with the zero initialized defaults being the common cases.
>
> Not sure if it's worth going there. But it's maybe worth thinking about
> for a minute?

Yes, I was thinking about something like this, only I had in mind a Bitmapset for these.  It just so happens that there
are192 enum values, 0..191, which happens to fit in 3 64bit words plus a varlena header.  Mind you, that nice property
wouldbe immediately blown if we added another entry to the enum.  Has anybody made a compile-time static version of
Bitmapset? We could store this information in either 24 bytes or 32 bytes, depending on whether we add another enum
value.

Getting a little off topic, I was also thinking about having a counting Bitmapset that would store one bit per enum
thatis included, and then a sparse array of counts, perhaps with one byte counts for [0..127] and 8 byte counts for
[128..huge]that we could use in shared memory for the pg_stat_tag work.  Is there anything like that? 

Anyway, I don’t think we should invent lots of different structures for CommandTag tracking, so something that serves
doubleduty might keep the code tighter.  I’m already using ordinary Bitmapset over CommandTags in event_trigger, so
naturallythat comes to mind for this, too. 


>> Averages for test set 1 by scale:
>> set    scale    tps    avg_latency    90%<    max_latency
>> 1    1    3741    1.734    3.162    133.718
>> 1    9    6124    0.904    1.05    230.547
>> 1    81    5921    0.931    1.015    67.023
>>
>> Averages for test set 1 by clients:
>> set    clients    tps    avg_latency    90%<    max_latency
>> 1    1    2163    0.461    0.514    24.414
>> 1    4    5968    0.675    0.791    40.354
>> 1    16    7655    2.433    3.922    366.519
>>
>>
>> For command tag patch (branched from 1fd687a035):
>>
>>     postgresql % find src -type f -name "*.c" -or -name "*.h" | xargs cat | wc
>>      1482969 5691908 45281399
>>
>>     postgresql % find src -type f -name "*.o" | xargs cat | wc
>>        38209  476243 18999752
>>
>>
>> Averages for test set 1 by scale:
>> set    scale    tps    avg_latency    90%<    max_latency
>> 1    1    3877    1.645    3.066    24.973
>> 1    9    6383    0.859    1.032    64.566
>> 1    81    5945    0.925    1.023    162.9
>>
>> Averages for test set 1 by clients:
>> set    clients    tps    avg_latency    90%<    max_latency
>> 1    1    2141    0.466    0.522    11.531
>> 1    4    5967    0.673    0.783    136.882
>> 1    16    8096    2.292    3.817    104.026
>
> Not bad.

I still need to get a benchmark more targeted at this codepath.

>> diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
>> index 9aa2b61600..5322c14ce4 100644
>> --- a/src/backend/commands/async.c
>> +++ b/src/backend/commands/async.c
>> @@ -594,7 +594,7 @@ pg_notify(PG_FUNCTION_ARGS)
>>         payload = text_to_cstring(PG_GETARG_TEXT_PP(1));
>>
>>     /* For NOTIFY as a statement, this is checked in ProcessUtility */
>> -    PreventCommandDuringRecovery("NOTIFY");
>> +    PreventCommandDuringRecovery(COMMANDTAG_NOTIFY);
>>
>>     Async_Notify(channel, payload);
>>
>> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
>> index 40a8ec1abd..4828e75bd5 100644
>> --- a/src/backend/commands/copy.c
>> +++ b/src/backend/commands/copy.c
>> @@ -1063,7 +1063,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
>>
>>         /* check read-only transaction and parallel mode */
>>         if (XactReadOnly && !rel->rd_islocaltemp)
>> -            PreventCommandIfReadOnly("COPY FROM");
>> +            PreventCommandIfReadOnly(COMMANDTAG_COPY_FROM);
>>
>>         cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
>>                                NULL, stmt->attlist, stmt->options);
>
> I'm not sure this really ought to be part of this change - seems like a
> somewhat independent change to me. With less obvious benefits.

I don’t think I care too much either way.  I had some vague ideas about consolidating all of these strings in the
backendinto one place.  

>> static event_trigger_command_tag_check_result
>> -check_ddl_tag(const char *tag)
>> +check_ddl_tag(CommandTag commandTag)
>> {
>> -    const char *obtypename;
>> -    const event_trigger_support_data *etsd;
>> +    switch (commandTag)
>> +    {
>> +            /*
>> +             * Supported idiosyncratic special cases.
>> +             */
>> +        case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
>> +        case COMMANDTAG_ALTER_LARGE_OBJECT:
>> +        case COMMANDTAG_COMMENT:
>> +        case COMMANDTAG_CREATE_TABLE_AS:
>> +        case COMMANDTAG_DROP_OWNED:
>> +        case COMMANDTAG_GRANT:
>> +        case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
>> +        case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_REVOKE:
>> +        case COMMANDTAG_SECURITY_LABEL:
>> +        case COMMANDTAG_SELECT_INTO:
>>
>> -    /*
>> -     * Handle some idiosyncratic special cases.
>> -     */
>> -    if (pg_strcasecmp(tag, "CREATE TABLE AS") == 0 ||
>> -        pg_strcasecmp(tag, "SELECT INTO") == 0 ||
>> -        pg_strcasecmp(tag, "REFRESH MATERIALIZED VIEW") == 0 ||
>> -        pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
>> -        pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 0 ||
>> -        pg_strcasecmp(tag, "COMMENT") == 0 ||
>> -        pg_strcasecmp(tag, "GRANT") == 0 ||
>> -        pg_strcasecmp(tag, "REVOKE") == 0 ||
>> -        pg_strcasecmp(tag, "DROP OWNED") == 0 ||
>> -        pg_strcasecmp(tag, "IMPORT FOREIGN SCHEMA") == 0 ||
>> -        pg_strcasecmp(tag, "SECURITY LABEL") == 0)
>> -        return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +            /*
>> +             * Supported CREATE commands
>> +             */
>> +        case COMMANDTAG_CREATE_ACCESS_METHOD:
>> +        case COMMANDTAG_CREATE_AGGREGATE:
>> +        case COMMANDTAG_CREATE_CAST:
>> +        case COMMANDTAG_CREATE_COLLATION:
>> +        case COMMANDTAG_CREATE_CONSTRAINT:
>> +        case COMMANDTAG_CREATE_CONVERSION:
>> +        case COMMANDTAG_CREATE_DOMAIN:
>> +        case COMMANDTAG_CREATE_EXTENSION:
>> +        case COMMANDTAG_CREATE_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_CREATE_FOREIGN_TABLE:
>> +        case COMMANDTAG_CREATE_FUNCTION:
>> +        case COMMANDTAG_CREATE_INDEX:
>> +        case COMMANDTAG_CREATE_LANGUAGE:
>> +        case COMMANDTAG_CREATE_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_CREATE_OPERATOR:
>> +        case COMMANDTAG_CREATE_OPERATOR_CLASS:
>> +        case COMMANDTAG_CREATE_OPERATOR_FAMILY:
>> +        case COMMANDTAG_CREATE_POLICY:
>> +        case COMMANDTAG_CREATE_PROCEDURE:
>> +        case COMMANDTAG_CREATE_PUBLICATION:
>> +        case COMMANDTAG_CREATE_ROUTINE:
>> +        case COMMANDTAG_CREATE_RULE:
>> +        case COMMANDTAG_CREATE_SCHEMA:
>> +        case COMMANDTAG_CREATE_SEQUENCE:
>> +        case COMMANDTAG_CREATE_SERVER:
>> +        case COMMANDTAG_CREATE_STATISTICS:
>> +        case COMMANDTAG_CREATE_SUBSCRIPTION:
>> +        case COMMANDTAG_CREATE_TABLE:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_CREATE_TRANSFORM:
>> +        case COMMANDTAG_CREATE_TRIGGER:
>> +        case COMMANDTAG_CREATE_TYPE:
>> +        case COMMANDTAG_CREATE_USER_MAPPING:
>> +        case COMMANDTAG_CREATE_VIEW:
>>
>> -    /*
>> -     * Otherwise, command should be CREATE, ALTER, or DROP.
>> -     */
>> -    if (pg_strncasecmp(tag, "CREATE ", 7) == 0)
>> -        obtypename = tag + 7;
>> -    else if (pg_strncasecmp(tag, "ALTER ", 6) == 0)
>> -        obtypename = tag + 6;
>> -    else if (pg_strncasecmp(tag, "DROP ", 5) == 0)
>> -        obtypename = tag + 5;
>> -    else
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> +            /*
>> +             * Supported ALTER commands
>> +             */
>> +        case COMMANDTAG_ALTER_ACCESS_METHOD:
>> +        case COMMANDTAG_ALTER_AGGREGATE:
>> +        case COMMANDTAG_ALTER_CAST:
>> +        case COMMANDTAG_ALTER_COLLATION:
>> +        case COMMANDTAG_ALTER_CONSTRAINT:
>> +        case COMMANDTAG_ALTER_CONVERSION:
>> +        case COMMANDTAG_ALTER_DOMAIN:
>> +        case COMMANDTAG_ALTER_EXTENSION:
>> +        case COMMANDTAG_ALTER_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_ALTER_FOREIGN_TABLE:
>> +        case COMMANDTAG_ALTER_FUNCTION:
>> +        case COMMANDTAG_ALTER_INDEX:
>> +        case COMMANDTAG_ALTER_LANGUAGE:
>> +        case COMMANDTAG_ALTER_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_ALTER_OPERATOR:
>> +        case COMMANDTAG_ALTER_OPERATOR_CLASS:
>> +        case COMMANDTAG_ALTER_OPERATOR_FAMILY:
>> +        case COMMANDTAG_ALTER_POLICY:
>> +        case COMMANDTAG_ALTER_PROCEDURE:
>> +        case COMMANDTAG_ALTER_PUBLICATION:
>> +        case COMMANDTAG_ALTER_ROUTINE:
>> +        case COMMANDTAG_ALTER_RULE:
>> +        case COMMANDTAG_ALTER_SCHEMA:
>> +        case COMMANDTAG_ALTER_SEQUENCE:
>> +        case COMMANDTAG_ALTER_SERVER:
>> +        case COMMANDTAG_ALTER_STATISTICS:
>> +        case COMMANDTAG_ALTER_SUBSCRIPTION:
>> +        case COMMANDTAG_ALTER_TABLE:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_ALTER_TRANSFORM:
>> +        case COMMANDTAG_ALTER_TRIGGER:
>> +        case COMMANDTAG_ALTER_TYPE:
>> +        case COMMANDTAG_ALTER_USER_MAPPING:
>> +        case COMMANDTAG_ALTER_VIEW:
>>
>> -    /*
>> -     * ...and the object type should be something recognizable.
>> -     */
>> -    for (etsd = event_trigger_support; etsd->obtypename != NULL; etsd++)
>> -        if (pg_strcasecmp(etsd->obtypename, obtypename) == 0)
>> +            /*
>> +             * Supported DROP commands
>> +             */
>> +        case COMMANDTAG_DROP_ACCESS_METHOD:
>> +        case COMMANDTAG_DROP_AGGREGATE:
>> +        case COMMANDTAG_DROP_CAST:
>> +        case COMMANDTAG_DROP_COLLATION:
>> +        case COMMANDTAG_DROP_CONSTRAINT:
>> +        case COMMANDTAG_DROP_CONVERSION:
>> +        case COMMANDTAG_DROP_DOMAIN:
>> +        case COMMANDTAG_DROP_EXTENSION:
>> +        case COMMANDTAG_DROP_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_DROP_FOREIGN_TABLE:
>> +        case COMMANDTAG_DROP_FUNCTION:
>> +        case COMMANDTAG_DROP_INDEX:
>> +        case COMMANDTAG_DROP_LANGUAGE:
>> +        case COMMANDTAG_DROP_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_DROP_OPERATOR:
>> +        case COMMANDTAG_DROP_OPERATOR_CLASS:
>> +        case COMMANDTAG_DROP_OPERATOR_FAMILY:
>> +        case COMMANDTAG_DROP_POLICY:
>> +        case COMMANDTAG_DROP_PROCEDURE:
>> +        case COMMANDTAG_DROP_PUBLICATION:
>> +        case COMMANDTAG_DROP_ROUTINE:
>> +        case COMMANDTAG_DROP_RULE:
>> +        case COMMANDTAG_DROP_SCHEMA:
>> +        case COMMANDTAG_DROP_SEQUENCE:
>> +        case COMMANDTAG_DROP_SERVER:
>> +        case COMMANDTAG_DROP_STATISTICS:
>> +        case COMMANDTAG_DROP_SUBSCRIPTION:
>> +        case COMMANDTAG_DROP_TABLE:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_DROP_TRANSFORM:
>> +        case COMMANDTAG_DROP_TRIGGER:
>> +        case COMMANDTAG_DROP_TYPE:
>> +        case COMMANDTAG_DROP_USER_MAPPING:
>> +        case COMMANDTAG_DROP_VIEW:
>> +            return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +
>> +            /*
>> +             * Unsupported CREATE commands
>> +             */
>> +        case COMMANDTAG_CREATE_DATABASE:
>> +        case COMMANDTAG_CREATE_EVENT_TRIGGER:
>> +        case COMMANDTAG_CREATE_ROLE:
>> +        case COMMANDTAG_CREATE_TABLESPACE:
>> +
>> +            /*
>> +             * Unsupported ALTER commands
>> +             */
>> +        case COMMANDTAG_ALTER_DATABASE:
>> +        case COMMANDTAG_ALTER_EVENT_TRIGGER:
>> +        case COMMANDTAG_ALTER_ROLE:
>> +        case COMMANDTAG_ALTER_TABLESPACE:
>> +
>> +            /*
>> +             * Unsupported DROP commands
>> +             */
>> +        case COMMANDTAG_DROP_DATABASE:
>> +        case COMMANDTAG_DROP_EVENT_TRIGGER:
>> +        case COMMANDTAG_DROP_ROLE:
>> +        case COMMANDTAG_DROP_TABLESPACE:
>> +
>> +            /*
>> +             * Other unsupported commands.  These used to return
>> +             * EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED prior to the
>> +             * conversion of commandTag from string to enum.
>> +             */
>> +        case COMMANDTAG_ALTER_SYSTEM:
>> +        case COMMANDTAG_ANALYZE:
>> +        case COMMANDTAG_BEGIN:
>> +        case COMMANDTAG_CALL:
>> +        case COMMANDTAG_CHECKPOINT:
>> +        case COMMANDTAG_CLOSE:
>> +        case COMMANDTAG_CLOSE_CURSOR:
>> +        case COMMANDTAG_CLOSE_CURSOR_ALL:
>> +        case COMMANDTAG_CLUSTER:
>> +        case COMMANDTAG_COMMIT:
>> +        case COMMANDTAG_COMMIT_PREPARED:
>> +        case COMMANDTAG_COPY:
>> +        case COMMANDTAG_COPY_FROM:
>> +        case COMMANDTAG_DEALLOCATE:
>> +        case COMMANDTAG_DEALLOCATE_ALL:
>> +        case COMMANDTAG_DECLARE_CURSOR:
>> +        case COMMANDTAG_DELETE:
>> +        case COMMANDTAG_DISCARD:
>> +        case COMMANDTAG_DISCARD_ALL:
>> +        case COMMANDTAG_DISCARD_PLANS:
>> +        case COMMANDTAG_DISCARD_SEQUENCES:
>> +        case COMMANDTAG_DISCARD_TEMP:
>> +        case COMMANDTAG_DO:
>> +        case COMMANDTAG_DROP_REPLICATION_SLOT:
>> +        case COMMANDTAG_EXECUTE:
>> +        case COMMANDTAG_EXPLAIN:
>> +        case COMMANDTAG_FETCH:
>> +        case COMMANDTAG_GRANT_ROLE:
>> +        case COMMANDTAG_INSERT:
>> +        case COMMANDTAG_LISTEN:
>> +        case COMMANDTAG_LOAD:
>> +        case COMMANDTAG_LOCK_TABLE:
>> +        case COMMANDTAG_MOVE:
>> +        case COMMANDTAG_NOTIFY:
>> +        case COMMANDTAG_PREPARE:
>> +        case COMMANDTAG_PREPARE_TRANSACTION:
>> +        case COMMANDTAG_REASSIGN_OWNED:
>> +        case COMMANDTAG_REINDEX:
>> +        case COMMANDTAG_RELEASE:
>> +        case COMMANDTAG_RESET:
>> +        case COMMANDTAG_REVOKE_ROLE:
>> +        case COMMANDTAG_ROLLBACK:
>> +        case COMMANDTAG_ROLLBACK_PREPARED:
>> +        case COMMANDTAG_SAVEPOINT:
>> +        case COMMANDTAG_SELECT:
>> +        case COMMANDTAG_SELECT_FOR_KEY_SHARE:
>> +        case COMMANDTAG_SELECT_FOR_NO_KEY_UPDATE:
>> +        case COMMANDTAG_SELECT_FOR_SHARE:
>> +        case COMMANDTAG_SELECT_FOR_UPDATE:
>> +        case COMMANDTAG_SET:
>> +        case COMMANDTAG_SET_CONSTRAINTS:
>> +        case COMMANDTAG_SHOW:
>> +        case COMMANDTAG_START_TRANSACTION:
>> +        case COMMANDTAG_TRUNCATE_TABLE:
>> +        case COMMANDTAG_UNLISTEN:
>> +        case COMMANDTAG_UPDATE:
>> +        case COMMANDTAG_VACUUM:
>> +            return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
>> +        case COMMANDTAG_UNKNOWN:
>>             break;
>> -    if (etsd->obtypename == NULL)
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> -    if (!etsd->supported)
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
>> -    return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +    }
>> +    return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> }
>
> This is pretty painful.

I think it is painful in a different way.  The existing code on master is a mess of parsing logic that is harder to
reasonthrough, but fewer lines.  There are other places in the backend that have long switch statements, so I didn’t
feelI was breaking with project style to do this.  I also made the switch longer than I had too, by including all
enumeratedvalues rather than just the ones that are supported.  We could remove the extra cases, but I think that’s
onlya half measure.  Something more like a consolidated table or bitmap seems better. 

>
>> @@ -745,7 +902,7 @@ EventTriggerCommonSetup(Node *parsetree,
>>         return NIL;
>>
>>     /* Get the command tag. */
>> -    tag = CreateCommandTag(parsetree);
>> +    tag = GetCommandTagName(CreateCommandTag(parsetree));
>>
>>     /*
>>      * Filter list of event triggers by command tag, and copy them into our
>> @@ -2136,7 +2293,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>>                     /* objsubid */
>>                     values[i++] = Int32GetDatum(addr.objectSubId);
>>                     /* command tag */
>> -                    values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
>> +                    values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>>                     /* object_type */
>>                     values[i++] = CStringGetTextDatum(type);
>>                     /* schema */
>> @@ -2161,7 +2318,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>>                 /* objsubid */
>>                 nulls[i++] = true;
>>                 /* command tag */
>> -                values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
>> +                values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>>                 /* object_type */
>>                 values[i++] = CStringGetTextDatum(stringify_adefprivs_objtype(cmd->d.defprivs.objtype));
>>                 /* schema */
>
> So GetCommandTagName we commonly do twice for some reason? Once in
> EventTriggerCommonSetup() and then again in
> pg_event_trigger_ddl_commands()? Why is EventTriggerData.tag still the
> string?

It is not a string after applying v2-0003….   The main issue I see in the code you are quoting is that
CreateCommandTag(cmd->parsetree)is called more than once, and that’s not the consequence of this patch.  That’s
pre-existing. I didn’t look into it, though I can if you think it is relevant to this patch set.  The name of the
function,CreateCommandTag, sounds like something I invented as part of this patch, but it pre-exists this patch.  I
onlychanged it’s return value from char * to CommandTag. 

>>     Assert(list_length(plan->plancache_list) == 1);
>> @@ -1469,7 +1469,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
>>                         (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                 /* translator: %s is a SQL statement name */
>>                          errmsg("%s is not allowed in a non-volatile function",
>> -                                CreateCommandTag((Node *) pstmt))));
>> +                                GetCommandTagName(CreateCommandTag((Node *) pstmt)))));
>
> Probably worth having a wrapper for this - these lines are pretty long,
> and there quite a number of cases like it in the patch.

Actually, I looked at that.  The number of them seemed right on the line between making a wrapper and not.  I thought
thecounter-argument was that by making a wrapper that only got used in a few places, I was creating more lines of code
andobfuscating what happens.  I’m happy to do it your way, if consensus emerges around that. 

>> @@ -172,11 +175,38 @@ EndCommand(const char *commandTag, CommandDest dest)
>>         case DestRemoteSimple:
>>
>>             /*
>> -             * We assume the commandTag is plain ASCII and therefore requires
>> -             * no encoding conversion.
>> +             * We assume the tagname is plain ASCII and therefore
>> +             * requires no encoding conversion.
>>              */
>> -            pq_putmessage('C', commandTag, strlen(commandTag) + 1);
>> -            break;
>> +            tagname = GetCommandTagName(qc->commandTag);
>> +            switch (qc->display_format)
>> +            {
>> +                case DISPLAYFORMAT_PLAIN:
>> +                    pq_putmessage('C', tagname, strlen(tagname) + 1);
>> +                    break;
>> +                case DISPLAYFORMAT_LAST_OID:
>> +                    /*
>> +                     * We no longer display LastOid, but to preserve the wire protocol,
>> +                     * we write InvalidOid where the LastOid used to be written.  For
>> +                     * efficiency in the snprintf(), hard-code InvalidOid as zero.
>> +                     */
>> +                    Assert(InvalidOid == 0);
>> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>> +                                "%s 0 " UINT64_FORMAT,
>> +                                tagname,
>> +                                qc->nprocessed);
>> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
>> +                    break;
>> +                case DISPLAYFORMAT_NPROCESSED:
>> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>> +                            "%s " UINT64_FORMAT,
>> +                            tagname,
>> +                            qc->nprocessed);
>> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
>> +                    break;
>> +                default:
>> +                    elog(ERROR, "Invalid display_format in EndCommand");
>> +            }
>
> Imo there should only be one pq_putmessage().  Also think this type of
> default: is a bad idea, just prevents the compiler from warning if we
> were to ever introduce a new variant of DISPLAYFORMAT_*, without
> providing any meaningful additional security.

Ok.

>> @@ -855,7 +889,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>
>>         case T_DiscardStmt:
>>             /* should we allow DISCARD PLANS? */
>> -            CheckRestrictedOperation("DISCARD");
>> +            CheckRestrictedOperation(COMMANDTAG_DISCARD);
>>             DiscardCommand((DiscardStmt *) parsetree, isTopLevel);
>>             break;
>>
>> @@ -974,7 +1008,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecuteGrantStmt(stmt);
>>             }
>> @@ -987,7 +1021,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->removeType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecDropStmt(stmt, isTopLevel);
>>             }
>> @@ -1000,7 +1034,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->renameType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecRenameStmt(stmt);
>>             }
>> @@ -1013,7 +1047,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterObjectDependsStmt(stmt, NULL);
>>             }
>> @@ -1026,7 +1060,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterObjectSchemaStmt(stmt, NULL);
>>             }
>> @@ -1039,7 +1073,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterOwnerStmt(stmt);
>>             }
>> @@ -1052,7 +1086,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     CommentObject(stmt);
>>                 break;
>> @@ -1065,7 +1099,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>
> Not this patch's fault or task. But I hate this type of code - needing
> to touch a dozen places for new type of statement is just
> insane. utility.c should long have been rewritten to just have one
> metadata table for nearly all of this. Perhaps with a few callbacks for
> special cases.

No objection from me, though I’d have to see the alternative and what it does to performance.

>> +static const char * tag_names[] = {
>> +    "???",
>> +    "ALTER ACCESS METHOD",
>> +    "ALTER AGGREGATE",
>> +    "ALTER CAST",
>
> This seems problematic to maintain, because the order needs to match
> between this and something defined in a header - and there's no
> guarantee a misordering is immediately noticeable. We should either go
> for my metadata table idea, or at least rewrite this, even if more
> verbose, to something like
>
> static const char * tag_names[] = {
>    [COMMAND_TAG_ALTER_ACCESS_METHOD] = "ALTER ACCESS METHOD",
>    ...
>
> I think the fact that this would show up in a grep for
> COMMAND_TAG_ALTER_ACCESS_METHOD is good too.

I had something closer to what you’re asking for as part of the v1 patch and ripped it out to get the code size down.
Avoidingcode bloat was one of Tom's concerns.  What you are suggesting is admittedly better than what I ripped out,
though. 

>> +/*
>> + * Search CommandTag by name
>> + *
>> + * Returns CommandTag, or COMMANDTAG_UNKNOWN if not recognized
>> + */
>> +CommandTag
>> +GetCommandTagEnum(const char *commandname)
>> +{
>> +    const char **base, **last, **position;
>> +    int         result;
>> +
>> +    OPTIONALLY_CHECK_COMMAND_TAGS();
>> +    if (commandname == NULL || *commandname == '\0')
>> +        return COMMANDTAG_UNKNOWN;
>> +
>> +    base = tag_names;
>> +    last = tag_names + tag_name_length - 1;
>> +    while (last >= base)
>> +    {
>> +        position = base + ((last - base) >> 1);
>> +        result = pg_strcasecmp(commandname, *position);
>> +        if (result == 0)
>> +            return (CommandTag) (position - tag_names);
>> +        else if (result < 0)
>> +            last = position - 1;
>> +        else
>> +            base = position + 1;
>> +    }
>> +    return COMMANDTAG_UNKNOWN;
>> +}
>
> This seems pretty grotty - but you get rid of it later. See my comments there.
>
>
>
>> +#ifdef COMMANDTAG_CHECKING
>> +bool
>> +CheckCommandTagEnum()
>> +{
>> +    CommandTag    i, j;
>> +
>> +    if (FIRST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < FIRST_COMMAND_TAG)
>> +    {
>> +        elog(ERROR, "FIRST_COMMAND_TAG (%u), LAST_COMMAND_TAG (%u) not reasonable",
>> +             (unsigned int) FIRST_COMMAND_TAG, (unsigned int) LAST_COMMAND_TAG);
>> +        return false;
>> +    }
>> +    if (FIRST_COMMAND_TAG != (CommandTag)0)
>> +    {
>> +        elog(ERROR, "FIRST_COMMAND_TAG (%u) != 0", (unsigned int) FIRST_COMMAND_TAG);
>> +        return false;
>> +    }
>> +    if (LAST_COMMAND_TAG != (CommandTag)(tag_name_length - 1))
>> +    {
>> +        elog(ERROR, "LAST_COMMAND_TAG (%u) != tag_name_length (%u)",
>> +             (unsigned int) LAST_COMMAND_TAG, (unsigned int) tag_name_length);
>> +        return false;
>> +    }
>
> These all seem to want to be static asserts.
>
>
>> +    for (i = FIRST_COMMAND_TAG; i < LAST_COMMAND_TAG; i++)
>> +    {
>> +        for (j = i+1; j < LAST_COMMAND_TAG; j++)
>> +        {
>> +            int cmp = strcmp(tag_names[i], tag_names[j]);
>> +            if (cmp == 0)
>> +            {
>> +                elog(ERROR, "Found duplicate tag_name: \"%s\"",
>> +                    tag_names[i]);
>> +                return false;
>> +            }
>> +            if (cmp > 0)
>> +            {
>> +                elog(ERROR, "Found commandnames out of order: \"%s\" before \"%s\"",
>> +                    tag_names[i], tag_names[j]);
>> +                return false;
>> +            }
>> +        }
>> +    }
>> +    return true;
>> +}
>
> And I think we could get rid of this with my earlier suggestions?
>
>
>> +/*
>> + * BEWARE: These are in sorted order, but ordered by their printed
>> + *         values in the tag_name list (see common/commandtag.c).
>> + *         In particular it matters because the sort ordering changes
>> + *         when you replace a space with an underscore.  To wit:
>> + *
>> + *             "CREATE TABLE"
>> + *             "CREATE TABLE AS"
>> + *             "CREATE TABLESPACE"
>> + *
>> + *         but...
>> + *
>> + *             CREATE_TABLE
>> + *             CREATE_TABLESPACE
>> + *             CREATE_TABLE_AS
>> + *
>> + *         It also matters that COMMANDTAG_UNKNOWN is written "???".
>> + *
>> + * If you add a value here, add it in common/commandtag.c also, and
>> + * be careful to get the ordering right.  You can build with
>> + * COMMANDTAG_CHECKING to have this automatically checked
>> + * at runtime, but that adds considerable overhead, so do so sparingly.
>> + */
>> +typedef enum CommandTag
>> +{
>
> This seems pretty darn nightmarish.
>
>
>> +#define FIRST_COMMAND_TAG COMMANDTAG_UNKNOWN
>> +    COMMANDTAG_UNKNOWN,
>> +    COMMANDTAG_ALTER_ACCESS_METHOD,
>> +    COMMANDTAG_ALTER_AGGREGATE,
>> +    COMMANDTAG_ALTER_CAST,
>> +    COMMANDTAG_ALTER_COLLATION,
>> +    COMMANDTAG_ALTER_CONSTRAINT,
>> +    COMMANDTAG_ALTER_CONVERSION,
>> +    COMMANDTAG_ALTER_DATABASE,
>> +    COMMANDTAG_ALTER_DEFAULT_PRIVILEGES,
>> +    COMMANDTAG_ALTER_DOMAIN,
>> [...]
>
> I'm a bit worried that this basically duplicates a good portion of NodeTag, without having otherwise much of a point?

I never quite came up with a one-size-fits-all enumeration.  There are lots of places where these enumerations seem to
almostmap onto each other, but with special cases that don’t line up.  I’m open to suggestions. 

>> From a70b0cadc1142e92b2354a0ca3cd47aaeb0c148e Mon Sep 17 00:00:00 2001
>> From: Mark Dilger <mark.dilger@enterprisedb.com>
>> Date: Tue, 4 Feb 2020 12:25:05 -0800
>> Subject: [PATCH v2 2/3] Using a Bitmapset of tags rather than a string array.
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> EventTriggerCacheItem no longer holds an array of palloc’d tag strings
>> in sorted order, but rather just a Bitmapset over the CommandTags.  This
>> makes the code a little simpler and easier to read, in my opinion.  In
>> filter_event_trigger, rather than running bsearch through a sorted array
>> of strings, it just runs bms_is_member.
>> ---
>
> It seems weird to add the bsearch just to remove it immediately again a
> patch later. This probably should just go first?

I’m not sure what you mean.  That bsearch is pre-existing, not mine.

>> diff --git a/src/test/regress/sql/event_trigger.sql b/src/test/regress/sql/event_trigger.sql
>> index 346168673d..cad02212ad 100644
>> --- a/src/test/regress/sql/event_trigger.sql
>> +++ b/src/test/regress/sql/event_trigger.sql
>> @@ -10,6 +10,13 @@ BEGIN
>> END
>> $$ language plpgsql;
>>
>> +-- OK
>> +create function test_event_trigger2() returns event_trigger as $$
>> +BEGIN
>> +    RAISE NOTICE 'test_event_trigger2: % %', tg_event, tg_tag;
>> +END
>> +$$ LANGUAGE plpgsql;
>> +
>> -- should fail, event triggers cannot have declared arguments
>> create function test_event_trigger_arg(name text)
>> returns event_trigger as $$ BEGIN RETURN 1; END $$ language plpgsql;
>> @@ -82,6 +89,783 @@ create event trigger regress_event_trigger2 on ddl_command_start
>> -- OK
>> comment on event trigger regress_event_trigger is 'test comment';
>>
>> +-- These are all unsupported
>> +create event trigger regress_event_triger_NULL on ddl_command_start
>> +   when tag in ('')
>> +   execute procedure test_event_trigger2();
>> +
>> +create event trigger regress_event_triger_UNKNOWN on ddl_command_start
>> +   when tag in ('???')
>> +   execute procedure test_event_trigger2();
>> +
>> +create event trigger regress_event_trigger_ALTER_DATABASE on ddl_command_start
>> +   when tag in ('ALTER DATABASE')
>> +   execute procedure test_event_trigger2();
> [...]
>
> There got to be a more maintainable way to write this.

Yeah, I already conceded to Tom in his review that I’m not wedded to committing this test in any form, let alone in
thisform.  That’s part of why I kept it as a separate patch file.  But if you like what it is doing, and just don’t
likethe verbosity, I can try harder to compress it. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:
Andres,

The previous patch set seemed to cause confusion, having separated changes into multiple files.  The latest patch,
heavilyinfluenced by your review, is all in one file, attached. 

> On Feb 4, 2020, at 7:34 PM, Andres Freund <andres@anarazel.de> wrote:
>
>
>> These are replaced by switch() case statements over the possible commandTags:
>>
>> +       switch (commandTag)
>> +       {
>> +                       /*
>> +                        * Supported idiosyncratic special cases.
>> +                        */
>> +               case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
>> +               case COMMANDTAG_ALTER_LARGE_OBJECT:
>> +               case COMMANDTAG_COMMENT:
>> +               case COMMANDTAG_CREATE_TABLE_AS:
>> +               case COMMANDTAG_DROP_OWNED:
>> +               case COMMANDTAG_GRANT:
>> +               case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
>> +               case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
>> +               case COMMANDTAG_REVOKE:
>> +               case COMMANDTAG_SECURITY_LABEL:
>> +               case COMMANDTAG_SELECT_INTO:
>
> The number of these makes me wonder if we should just have a metadata
> table in one place, instead of needing to edit multiple
> locations. Something like
>
> const ... CommandTagBehaviour[] = {
>    [COMMANDTAG_INSERT] = {
>        .display_processed = true, .display_oid = true, ...},
>    [COMMANDTAG_CREATE_TABLE_AS] = {
>        .event_trigger = true, ...},
>
> with the zero initialized defaults being the common cases.
>
> Not sure if it's worth going there. But it's maybe worth thinking about
> for a minute?

The v3 patch does something like you suggest.

The only gotcha I came across while reorganizing the code this way is that exec_replication_command(…) outputs “SELECT”
ratherthan “SELECT <ROWCOUNT>” as is done everywhere else.  Strangely, exec_replication_command(…) does output the
rowcountfor “COPY <ROWCOUNT>”, which matches how COPY is handled elsewhere.  I can’t see any logic in this.  I’m
concernedthat outputting “SELECT 0” from exec_replication_command rather than “SELECT” as is currently done will break
someclient somewhere, though none that I can find. 

Putting the display information into the CommandTag behavior table forces the behavior per tag to be the same
everywhere,which forces this change on exec_replication_command. 

To get around this, I’ve added an extremely bogus extra boolean argument to EndCommand, force_undecorated_output, that
isfalse from all callers except exec_replication_command(…) in the one spot I described. 

I don’t know whether the code should be committed this way, but I need something as a placeholder until I get a better
understandingof why exec_replication_command(…) behaves as it does and what I should do about it in the patch. 

>> Averages for test set 1 by scale:
>> set    scale    tps    avg_latency    90%<    max_latency
>> 1    1    3741    1.734    3.162    133.718
>> 1    9    6124    0.904    1.05    230.547
>> 1    81    5921    0.931    1.015    67.023
>>
>> Averages for test set 1 by clients:
>> set    clients    tps    avg_latency    90%<    max_latency
>> 1    1    2163    0.461    0.514    24.414
>> 1    4    5968    0.675    0.791    40.354
>> 1    16    7655    2.433    3.922    366.519
>>
>>
>> For command tag patch (branched from 1fd687a035):
>>
>>     postgresql % find src -type f -name "*.c" -or -name "*.h" | xargs cat | wc
>>      1482969 5691908 45281399
>>
>>     postgresql % find src -type f -name "*.o" | xargs cat | wc
>>        38209  476243 18999752
>>
>>
>> Averages for test set 1 by scale:
>> set    scale    tps    avg_latency    90%<    max_latency
>> 1    1    3877    1.645    3.066    24.973
>> 1    9    6383    0.859    1.032    64.566
>> 1    81    5945    0.925    1.023    162.9
>>
>> Averages for test set 1 by clients:
>> set    clients    tps    avg_latency    90%<    max_latency
>> 1    1    2141    0.466    0.522    11.531
>> 1    4    5967    0.673    0.783    136.882
>> 1    16    8096    2.292    3.817    104.026
>
> Not bad.
>
>
>> diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
>> index 9aa2b61600..5322c14ce4 100644
>> --- a/src/backend/commands/async.c
>> +++ b/src/backend/commands/async.c
>> @@ -594,7 +594,7 @@ pg_notify(PG_FUNCTION_ARGS)
>>         payload = text_to_cstring(PG_GETARG_TEXT_PP(1));
>>
>>     /* For NOTIFY as a statement, this is checked in ProcessUtility */
>> -    PreventCommandDuringRecovery("NOTIFY");
>> +    PreventCommandDuringRecovery(COMMANDTAG_NOTIFY);
>>
>>     Async_Notify(channel, payload);
>>
>> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
>> index 40a8ec1abd..4828e75bd5 100644
>> --- a/src/backend/commands/copy.c
>> +++ b/src/backend/commands/copy.c
>> @@ -1063,7 +1063,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
>>
>>         /* check read-only transaction and parallel mode */
>>         if (XactReadOnly && !rel->rd_islocaltemp)
>> -            PreventCommandIfReadOnly("COPY FROM");
>> +            PreventCommandIfReadOnly(COMMANDTAG_COPY_FROM);
>>
>>         cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
>>                                NULL, stmt->attlist, stmt->options);
>
> I'm not sure this really ought to be part of this change - seems like a
> somewhat independent change to me. With less obvious benefits.

This is changed back in v3 to be more like how it was before.

>> static event_trigger_command_tag_check_result
>> -check_ddl_tag(const char *tag)
>> +check_ddl_tag(CommandTag commandTag)
>> {
>> -    const char *obtypename;
>> -    const event_trigger_support_data *etsd;
>> +    switch (commandTag)
>> +    {
>> +            /*
>> +             * Supported idiosyncratic special cases.
>> +             */
>> +        case COMMANDTAG_ALTER_DEFAULT_PRIVILEGES:
>> +        case COMMANDTAG_ALTER_LARGE_OBJECT:
>> +        case COMMANDTAG_COMMENT:
>> +        case COMMANDTAG_CREATE_TABLE_AS:
>> +        case COMMANDTAG_DROP_OWNED:
>> +        case COMMANDTAG_GRANT:
>> +        case COMMANDTAG_IMPORT_FOREIGN_SCHEMA:
>> +        case COMMANDTAG_REFRESH_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_REVOKE:
>> +        case COMMANDTAG_SECURITY_LABEL:
>> +        case COMMANDTAG_SELECT_INTO:
>>
>> -    /*
>> -     * Handle some idiosyncratic special cases.
>> -     */
>> -    if (pg_strcasecmp(tag, "CREATE TABLE AS") == 0 ||
>> -        pg_strcasecmp(tag, "SELECT INTO") == 0 ||
>> -        pg_strcasecmp(tag, "REFRESH MATERIALIZED VIEW") == 0 ||
>> -        pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
>> -        pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 0 ||
>> -        pg_strcasecmp(tag, "COMMENT") == 0 ||
>> -        pg_strcasecmp(tag, "GRANT") == 0 ||
>> -        pg_strcasecmp(tag, "REVOKE") == 0 ||
>> -        pg_strcasecmp(tag, "DROP OWNED") == 0 ||
>> -        pg_strcasecmp(tag, "IMPORT FOREIGN SCHEMA") == 0 ||
>> -        pg_strcasecmp(tag, "SECURITY LABEL") == 0)
>> -        return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +            /*
>> +             * Supported CREATE commands
>> +             */
>> +        case COMMANDTAG_CREATE_ACCESS_METHOD:
>> +        case COMMANDTAG_CREATE_AGGREGATE:
>> +        case COMMANDTAG_CREATE_CAST:
>> +        case COMMANDTAG_CREATE_COLLATION:
>> +        case COMMANDTAG_CREATE_CONSTRAINT:
>> +        case COMMANDTAG_CREATE_CONVERSION:
>> +        case COMMANDTAG_CREATE_DOMAIN:
>> +        case COMMANDTAG_CREATE_EXTENSION:
>> +        case COMMANDTAG_CREATE_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_CREATE_FOREIGN_TABLE:
>> +        case COMMANDTAG_CREATE_FUNCTION:
>> +        case COMMANDTAG_CREATE_INDEX:
>> +        case COMMANDTAG_CREATE_LANGUAGE:
>> +        case COMMANDTAG_CREATE_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_CREATE_OPERATOR:
>> +        case COMMANDTAG_CREATE_OPERATOR_CLASS:
>> +        case COMMANDTAG_CREATE_OPERATOR_FAMILY:
>> +        case COMMANDTAG_CREATE_POLICY:
>> +        case COMMANDTAG_CREATE_PROCEDURE:
>> +        case COMMANDTAG_CREATE_PUBLICATION:
>> +        case COMMANDTAG_CREATE_ROUTINE:
>> +        case COMMANDTAG_CREATE_RULE:
>> +        case COMMANDTAG_CREATE_SCHEMA:
>> +        case COMMANDTAG_CREATE_SEQUENCE:
>> +        case COMMANDTAG_CREATE_SERVER:
>> +        case COMMANDTAG_CREATE_STATISTICS:
>> +        case COMMANDTAG_CREATE_SUBSCRIPTION:
>> +        case COMMANDTAG_CREATE_TABLE:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_CREATE_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_CREATE_TRANSFORM:
>> +        case COMMANDTAG_CREATE_TRIGGER:
>> +        case COMMANDTAG_CREATE_TYPE:
>> +        case COMMANDTAG_CREATE_USER_MAPPING:
>> +        case COMMANDTAG_CREATE_VIEW:
>>
>> -    /*
>> -     * Otherwise, command should be CREATE, ALTER, or DROP.
>> -     */
>> -    if (pg_strncasecmp(tag, "CREATE ", 7) == 0)
>> -        obtypename = tag + 7;
>> -    else if (pg_strncasecmp(tag, "ALTER ", 6) == 0)
>> -        obtypename = tag + 6;
>> -    else if (pg_strncasecmp(tag, "DROP ", 5) == 0)
>> -        obtypename = tag + 5;
>> -    else
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> +            /*
>> +             * Supported ALTER commands
>> +             */
>> +        case COMMANDTAG_ALTER_ACCESS_METHOD:
>> +        case COMMANDTAG_ALTER_AGGREGATE:
>> +        case COMMANDTAG_ALTER_CAST:
>> +        case COMMANDTAG_ALTER_COLLATION:
>> +        case COMMANDTAG_ALTER_CONSTRAINT:
>> +        case COMMANDTAG_ALTER_CONVERSION:
>> +        case COMMANDTAG_ALTER_DOMAIN:
>> +        case COMMANDTAG_ALTER_EXTENSION:
>> +        case COMMANDTAG_ALTER_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_ALTER_FOREIGN_TABLE:
>> +        case COMMANDTAG_ALTER_FUNCTION:
>> +        case COMMANDTAG_ALTER_INDEX:
>> +        case COMMANDTAG_ALTER_LANGUAGE:
>> +        case COMMANDTAG_ALTER_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_ALTER_OPERATOR:
>> +        case COMMANDTAG_ALTER_OPERATOR_CLASS:
>> +        case COMMANDTAG_ALTER_OPERATOR_FAMILY:
>> +        case COMMANDTAG_ALTER_POLICY:
>> +        case COMMANDTAG_ALTER_PROCEDURE:
>> +        case COMMANDTAG_ALTER_PUBLICATION:
>> +        case COMMANDTAG_ALTER_ROUTINE:
>> +        case COMMANDTAG_ALTER_RULE:
>> +        case COMMANDTAG_ALTER_SCHEMA:
>> +        case COMMANDTAG_ALTER_SEQUENCE:
>> +        case COMMANDTAG_ALTER_SERVER:
>> +        case COMMANDTAG_ALTER_STATISTICS:
>> +        case COMMANDTAG_ALTER_SUBSCRIPTION:
>> +        case COMMANDTAG_ALTER_TABLE:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_ALTER_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_ALTER_TRANSFORM:
>> +        case COMMANDTAG_ALTER_TRIGGER:
>> +        case COMMANDTAG_ALTER_TYPE:
>> +        case COMMANDTAG_ALTER_USER_MAPPING:
>> +        case COMMANDTAG_ALTER_VIEW:
>>
>> -    /*
>> -     * ...and the object type should be something recognizable.
>> -     */
>> -    for (etsd = event_trigger_support; etsd->obtypename != NULL; etsd++)
>> -        if (pg_strcasecmp(etsd->obtypename, obtypename) == 0)
>> +            /*
>> +             * Supported DROP commands
>> +             */
>> +        case COMMANDTAG_DROP_ACCESS_METHOD:
>> +        case COMMANDTAG_DROP_AGGREGATE:
>> +        case COMMANDTAG_DROP_CAST:
>> +        case COMMANDTAG_DROP_COLLATION:
>> +        case COMMANDTAG_DROP_CONSTRAINT:
>> +        case COMMANDTAG_DROP_CONVERSION:
>> +        case COMMANDTAG_DROP_DOMAIN:
>> +        case COMMANDTAG_DROP_EXTENSION:
>> +        case COMMANDTAG_DROP_FOREIGN_DATA_WRAPPER:
>> +        case COMMANDTAG_DROP_FOREIGN_TABLE:
>> +        case COMMANDTAG_DROP_FUNCTION:
>> +        case COMMANDTAG_DROP_INDEX:
>> +        case COMMANDTAG_DROP_LANGUAGE:
>> +        case COMMANDTAG_DROP_MATERIALIZED_VIEW:
>> +        case COMMANDTAG_DROP_OPERATOR:
>> +        case COMMANDTAG_DROP_OPERATOR_CLASS:
>> +        case COMMANDTAG_DROP_OPERATOR_FAMILY:
>> +        case COMMANDTAG_DROP_POLICY:
>> +        case COMMANDTAG_DROP_PROCEDURE:
>> +        case COMMANDTAG_DROP_PUBLICATION:
>> +        case COMMANDTAG_DROP_ROUTINE:
>> +        case COMMANDTAG_DROP_RULE:
>> +        case COMMANDTAG_DROP_SCHEMA:
>> +        case COMMANDTAG_DROP_SEQUENCE:
>> +        case COMMANDTAG_DROP_SERVER:
>> +        case COMMANDTAG_DROP_STATISTICS:
>> +        case COMMANDTAG_DROP_SUBSCRIPTION:
>> +        case COMMANDTAG_DROP_TABLE:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_CONFIGURATION:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_DICTIONARY:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_PARSER:
>> +        case COMMANDTAG_DROP_TEXT_SEARCH_TEMPLATE:
>> +        case COMMANDTAG_DROP_TRANSFORM:
>> +        case COMMANDTAG_DROP_TRIGGER:
>> +        case COMMANDTAG_DROP_TYPE:
>> +        case COMMANDTAG_DROP_USER_MAPPING:
>> +        case COMMANDTAG_DROP_VIEW:
>> +            return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +
>> +            /*
>> +             * Unsupported CREATE commands
>> +             */
>> +        case COMMANDTAG_CREATE_DATABASE:
>> +        case COMMANDTAG_CREATE_EVENT_TRIGGER:
>> +        case COMMANDTAG_CREATE_ROLE:
>> +        case COMMANDTAG_CREATE_TABLESPACE:
>> +
>> +            /*
>> +             * Unsupported ALTER commands
>> +             */
>> +        case COMMANDTAG_ALTER_DATABASE:
>> +        case COMMANDTAG_ALTER_EVENT_TRIGGER:
>> +        case COMMANDTAG_ALTER_ROLE:
>> +        case COMMANDTAG_ALTER_TABLESPACE:
>> +
>> +            /*
>> +             * Unsupported DROP commands
>> +             */
>> +        case COMMANDTAG_DROP_DATABASE:
>> +        case COMMANDTAG_DROP_EVENT_TRIGGER:
>> +        case COMMANDTAG_DROP_ROLE:
>> +        case COMMANDTAG_DROP_TABLESPACE:
>> +
>> +            /*
>> +             * Other unsupported commands.  These used to return
>> +             * EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED prior to the
>> +             * conversion of commandTag from string to enum.
>> +             */
>> +        case COMMANDTAG_ALTER_SYSTEM:
>> +        case COMMANDTAG_ANALYZE:
>> +        case COMMANDTAG_BEGIN:
>> +        case COMMANDTAG_CALL:
>> +        case COMMANDTAG_CHECKPOINT:
>> +        case COMMANDTAG_CLOSE:
>> +        case COMMANDTAG_CLOSE_CURSOR:
>> +        case COMMANDTAG_CLOSE_CURSOR_ALL:
>> +        case COMMANDTAG_CLUSTER:
>> +        case COMMANDTAG_COMMIT:
>> +        case COMMANDTAG_COMMIT_PREPARED:
>> +        case COMMANDTAG_COPY:
>> +        case COMMANDTAG_COPY_FROM:
>> +        case COMMANDTAG_DEALLOCATE:
>> +        case COMMANDTAG_DEALLOCATE_ALL:
>> +        case COMMANDTAG_DECLARE_CURSOR:
>> +        case COMMANDTAG_DELETE:
>> +        case COMMANDTAG_DISCARD:
>> +        case COMMANDTAG_DISCARD_ALL:
>> +        case COMMANDTAG_DISCARD_PLANS:
>> +        case COMMANDTAG_DISCARD_SEQUENCES:
>> +        case COMMANDTAG_DISCARD_TEMP:
>> +        case COMMANDTAG_DO:
>> +        case COMMANDTAG_DROP_REPLICATION_SLOT:
>> +        case COMMANDTAG_EXECUTE:
>> +        case COMMANDTAG_EXPLAIN:
>> +        case COMMANDTAG_FETCH:
>> +        case COMMANDTAG_GRANT_ROLE:
>> +        case COMMANDTAG_INSERT:
>> +        case COMMANDTAG_LISTEN:
>> +        case COMMANDTAG_LOAD:
>> +        case COMMANDTAG_LOCK_TABLE:
>> +        case COMMANDTAG_MOVE:
>> +        case COMMANDTAG_NOTIFY:
>> +        case COMMANDTAG_PREPARE:
>> +        case COMMANDTAG_PREPARE_TRANSACTION:
>> +        case COMMANDTAG_REASSIGN_OWNED:
>> +        case COMMANDTAG_REINDEX:
>> +        case COMMANDTAG_RELEASE:
>> +        case COMMANDTAG_RESET:
>> +        case COMMANDTAG_REVOKE_ROLE:
>> +        case COMMANDTAG_ROLLBACK:
>> +        case COMMANDTAG_ROLLBACK_PREPARED:
>> +        case COMMANDTAG_SAVEPOINT:
>> +        case COMMANDTAG_SELECT:
>> +        case COMMANDTAG_SELECT_FOR_KEY_SHARE:
>> +        case COMMANDTAG_SELECT_FOR_NO_KEY_UPDATE:
>> +        case COMMANDTAG_SELECT_FOR_SHARE:
>> +        case COMMANDTAG_SELECT_FOR_UPDATE:
>> +        case COMMANDTAG_SET:
>> +        case COMMANDTAG_SET_CONSTRAINTS:
>> +        case COMMANDTAG_SHOW:
>> +        case COMMANDTAG_START_TRANSACTION:
>> +        case COMMANDTAG_TRUNCATE_TABLE:
>> +        case COMMANDTAG_UNLISTEN:
>> +        case COMMANDTAG_UPDATE:
>> +        case COMMANDTAG_VACUUM:
>> +            return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
>> +        case COMMANDTAG_UNKNOWN:
>>             break;
>> -    if (etsd->obtypename == NULL)
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> -    if (!etsd->supported)
>> -        return EVENT_TRIGGER_COMMAND_TAG_NOT_SUPPORTED;
>> -    return EVENT_TRIGGER_COMMAND_TAG_OK;
>> +    }
>> +    return EVENT_TRIGGER_COMMAND_TAG_NOT_RECOGNIZED;
>> }
>
> This is pretty painful.

Yeah, and it’s gone in v3.  This sort of logic now lives in the behavior table in src/backend/utils/misc/commandtag.c.

>> @@ -745,7 +902,7 @@ EventTriggerCommonSetup(Node *parsetree,
>>         return NIL;
>>
>>     /* Get the command tag. */
>> -    tag = CreateCommandTag(parsetree);
>> +    tag = GetCommandTagName(CreateCommandTag(parsetree));
>>
>>     /*
>>      * Filter list of event triggers by command tag, and copy them into our
>> @@ -2136,7 +2293,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>>                     /* objsubid */
>>                     values[i++] = Int32GetDatum(addr.objectSubId);
>>                     /* command tag */
>> -                    values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
>> +                    values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>>                     /* object_type */
>>                     values[i++] = CStringGetTextDatum(type);
>>                     /* schema */
>> @@ -2161,7 +2318,7 @@ pg_event_trigger_ddl_commands(PG_FUNCTION_ARGS)
>>                 /* objsubid */
>>                 nulls[i++] = true;
>>                 /* command tag */
>> -                values[i++] = CStringGetTextDatum(CreateCommandTag(cmd->parsetree));
>> +                values[i++] = CStringGetTextDatum(GetCommandTagName(CreateCommandTag(cmd->parsetree)));
>>                 /* object_type */
>>                 values[i++] = CStringGetTextDatum(stringify_adefprivs_objtype(cmd->d.defprivs.objtype));
>>                 /* schema */
>
> So GetCommandTagName we commonly do twice for some reason? Once in
> EventTriggerCommonSetup() and then again in
> pg_event_trigger_ddl_commands()? Why is EventTriggerData.tag still the
> string?

EventTriggerCommonSetup() gets the command tag enum, not the string, at least in v3.

>
>>     Assert(list_length(plan->plancache_list) == 1);
>> @@ -1469,7 +1469,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
>>                         (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>                 /* translator: %s is a SQL statement name */
>>                          errmsg("%s is not allowed in a non-volatile function",
>> -                                CreateCommandTag((Node *) pstmt))));
>> +                                GetCommandTagName(CreateCommandTag((Node *) pstmt)))));
>
> Probably worth having a wrapper for this - these lines are pretty long,
> and there quite a number of cases like it in the patch.

I was having some trouble figuring out what to name the wrapper.  “CreateCommandTagAndGetName" is nearly as long as the
twofunction names it replaces.  “CreateCommandTagName” sounds like you’re creating a name rather than a CommandTag,
whichis misleading.  But then I realized this function was poorly named to begin with.  “Create” is an entirely
inappropriateverb for what this function does.  Even before this patch, it wasn’t creating anything.  I was looking up
aconstant string name.  Now it is looking up an enum. 

I went with CreateCommandName(…), but I think this leaves a lot to be desired.  Thoughts?

>> @@ -172,11 +175,38 @@ EndCommand(const char *commandTag, CommandDest dest)
>>         case DestRemoteSimple:
>>
>>             /*
>> -             * We assume the commandTag is plain ASCII and therefore requires
>> -             * no encoding conversion.
>> +             * We assume the tagname is plain ASCII and therefore
>> +             * requires no encoding conversion.
>>              */
>> -            pq_putmessage('C', commandTag, strlen(commandTag) + 1);
>> -            break;
>> +            tagname = GetCommandTagName(qc->commandTag);
>> +            switch (qc->display_format)
>> +            {
>> +                case DISPLAYFORMAT_PLAIN:
>> +                    pq_putmessage('C', tagname, strlen(tagname) + 1);
>> +                    break;
>> +                case DISPLAYFORMAT_LAST_OID:
>> +                    /*
>> +                     * We no longer display LastOid, but to preserve the wire protocol,
>> +                     * we write InvalidOid where the LastOid used to be written.  For
>> +                     * efficiency in the snprintf(), hard-code InvalidOid as zero.
>> +                     */
>> +                    Assert(InvalidOid == 0);
>> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>> +                                "%s 0 " UINT64_FORMAT,
>> +                                tagname,
>> +                                qc->nprocessed);
>> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
>> +                    break;
>> +                case DISPLAYFORMAT_NPROCESSED:
>> +                    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>> +                            "%s " UINT64_FORMAT,
>> +                            tagname,
>> +                            qc->nprocessed);
>> +                    pq_putmessage('C', completionTag, strlen(completionTag) + 1);
>> +                    break;
>> +                default:
>> +                    elog(ERROR, "Invalid display_format in EndCommand");
>> +            }
>
> Imo there should only be one pq_putmessage().  Also think this type of
> default: is a bad idea, just prevents the compiler from warning if we
> were to ever introduce a new variant of DISPLAYFORMAT_*, without
> providing any meaningful additional security.

This is fixed in v3.

>> @@ -855,7 +889,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>
>>         case T_DiscardStmt:
>>             /* should we allow DISCARD PLANS? */
>> -            CheckRestrictedOperation("DISCARD");
>> +            CheckRestrictedOperation(COMMANDTAG_DISCARD);
>>             DiscardCommand((DiscardStmt *) parsetree, isTopLevel);
>>             break;
>>
>> @@ -974,7 +1008,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecuteGrantStmt(stmt);
>>             }
>> @@ -987,7 +1021,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->removeType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecDropStmt(stmt, isTopLevel);
>>             }
>> @@ -1000,7 +1034,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->renameType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecRenameStmt(stmt);
>>             }
>> @@ -1013,7 +1047,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterObjectDependsStmt(stmt, NULL);
>>             }
>> @@ -1026,7 +1060,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterObjectSchemaStmt(stmt, NULL);
>>             }
>> @@ -1039,7 +1073,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objectType))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     ExecAlterOwnerStmt(stmt);
>>             }
>> @@ -1052,7 +1086,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>>                 else
>>                     CommentObject(stmt);
>>                 break;
>> @@ -1065,7 +1099,7 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>>                 if (EventTriggerSupportsObjectType(stmt->objtype))
>>                     ProcessUtilitySlow(pstate, pstmt, queryString,
>>                                        context, params, queryEnv,
>> -                                       dest, completionTag);
>> +                                       dest, qc);
>
> Not this patch's fault or task. But I hate this type of code - needing
> to touch a dozen places for new type of statement is just
> insane. utility.c should long have been rewritten to just have one
> metadata table for nearly all of this. Perhaps with a few callbacks for
> special cases.

I’ve decided not to touch this issue.  There are no changes here from how it was done in v2.

>> +static const char * tag_names[] = {
>> +    "???",
>> +    "ALTER ACCESS METHOD",
>> +    "ALTER AGGREGATE",
>> +    "ALTER CAST",
>
> This seems problematic to maintain, because the order needs to match
> between this and something defined in a header - and there's no
> guarantee a misordering is immediately noticeable. We should either go
> for my metadata table idea, or at least rewrite this, even if more
> verbose, to something like
>
> static const char * tag_names[] = {
>    [COMMAND_TAG_ALTER_ACCESS_METHOD] = "ALTER ACCESS METHOD",
>    ...
>
> I think the fact that this would show up in a grep for
> COMMAND_TAG_ALTER_ACCESS_METHOD is good too.

Rewriting this as you suggest does not prevent tag names from being out of sorted order.  Version 3 of the patch adds a
perlscript that reads commandtag.h and commandtag.c during the build process and stops the build with a brief error
messageif they don’t match, are malformed, or if the sorting order is wrong.  The script does not modify the code.  It
justreviews it for correctness.  As such, it probably doesn’t matter whether it runs on all platforms.  I did not look
intowhether this runs on Windows, but if there is any difficulty there, it could simply be disabled on that platform. 

It also doesn’t matter if this perl script gets committed.  There is a trade-off here between maintaining the script
vs.manually maintaining the enum ordering. 

>> +/*
>> + * Search CommandTag by name
>> + *
>> + * Returns CommandTag, or COMMANDTAG_UNKNOWN if not recognized
>> + */
>> +CommandTag
>> +GetCommandTagEnum(const char *commandname)
>> +{
>> +    const char **base, **last, **position;
>> +    int         result;
>> +
>> +    OPTIONALLY_CHECK_COMMAND_TAGS();
>> +    if (commandname == NULL || *commandname == '\0')
>> +        return COMMANDTAG_UNKNOWN;
>> +
>> +    base = tag_names;
>> +    last = tag_names + tag_name_length - 1;
>> +    while (last >= base)
>> +    {
>> +        position = base + ((last - base) >> 1);
>> +        result = pg_strcasecmp(commandname, *position);
>> +        if (result == 0)
>> +            return (CommandTag) (position - tag_names);
>> +        else if (result < 0)
>> +            last = position - 1;
>> +        else
>> +            base = position + 1;
>> +    }
>> +    return COMMANDTAG_UNKNOWN;
>> +}
>
> This seems pretty grotty - but you get rid of it later. See my comments there.

I kept a form of GetCommandTagEnum.

>> +#ifdef COMMANDTAG_CHECKING
>> +bool
>> +CheckCommandTagEnum()
>> +{
>> +    CommandTag    i, j;
>> +
>> +    if (FIRST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < 0 || LAST_COMMAND_TAG < FIRST_COMMAND_TAG)
>> +    {
>> +        elog(ERROR, "FIRST_COMMAND_TAG (%u), LAST_COMMAND_TAG (%u) not reasonable",
>> +             (unsigned int) FIRST_COMMAND_TAG, (unsigned int) LAST_COMMAND_TAG);
>> +        return false;
>> +    }
>> +    if (FIRST_COMMAND_TAG != (CommandTag)0)
>> +    {
>> +        elog(ERROR, "FIRST_COMMAND_TAG (%u) != 0", (unsigned int) FIRST_COMMAND_TAG);
>> +        return false;
>> +    }
>> +    if (LAST_COMMAND_TAG != (CommandTag)(tag_name_length - 1))
>> +    {
>> +        elog(ERROR, "LAST_COMMAND_TAG (%u) != tag_name_length (%u)",
>> +             (unsigned int) LAST_COMMAND_TAG, (unsigned int) tag_name_length);
>> +        return false;
>> +    }
>
> These all seem to want to be static asserts.

This is all gone now, either to the perl script or to a StaticAssert, or to a bit of both.

>> +    for (i = FIRST_COMMAND_TAG; i < LAST_COMMAND_TAG; i++)
>> +    {
>> +        for (j = i+1; j < LAST_COMMAND_TAG; j++)
>> +        {
>> +            int cmp = strcmp(tag_names[i], tag_names[j]);
>> +            if (cmp == 0)
>> +            {
>> +                elog(ERROR, "Found duplicate tag_name: \"%s\"",
>> +                    tag_names[i]);
>> +                return false;
>> +            }
>> +            if (cmp > 0)
>> +            {
>> +                elog(ERROR, "Found commandnames out of order: \"%s\" before \"%s\"",
>> +                    tag_names[i], tag_names[j]);
>> +                return false;
>> +            }
>> +        }
>> +    }
>> +    return true;
>> +}
>
> And I think we could get rid of this with my earlier suggestions?

This is now handled by the perl script, also.

>> +/*
>> + * BEWARE: These are in sorted order, but ordered by their printed
>> + *         values in the tag_name list (see common/commandtag.c).
>> + *         In particular it matters because the sort ordering changes
>> + *         when you replace a space with an underscore.  To wit:
>> + *
>> + *             "CREATE TABLE"
>> + *             "CREATE TABLE AS"
>> + *             "CREATE TABLESPACE"
>> + *
>> + *         but...
>> + *
>> + *             CREATE_TABLE
>> + *             CREATE_TABLESPACE
>> + *             CREATE_TABLE_AS
>> + *
>> + *         It also matters that COMMANDTAG_UNKNOWN is written "???".
>> + *
>> + * If you add a value here, add it in common/commandtag.c also, and
>> + * be careful to get the ordering right.  You can build with
>> + * COMMANDTAG_CHECKING to have this automatically checked
>> + * at runtime, but that adds considerable overhead, so do so sparingly.
>> + */
>> +typedef enum CommandTag
>> +{
>
> This seems pretty darn nightmarish.

Well, it does get automatically checked for you.

>> +#define FIRST_COMMAND_TAG COMMANDTAG_UNKNOWN
>> +    COMMANDTAG_UNKNOWN,
>> +    COMMANDTAG_ALTER_ACCESS_METHOD,
>> +    COMMANDTAG_ALTER_AGGREGATE,
>> +    COMMANDTAG_ALTER_CAST,
>> +    COMMANDTAG_ALTER_COLLATION,
>> +    COMMANDTAG_ALTER_CONSTRAINT,
>> +    COMMANDTAG_ALTER_CONVERSION,
>> +    COMMANDTAG_ALTER_DATABASE,
>> +    COMMANDTAG_ALTER_DEFAULT_PRIVILEGES,
>> +    COMMANDTAG_ALTER_DOMAIN,
>> [...]
>
> I'm a bit worried that this basically duplicates a good portion of NodeTag, without having otherwise much of a point?

There is not enough overlap between NodeTag and CommandTag for any obvious consolidation.  Feel free to recommend
somethingspecific. 

>> From a70b0cadc1142e92b2354a0ca3cd47aaeb0c148e Mon Sep 17 00:00:00 2001
>> From: Mark Dilger <mark.dilger@enterprisedb.com>
>> Date: Tue, 4 Feb 2020 12:25:05 -0800
>> Subject: [PATCH v2 2/3] Using a Bitmapset of tags rather than a string array.
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> EventTriggerCacheItem no longer holds an array of palloc’d tag strings
>> in sorted order, but rather just a Bitmapset over the CommandTags.  This
>> makes the code a little simpler and easier to read, in my opinion.  In
>> filter_event_trigger, rather than running bsearch through a sorted array
>> of strings, it just runs bms_is_member.
>> ---
>
> It seems weird to add the bsearch just to remove it immediately again a
> patch later. This probably should just go first?

I still don’t know what this comment means.

>> diff --git a/src/test/regress/sql/event_trigger.sql b/src/test/regress/sql/event_trigger.sql
>> index 346168673d..cad02212ad 100644
>> --- a/src/test/regress/sql/event_trigger.sql
>> +++ b/src/test/regress/sql/event_trigger.sql
>> @@ -10,6 +10,13 @@ BEGIN
>> END
>> $$ language plpgsql;
>>
>> +-- OK
>> +create function test_event_trigger2() returns event_trigger as $$
>> +BEGIN
>> +    RAISE NOTICE 'test_event_trigger2: % %', tg_event, tg_tag;
>> +END
>> +$$ LANGUAGE plpgsql;
>> +
>> -- should fail, event triggers cannot have declared arguments
>> create function test_event_trigger_arg(name text)
>> returns event_trigger as $$ BEGIN RETURN 1; END $$ language plpgsql;
>> @@ -82,6 +89,783 @@ create event trigger regress_event_trigger2 on ddl_command_start
>> -- OK
>> comment on event trigger regress_event_trigger is 'test comment';
>>
>> +-- These are all unsupported
>> +create event trigger regress_event_triger_NULL on ddl_command_start
>> +   when tag in ('')
>> +   execute procedure test_event_trigger2();
>> +
>> +create event trigger regress_event_triger_UNKNOWN on ddl_command_start
>> +   when tag in ('???')
>> +   execute procedure test_event_trigger2();
>> +
>> +create event trigger regress_event_trigger_ALTER_DATABASE on ddl_command_start
>> +   when tag in ('ALTER DATABASE')
>> +   execute procedure test_event_trigger2();
> [...]
>
> There got to be a more maintainable way to write this.

This has all been removed from version 3 of the patch set.




—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Вложения

Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-07, Mark Dilger wrote:

> Andres,
> 
> The previous patch set seemed to cause confusion, having separated
> changes into multiple files.  The latest patch, heavily influenced by
> your review, is all in one file, attached.

Cool stuff.

I think is a little confused about what is source and what is generated.
I'm not clear why commandtag.c is a C file at all; wouldn't it be
simpler to have it as a Data::Dumper file or some sort of Perl struct,
so that it can be read easily by the Perl file?  Similar to the
src/include/catalog/pg_*.dat files.  That script can then generate all
the needed .c and .h files, which are not going to be part of the source
tree, and will always be in-sync and won't have the formatting
strictness about it.  And you won't have the Martian syntax you had to
use in the commandtag.c file.

As for API, please don't shorten things such as SetQC, just use
SetQueryCompletion.  Perhaps call it SetCompletionTag or SetCommandTag?.
(I'm not sure about the name "QueryCompletionData"; maybe CommandTag or
QueryTag would work better for that struct.  There seems to be a lot of
effort in separating QueryCompletion from CommandTag; is that really
necessary?)  Lastly, we have a convention that we have a struct called
FooData, and a typedef FooData *Foo, then use the latter in the API.
We don't adhere to that 100%, and some people dislike it, but I'd rather
be consistent and not be passing "FooData *" around; it's just noisier.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 11, 2020, at 11:09 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Feb-07, Mark Dilger wrote:
>
>> Andres,
>>
>> The previous patch set seemed to cause confusion, having separated
>> changes into multiple files.  The latest patch, heavily influenced by
>> your review, is all in one file, attached.
>
> Cool stuff.

Thanks for the review!

> I think is a little confused about what is source and what is generated.

The perl file generates nothing.  It merely checks that the .h and .c files are valid and consistent with each other.

> I'm not clear why commandtag.c is a C file at all; wouldn't it be
> simpler to have it as a Data::Dumper file or some sort of Perl struct,
> so that it can be read easily by the Perl file?  Similar to the
> src/include/catalog/pg_*.dat files.  That script can then generate all
> the needed .c and .h files, which are not going to be part of the source
> tree, and will always be in-sync and won't have the formatting
> strictness about it.  And you won't have the Martian syntax you had to
> use in the commandtag.c file.

I thought about generating the files rather than merely checking them.  I could see arguments both ways.  I wasn’t sure
whetherthere would be broad support for having yet another perl script generating source files, nor for the maintenance
burdenof having to do that on all platforms.  Having a perl script that merely sanity checks the source files has the
advantagethat there is no requirement for it to function on all platforms.  There’s not even a requirement for it to be
committedto the tree, since you could also argue that the maintenance burden of the script outweighs the burden of
gettingthe source files right by hand. 

> As for API, please don't shorten things such as SetQC, just use
> SetQueryCompletion.  Perhaps call it SetCompletionTag or SetCommandTag?.
> (I'm not sure about the name "QueryCompletionData"; maybe CommandTag or
> QueryTag would work better for that struct.

I am happy to rename it as SetQueryCompletion.

> There seems to be a lot of
> effort in separating QueryCompletion from CommandTag; is that really
> necessary?)

For some code paths, prior to this patch, the commandTag gets changed before returning, and I’m not just talking about
thechange where the rowcount gets written into the commandTag string.  I have a work-in-progress patch to provide
systemviews to track the number of commands executed of each type, and that patch includes the ability to distinguish
betweenwhat the command started as and what it completed as, so I do want to keep those concepts separate.  I rejected
the“SetCommandTag” naming suggestion above because we’re really setting information about the completion (what it
finishedas) and not the command (what it started as).  I rejected the “SetCompletionTag” naming because it’s not just
thetag that is being set, but both the tag and the row count.  I am happy with “SetQueryCompletion” because that naming
isconsistent with setting the pair of values. 

> Lastly, we have a convention that we have a struct called
> FooData, and a typedef FooData *Foo, then use the latter in the API.
> We don't adhere to that 100%, and some people dislike it, but I'd rather
> be consistent and not be passing "FooData *" around; it's just noisier.

I’m familiar with the convention, and don’t like it, so I’ll have to look at a better way of naming this.  I
specificallydon’t like it because it makes a mess of using the const qualifier. 

Once again, thanks for the review!  I will work to get another version of this patch posted around the time I post
(separately)the command stats patch. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-11, Mark Dilger wrote:

> I thought about generating the files rather than merely checking them.
> I could see arguments both ways.  I wasn’t sure whether there would be
> broad support for having yet another perl script generating source
> files, nor for the maintenance burden of having to do that on all
> platforms.  Having a perl script that merely sanity checks the source
> files has the advantage that there is no requirement for it to
> function on all platforms.  There’s not even a requirement for it to
> be committed to the tree, since you could also argue that the
> maintenance burden of the script outweighs the burden of getting the
> source files right by hand.

No thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 11, 2020, at 12:50 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Feb-11, Mark Dilger wrote:
>
>> I thought about generating the files rather than merely checking them.
>> I could see arguments both ways.  I wasn’t sure whether there would be
>> broad support for having yet another perl script generating source
>> files, nor for the maintenance burden of having to do that on all
>> platforms.  Having a perl script that merely sanity checks the source
>> files has the advantage that there is no requirement for it to
>> function on all platforms.  There’s not even a requirement for it to
>> be committed to the tree, since you could also argue that the
>> maintenance burden of the script outweighs the burden of getting the
>> source files right by hand.
>
> No thanks.

I’m not sure which option you are voting for:

(Option #1) Have the perl script generate the .c and .h file from a .dat file

(Option #2) Have the perl script validate but not generate the .c and .h files

(Option #3) Have no perl script, with all burden on the programmer to get the .c and .h files right by hand.

I think you’re voting against #3, and I’m guessing you’re voting for #1, but I’m not certain.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-11, Mark Dilger wrote:

> > No thanks.
> 
> I’m not sure which option you are voting for:
> 
> (Option #1) Have the perl script generate the .c and .h file from a .dat file
> 
> (Option #2) Have the perl script validate but not generate the .c and .h files
> 
> (Option #3) Have no perl script, with all burden on the programmer to get the .c and .h files right by hand.
> 
> I think you’re voting against #3, and I’m guessing you’re voting for #1, but I’m not certain.

I was voting against #2 (burden the programmer with consistency checks
that must be fixed by hand, without actually doing the programmatically-
doable work), but I don't like #3 either.  I do like #1.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 11, 2020, at 1:02 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Feb-11, Mark Dilger wrote:
>
>>> No thanks.
>>
>> I’m not sure which option you are voting for:
>>
>> (Option #1) Have the perl script generate the .c and .h file from a .dat file
>>
>> (Option #2) Have the perl script validate but not generate the .c and .h files
>>
>> (Option #3) Have no perl script, with all burden on the programmer to get the .c and .h files right by hand.
>>
>> I think you’re voting against #3, and I’m guessing you’re voting for #1, but I’m not certain.
>
> I was voting against #2 (burden the programmer with consistency checks
> that must be fixed by hand, without actually doing the programmatically-
> doable work), but I don't like #3 either.  I do like #1.

Option #1 works for me.  If I don’t see any contrary votes before I get back to this patch, I’ll implement it that way
forthe next version. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 11, 2020, at 1:05 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>> On Feb 11, 2020, at 1:02 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2020-Feb-11, Mark Dilger wrote:
>>
>>>> No thanks.
>>>
>>> I’m not sure which option you are voting for:
>>>
>>> (Option #1) Have the perl script generate the .c and .h file from a .dat file
>>>
>>> (Option #2) Have the perl script validate but not generate the .c and .h files
>>>
>>> (Option #3) Have no perl script, with all burden on the programmer to get the .c and .h files right by hand.
>>>
>>> I think you’re voting against #3, and I’m guessing you’re voting for #1, but I’m not certain.
>>
>> I was voting against #2 (burden the programmer with consistency checks
>> that must be fixed by hand, without actually doing the programmatically-
>> doable work), but I don't like #3 either.  I do like #1.
>
> Option #1 works for me.  If I don’t see any contrary votes before I get back to this patch, I’ll implement it that
wayfor the next version. 

While investigating how best to implement option #1, I took a look at how Catalog.pm does it.

Catalog.pm reads data files and eval()s chunks of them to vivify perl data.

                # We're treating the input line as a piece of Perl, so we
                # need to use string eval here. Tell perlcritic we know what
                # we're doing.
                eval '$hash_ref = ' . $_;   ## no critic (ProhibitStringyEval)

This would only make sense to me if the string held in $_ had already been checked for safety, but Catalog.pm does very
littleto verify that the string is safe to eval.  The assumption here, so far as I can infer, is that we don’t embed
anythingdangerous in our .dat files, so this should be ok.  That may be true for the moment, but I can imagine a day
whenwe start embedding perl functions as quoted text inside a data file, or shell commands as text which look enough
likeperl for eval() to be able to execute them.  Developers who edit these .dat files and mess up the quoting, and then
rerun‘make’ to get the new .c and .h files generated, may not like the side effects.  Perhaps I’m being overly
paranoid…. 

Rather than add more code generation logic based on the design of Catalog.pm, I wrote a perl based data file parser
thatparses .dat files and returns vivified perl data, as Catalog.pm does, but with much stricter parsing logic to make
certainnothing dangerous gets eval()ed.  I put the new module in DataFile.pm.  The commandtag data has been
consolidatedinto a single .dat file.  A new perl script, gencommandtag.pl, parses the commandtag.dat file and generates
the.c and .h files.  So far, only gencommandtag.pl uses DataFile.pm, but I’ve checked that it can parse all the .dat
filescurrently in the source tree. 

The new parser is more flexible about the structure of the data, which seems good to me for making it easier to add or
modifydata files in the future.  The new parser does not yet have a means of hacking up the data to add autogenerated
fieldsand such that Catalog.pm does, but I think a more clean break between parsing and autovivifying fields would be
goodanyway.  If I get generally favorable reviews of DataFile.pm, I might go refactor Catalog.pm.  For now, I’m just
leavingCatalog.pm alone.  

> That script can then generate all
> the needed .c and .h files, which are not going to be part of the source
> tree, and will always be in-sync and won't have the formatting
> strictness about it.  And you won't have the Martian syntax you had to
> use in the commandtag.c file.

I still have the “Martian syntax”, though now it is generated by the perl script.  I can get rid of it, but I think
Andresliked the Martian stuff. 

> We don't adhere to that 100%, and some people dislike it, but I'd rather
> be consistent and not be passing "FooData *" around; it's just noisier.

I renamed QueryCompletionData to just QueryCompletion, and I’m passing pointers to that.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Вложения

Re: Portal->commandTag as an enum

От
John Naylor
Дата:
Hi Mark,

On Wed, Feb 19, 2020 at 10:40 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> This would only make sense to me if the string held in $_ had already been checked for safety, but Catalog.pm does
verylittle to verify that the string is safe to eval.  The assumption here, so far as I can infer, is that we don’t
embedanything dangerous in our .dat files, so this should be ok.  That may be true for the moment, but I can imagine a
daywhen we start embedding perl functions as quoted text inside a data file, or shell commands as text which look
enoughlike perl for eval() to be able to execute them.  Developers who edit these .dat files and mess up the quoting,
andthen rerun ‘make’ to get the new .c and .h files generated, may not like the side effects.  Perhaps I’m being overly
paranoid….

The use case for that seems slim. However, at a brief glance your
module seems more robust in other ways.

> Rather than add more code generation logic based on the design of Catalog.pm, I wrote a perl based data file parser
thatparses .dat files and returns vivified perl data, as Catalog.pm does, but with much stricter parsing logic to make
certainnothing dangerous gets eval()ed.  I put the new module in DataFile.pm. 
> [...]
> The new parser is more flexible about the structure of the data, which seems good to me for making it easier to add
ormodify data files in the future.  The new parser does not yet have a means of hacking up the data to add
autogeneratedfields and such that Catalog.pm does, but I think a more clean break between parsing and autovivifying
fieldswould be good anyway. 

Separation of concerns sounds like a good idea, but I've not fully
thought it through. For one advantage, I think it might be nicer to
have indexing.dat and toasting.dat instead of having to dig the info
out of C macros. This was evident while recently experimenting with
generating catcache control data.

As for the patch, I have not done a full review, but I have some
comments based on a light read-through:

utils/Makefile:

+# location of commandtag.dat
+headerdir = $(top_srcdir)/src/include/utils

This variable name is too generic for what the comment says it is. A
better abstraction, if we want one, would be the full path to the
commandtag input file. The other script invocations in this Makefile
don't do it this way, but that's a separate patch.

+# location to write generated headers
+sourcedir = $(top_srcdir)/src/backend/utils

Calling the output the source is bound to confuse people. The comment
implies all generated headers, not just the ones introduced by the
patch. I would just output to the current directory (i.e. have an
--output option with a default empty string). Also, if we want to
output somewhere else, I would imagine it'd be under the top builddir,
not srcdir.

+$(PERL) -I $(top_srcdir)/src/include/utils $<
--headerdir=$(headerdir) --sourcedir=$(sourcedir)
--inputfile=$(headerdir)/commandtag.dat

1. headerdir is entirely unused by the script
2. We can default to working dir for the output as mentioned above
3. -I $(top_srcdir)/src/include/utils is supposed to point to the dir
containing DataFile.pm, but since gencommandtag.pl has "use lib..."
it's probably not needed here. I don't recall why we keep the "-I"
elsewhere. (ditto in Solution.pm)

I'm thinking it would look something like this:

+$(PERL) $<  --inputfile=$(top_srcdir)/src/include/utils/commandtag.dat

--
utils/misc/Makefile

+all: distprep
+
 # Note: guc-file.c is not deleted by 'make clean',
 # since we want to ship it in distribution tarballs.
 clean:
  @rm -f lex.yy.c
+
+maintainer-clean: clean

Seems non-functional.

--
DataFiles.pm

I haven't studied this in detail, but I'll suggest that if this meant
to have wider application, maybe it should live in src/tools ?

I'm not familiar with using different IO routines depending on the OS
-- what's the benefit of that?

--
gencommandtag.pl

slurp_without_comments() is unused.

sanity_check_data() seems longer than the main body of the script
(minus header boilerplate), and I wonder if we can pare it down some.
For one, I have difficulty imagining anyone would accidentally type an
unprintable or non-ascii character in a command tag and somehow not
realize it. For another, duplicating checks that were done earlier
seems like a maintenance headache.

dataerror() is defined near the top, but other functions are defined
at the bottom.

+# Generate all output internally before outputting anything, to avoid
+# partially overwriting generated files under error conditions

My personal preference is, having this as a design requirement
sacrifices readability for unclear gain, especially since a "chunk"
also includes things like header boilerplate. That said, the script is
also short enough that it doesn't make a huge difference either way.
Speaking of boilerplate, it's better for readability to separate that
from actual code such as:

typedef enum CommandTag
{
#define FIRST_COMMANDTAG COMMANDTAG_$sorted[0]->{taglabel})

--
tcop/dest.c

+ * We no longer display LastOid, but to preserve the wire protocol,
+ * we write InvalidOid where the LastOid used to be written.  For
+ * efficiency in the snprintf(), hard-code InvalidOid as zero.

Hmm, is hard-coding zero going to make any difference here?

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 19, 2020, at 3:31 AM, John Naylor <john.naylor@2ndquadrant.com> wrote:
>
> Hi Mark,

Hi John, thanks for reviewing!

> On Wed, Feb 19, 2020 at 10:40 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> This would only make sense to me if the string held in $_ had already been checked for safety, but Catalog.pm does
verylittle to verify that the string is safe to eval.  The assumption here, so far as I can infer, is that we don’t
embedanything dangerous in our .dat files, so this should be ok.  That may be true for the moment, but I can imagine a
daywhen we start embedding perl functions as quoted text inside a data file, or shell commands as text which look
enoughlike perl for eval() to be able to execute them.  Developers who edit these .dat files and mess up the quoting,
andthen rerun ‘make’ to get the new .c and .h files generated, may not like the side effects.  Perhaps I’m being overly
paranoid….
>
> The use case for that seems slim. However, at a brief glance your
> module seems more robust in other ways.
>
>> Rather than add more code generation logic based on the design of Catalog.pm, I wrote a perl based data file parser
thatparses .dat files and returns vivified perl data, as Catalog.pm does, but with much stricter parsing logic to make
certainnothing dangerous gets eval()ed.  I put the new module in DataFile.pm. 
>> [...]
>> The new parser is more flexible about the structure of the data, which seems good to me for making it easier to add
ormodify data files in the future.  The new parser does not yet have a means of hacking up the data to add
autogeneratedfields and such that Catalog.pm does, but I think a more clean break between parsing and autovivifying
fieldswould be good anyway. 
>
> Separation of concerns sounds like a good idea, but I've not fully
> thought it through. For one advantage, I think it might be nicer to
> have indexing.dat and toasting.dat instead of having to dig the info
> out of C macros. This was evident while recently experimenting with
> generating catcache control data.

I guess you mean macros DECLARE_UNIQUE_INDEX and DECLARE_TOAST.  I don’t mind converting that to .dat files, though I’m
mindfulof Tom’s concern expressed early in this thread about the amount of code churn.  Is there sufficient demand for
refactoringthis stuff?  There are more reasons in the conversation below to refactor the perl modules and code
generationscripts. 

> As for the patch, I have not done a full review, but I have some
> comments based on a light read-through:
>
> utils/Makefile:
>
> +# location of commandtag.dat
> +headerdir = $(top_srcdir)/src/include/utils
>
> This variable name is too generic for what the comment says it is. A
> better abstraction, if we want one, would be the full path to the
> commandtag input file. The other script invocations in this Makefile
> don't do it this way, but that's a separate patch.
>
> +# location to write generated headers
> +sourcedir = $(top_srcdir)/src/backend/utils
>
> Calling the output the source is bound to confuse people. The comment
> implies all generated headers, not just the ones introduced by the
> patch. I would just output to the current directory (i.e. have an
> --output option with a default empty string). Also, if we want to
> output somewhere else, I would imagine it'd be under the top builddir,
> not srcdir.
>
> +$(PERL) -I $(top_srcdir)/src/include/utils $<
> --headerdir=$(headerdir) --sourcedir=$(sourcedir)
> --inputfile=$(headerdir)/commandtag.dat
>
> 1. headerdir is entirely unused by the script
> 2. We can default to working dir for the output as mentioned above
> 3. -I $(top_srcdir)/src/include/utils is supposed to point to the dir
> containing DataFile.pm, but since gencommandtag.pl has "use lib..."
> it's probably not needed here. I don't recall why we keep the "-I"
> elsewhere. (ditto in Solution.pm)
>
> I'm thinking it would look something like this:
>
> +$(PERL) $<  --inputfile=$(top_srcdir)/src/include/utils/commandtag.dat\

I have taken all this advice in v5 of the patch.  --inputfile and --outputdir (previously named --sourcedir) are now
optionalwith the defaults as you suggested. 

> --
> utils/misc/Makefile
>
> +all: distprep
> +
> # Note: guc-file.c is not deleted by 'make clean',
> # since we want to ship it in distribution tarballs.
> clean:
>  @rm -f lex.yy.c
> +
> +maintainer-clean: clean
>
> Seems non-functional.

Yeah, I also had an unnecessary addition to .gitignore in that directory.  I had originally placed the commandtag stuff
herebefore moving it one directory up.  Thanks for catching that. 

> --
> DataFiles.pm
>
> I haven't studied this in detail, but I'll suggest that if this meant
> to have wider application, maybe it should live in src/tools ?

We don’t seem to have a standard place for perl modules.  src/test/perl has some that are specifically for tap testing,
andsrc/backend/catalog has some for catalog data file processing.  I put DataFiles.pm in src/backend/catalog because
that’swhere most data file processing currently is located.  src/tools has PerfectHash.pm, and a bunch of Windows
specificmodules under src/tools/msvc. 

> I'm not familiar with using different IO routines depending on the OS
> -- what's the benefit of that?

I think you are talking about the slurp_file routine.  That came directly from the TestLib.pm module.  I don't have
enoughperl-on-windows experience to comment about why it does things that way.  I was reluctant to have DataFile.pm
'useTestLib', since DataFile has absolutely nothing to do with regression testing.  I don't like copying the function,
either,though I chose that as the lesser evil.  Which is more evil is debateable. 

src/test/perl/ contains SimpleTee.pm and RecursiveCopy.pm, neither of which contain functionality limited to just
testing. I think they could be moved to src/tools.  src/test/perl/TestLib.pm contains a mixture of testing specific
functionsand more general purpose functions.  For instance, TestLib.pm contains functions to read in a file or
directory(slurp_file(filepath) and slurp_dir(dirpath), respectively).  I think we should have just one implementation
ofthose in just one place.  Neither TestLib nor DataFile seem appropriate, nor does src/test/perl seem right.  I
checkedwhether Perl ships with core module support for this and didn't find anything.  There is a cpan module named
File::Slurp,but it is not a core module so far as I can tell, and it does more than we want. 

Should I submit a separate patch refactoring the location of perl modules and functions of general interest into
src/tools? src/tools/perl? 

I am not changing DataFile.pm's duplicate copy of slurp_file in v5 of the patch, as I don't yet know the best way to
approachthe problem.  I expect there will have to be a v6 once this has been adequately debated. 

> --
> gencommandtag.pl
>
> slurp_without_comments() is unused.

Right.  An earlier version of gencommandtag.pl didn't use DataFile.pm, and I neglected to remove this function when I
transitionedto using DataFile.pm.  Thanks for noticing! 

> sanity_check_data() seems longer than the main body of the script
> (minus header boilerplate), and I wonder if we can pare it down some.
> For one, I have difficulty imagining anyone would accidentally type an
> unprintable or non-ascii character in a command tag and somehow not
> realize it.

I'm uncertain about that.  There is logic in EndCommand in tcop/dest.c that specifically warns that no encoding
conversionwill be performed due to the assumption that command tags contain only 7-bit ascii.  I think that's a
perfectlyreasonable assumption in the C-code, but it needs to be checked by gencommandtag.pl because the bugs that
mightensue from inserting an accent character or whatever could be subtle enough to not be caught right away.  Such
mistakesonly get easier as time goes by, as the tendency for editors to change your quotes into "smart quotes" and such
getsmore common, and potentially as the assumption that PostgreSQL has been internationalized gets more common.
Hopefully,we're moving more and more towards supporting non-ascii in more and more places.  It might be less obvious to
acontributor some years hence that they cannot stick an accented character into a command tag.  (Compare, for example,
thatit used to be widely accepted that you shouldn't stick spaces and hyphens into file names, but now a fair number of
programmerswill do that without flinching.) 

As for checking for unprintable characters, the case is weaker.  I'm not too motivated to remove the check, though.

> For another, duplicating checks that were done earlier
> seems like a maintenance headache.

Hmmm.  As long as gencommandtag.pl is the only user of DataFile.pm, I'm inclined to agree that we're double-checking
somethings.  The code comments I wrote certainly say so.  But if DataFile.pm gets wider adoption, it might start to
acceptmore varied input, and then gencommandtag.pl will need to assert its own set of validation.  There is also the
distinctionbetween checking that the input data file meets the syntax requirements of the *parser* vs. making certain
thatthe vivified perl structures meet the semantic requirements of the *code generator*.  You may at this point be able
toassert that meeting the first guarantees meeting the second, but that can't be expected to hold indefinitely. 

It would be easier to decide these matters if we knew whether commandtag logic will ever be removed and whether
DataFilewill ever gain wider adoption for code generation purposes.... 

> dataerror() is defined near the top, but other functions are defined
> at the bottom.

Moved.

> +# Generate all output internally before outputting anything, to avoid
> +# partially overwriting generated files under error conditions
>
> My personal preference is, having this as a design requirement
> sacrifices readability for unclear gain, especially since a "chunk"
> also includes things like header boilerplate. That said, the script is
> also short enough that it doesn't make a huge difference either way.

Catalog.pm writes a temporary file and then moves it to the final file name at the end.  DataFile buffers the output
andonly writes it after all the code generation has succeeded.  There is no principled basis for these two modules
tacklingthe same problem in two different ways.  Perhaps that's another argument for pulling this kind of functionality
outof random places and consolidating it in one or more modules in src/tools. 

> Speaking of boilerplate, it's better for readability to separate that
> from actual code such as:
>
> typedef enum CommandTag
> {
> #define FIRST_COMMANDTAG COMMANDTAG_$sorted[0]->{taglabel})

Good idea.  While I was doing this, I also consolidated the duplicated boilerplate into just one function.  I think
thisfunction, too, should go in just one perl module somewhere.  See boilerplate_header() for details. 

> --
> tcop/dest.c
>
> + * We no longer display LastOid, but to preserve the wire protocol,
> + * we write InvalidOid where the LastOid used to be written.  For
> + * efficiency in the snprintf(), hard-code InvalidOid as zero.
>
> Hmm, is hard-coding zero going to make any difference here?

Part of the value of refactoring the commandtag logic is to make it easier to remove the whole ugly mess later.  Having
snprintfwrite the Oid into the string obfuscates the stupidity of what is really being done here.  Putting the zero
directlyinto the format string makes it clearer, to my eyes, that nothing clever is afoot. 

I have removed the sentence about efficiency.  Thanks for mentioning it.




—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Вложения

Re: Portal->commandTag as an enum

От
John Naylor
Дата:
On Thu, Feb 20, 2020 at 5:16 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Feb 19, 2020, at 3:31 AM, John Naylor <john.naylor@2ndquadrant.com> wrote:
> > thought it through. For one advantage, I think it might be nicer to
> > have indexing.dat and toasting.dat instead of having to dig the info
> > out of C macros. This was evident while recently experimenting with
> > generating catcache control data.
>
> I guess you mean macros DECLARE_UNIQUE_INDEX and DECLARE_TOAST.  I don’t mind converting that to .dat files, though
I’mmindful of Tom’s concern expressed early in this thread about the amount of code churn.  Is there sufficient demand
forrefactoring this stuff?  There are more reasons in the conversation below to refactor the perl modules and code
generationscripts. 

Yes, I was referring to those macros, but I did not mean to imply you
should convert them, either now or ever. I was just thinking out loud
about the possibility. :-)

That said, if we ever want Catalog.pm to delegate vivification to
DataFile.pm, there will eventually need to be a way to optionally
preserve comments and whitespace.

> I have taken all this advice in v5 of the patch.  --inputfile and --outputdir (previously named --sourcedir) are now
optionalwith the defaults as you suggested. 

+my $inputfile = '../../include/utils/commandtag.dat';

I think we should skip the default for the input file, since the
relative path is fragile and every such script I've seen requires it
to be passed in.

> > DataFiles.pm
> > [...]
> > I'm not familiar with using different IO routines depending on the OS
> > -- what's the benefit of that?
>
> I think you are talking about the slurp_file routine.  That came directly from the TestLib.pm module.  I don't have
enoughperl-on-windows experience to comment about why it does things that way. 

Yes, that one, sorry I wasn't explicit.

> Should I submit a separate patch refactoring the location of perl modules and functions of general interest into
src/tools? src/tools/perl? 

We may have accumulated enough disparate functionality by now to
consider this, but it sounds like PG14 material in any case.

> I expect there will have to be a v6 once this has been adequately debated.

So far I haven't heard any justification for why it should remain in
src/backend/catalog, when it has nothing to do with catalogs. We don't
have to have a standard other-place for there to be a better
other-place.

> > --
> > gencommandtag.pl

> > sanity_check_data() seems longer than the main body of the script
> > (minus header boilerplate), and I wonder if we can pare it down some.
> > For one, I have difficulty imagining anyone would accidentally type an
> > unprintable or non-ascii character in a command tag and somehow not
> > realize it.
> > [...]
> > For another, duplicating checks that were done earlier
> > seems like a maintenance headache.
>
> [ detailed rebuttals about the above points ]

Those were just examples that stuck out at me, so even if I were
convinced to your side on those, my broader point was: the sanity
check seems way over-engineered for something that spits out an enum
and an array. Maybe I'm not imaginative enough. I found it hard to
read in any case.

> Catalog.pm writes a temporary file and then moves it to the final file name at the end.  DataFile buffers the output
andonly writes it after all the code generation has succeeded.  There is no principled basis for these two modules
tacklingthe same problem in two different ways. 

Not the same problem. The temp files were originally for parallel Make
hazards, and now to prevent large portions of the backend to be
rebuilt. I actually think partially written files can be helpful for
debugging, so I don't even think it's a problem. But as I said, it
doesn't matter terribly much either way.

> > Speaking of boilerplate, it's better for readability to separate that
> > from actual code such as:
> >
> > typedef enum CommandTag
> > {
> > #define FIRST_COMMANDTAG COMMANDTAG_$sorted[0]->{taglabel})
>
> Good idea.  While I was doing this, I also consolidated the duplicated boilerplate into just one function.  I think
thisfunction, too, should go in just one perl module somewhere.  See boilerplate_header() for details. 

I like this a lot.

While thinking, I wonder if it makes sense to have a CodeGen module,
which would contain e.g. the new ParseData function,
FindDefinedSymbol, and functions for writing boilerplate.

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
John Naylor
Дата:
Thinking about this some more, would it be possible to treat these
like we do parser/kwlist.h? Something like this:

commandtag_list.h:
PG_COMMANDTAG(ALTER_ACCESS_METHOD, "ALTER ACCESS METHOD", true, false,
false, false)
...

then, just:

#define PG_COMMANDTAG(taglabel, tagname, event_trigger, table_rewrite,
display_rowcount, display_oid) label,

typedef enum CommandTag
{
#include "commandtag_list.h"
}

#undef PG_COMMANDTAG

...and then:

#define PG_COMMANDTAG(taglabel, tagname, event_trigger, table_rewrite,
display_rowcount, display_oid) \
{ tagname, event_trigger, table_rewrite, display_rowcount, display_oid },

const CommandTagBehavior tag_behavior[] =
{
#include "commandtag_list.h"
}

#undef PG_COMMANDTAG

I'm hand-waving a bit, and it doesn't have the flexibility of a .dat
file, but it's a whole lot simpler.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-21, John Naylor wrote:

> Thinking about this some more, would it be possible to treat these
> like we do parser/kwlist.h? Something like this:
> 
> commandtag_list.h:
> PG_COMMANDTAG(ALTER_ACCESS_METHOD, "ALTER ACCESS METHOD", true, false,
> false, false)
> ...

I liked this idea, so I'm halfway on it now.

> I'm hand-waving a bit, and it doesn't have the flexibility of a .dat
> file, but it's a whole lot simpler.

Yeah, I for one don't want to maintain the proposed DataFile.pm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-28, Alvaro Herrera wrote:

> On 2020-Feb-21, John Naylor wrote:
> 
> > Thinking about this some more, would it be possible to treat these
> > like we do parser/kwlist.h? Something like this:
> > 
> > commandtag_list.h:
> > PG_COMMANDTAG(ALTER_ACCESS_METHOD, "ALTER ACCESS METHOD", true, false,
> > false, false)
> > ...
> 
> I liked this idea, so I'm halfway on it now.

Here.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
I just realized that we could rename command_tag_display_last_oid() to
something like command_tag_print_a_useless_zero_for_historical_reasons() 
and nothing would be lost.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I just realized that we could rename command_tag_display_last_oid() to
> something like command_tag_print_a_useless_zero_for_historical_reasons() 
> and nothing would be lost.

Is there a way to drop that logic altogether by making the tagname string
be "INSERT 0" for the INSERT case?  Or would the zero bleed into other
places where we don't want it?

            regards, tom lane



Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-28, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > I just realized that we could rename command_tag_display_last_oid() to
> > something like command_tag_print_a_useless_zero_for_historical_reasons() 
> > and nothing would be lost.
> 
> Is there a way to drop that logic altogether by making the tagname string
> be "INSERT 0" for the INSERT case?  Or would the zero bleed into other
> places where we don't want it?

Hmm, interesting thought. But yeah, it would show up in ps display:

        commandTag = CreateCommandTag(parsetree->stmt);

        set_ps_display(GetCommandTagName(commandTag), false);

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 28, 2020, at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I just realized that we could rename command_tag_display_last_oid() to
>> something like command_tag_print_a_useless_zero_for_historical_reasons()
>> and nothing would be lost.
>
> Is there a way to drop that logic altogether by making the tagname string
> be "INSERT 0" for the INSERT case?  Or would the zero bleed into other
> places where we don't want it?

In general, I don't think we want to increase the number of distinct tags.  Which command you finished running and
whetheryou want a rowcount and/or lastoid are orthogonal issues.  We already have problems with there being different
commandtagsfor different versions of morally the same commands.  Take for example "SELECT FOR KEY SHARE" vs.  "SELECT
FORNO KEY UPDATE" vs.  "SELECT FOR SHARE" vs.  "SELECT FOR UPDATE". 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Tom Lane
Дата:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
>> On Feb 28, 2020, at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Is there a way to drop that logic altogether by making the tagname string
>> be "INSERT 0" for the INSERT case?  Or would the zero bleed into other
>> places where we don't want it?

> In general, I don't think we want to increase the number of distinct
> tags.  Which command you finished running and whether you want a
> rowcount and/or lastoid are orthogonal issues.

Well, my thought is that last_oid is gone and it isn't ever coming back.
So the less code we use supporting a dead feature, the better.

If we can't remove the special case in EndCommand() altogether, I'd be
inclined to hard-code it as "if (tag == CMDTAG_INSERT ..." rather than
expend infrastructure on treating last_oid as a live option for commands
to have.

            regards, tom lane



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Feb 28, 2020, at 5:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Mark Dilger <mark.dilger@enterprisedb.com> writes:
>>> On Feb 28, 2020, at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Is there a way to drop that logic altogether by making the tagname string
>>> be "INSERT 0" for the INSERT case?  Or would the zero bleed into other
>>> places where we don't want it?
>
>> In general, I don't think we want to increase the number of distinct
>> tags.  Which command you finished running and whether you want a
>> rowcount and/or lastoid are orthogonal issues.
>
> Well, my thought is that last_oid is gone and it isn't ever coming back.
> So the less code we use supporting a dead feature, the better.
>
> If we can't remove the special case in EndCommand() altogether, I'd be
> inclined to hard-code it as "if (tag == CMDTAG_INSERT ..." rather than
> expend infrastructure on treating last_oid as a live option for commands
> to have.

You may want to think about the embedding of InvalidOid into the EndCommand output differently from how you think about
theembedding of the rowcount into the EndCommand output, but my preference is to treat these issues the same and make a
strongdistinction between the commandtag and the embedded oid and/or rowcount.  It's hard to say how many future
featureswould be crippled by having the embedded InvalidOid in the commandtag, but as an example *right now* in the
works,we have a feature to count how many commands of a given type have been executed.  It stands to reason that
feature,whether accepted in its current form or refactored, would not want to show users a pg_stats table like this: 

   cnt   command
   ----   -------------
   5    INSERT 0
   37  SELECT

What the heck is the zero doing after the INSERT?  That's the hardcoded InvalidOid that you are apparently arguing for.
You could get around that by having the pg_sql_stats patch have its own separate set of command tag strings, but why
wouldwe intentionally design that sort of duplication into the solution? 

As for hardcoding the behavior of whether to embed a rowcount in the output from EndCommand; In
src/backend/replication/walsender.c,exec_replication_command() returns "SELECT" from EndCommand, and not "SELECT
$rowcount"like everywhere else.  The patch as submitted does not change behavior.  It only refactors the code while
preservingthe current behavior.  So we would have to agree that the patch can change how exec_replication_command()
behavesand start embedding a rowcount there, too, if we want to make SELECT behave the same everywhere. 

There is another problem, though, which is that if we're hoping to eventually abate this historical behavior and stop
embeddingInvalidOid and/or rowcount in the commandtag returned from EndCommand, it might be necessary (for backward
compatibilitywith clients) to do that incrementally, in which case we still need the distinction between commandtags
andformats to exist in the code.  How else can you say that, for example, in the next rev of the protocol that we're
notgoing to embed InvalidOid anymore, but we will continue to return it for clients who connect via the older protocol?
What if the next rev of the protocol still returns rowcount, but in a way that doesn't require the clients to implement
(orlink to) a parser that extracts the rowcount by parsing a string? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Feb-29, Mark Dilger wrote:

> You may want to think about the embedding of InvalidOid into the EndCommand output differently from how you think
aboutthe embedding of the rowcount into the EndCommand output, but my preference is to treat these issues the same and
makea strong distinction between the commandtag and the embedded oid and/or rowcount.  It's hard to say how many future
featureswould be crippled by having the embedded InvalidOid in the commandtag, but as an example *right now* in the
works,we have a feature to count how many commands of a given type have been executed.  It stands to reason that
feature,whether accepted in its current form or refactored, would not want to show users a pg_stats table like this:
 
> 
>    cnt   command
>    ----   -------------
>    5    INSERT 0
>    37  SELECT
>     
> What the heck is the zero doing after the INSERT?  That's the hardcoded InvalidOid that you are apparently arguing
for. You could get around that by having the pg_sql_stats patch have its own separate set of command tag strings, but
whywould we intentionally design that sort of duplication into the solution?
 

This is what I think Tom means to use in EndCommand:

            if (command_tag_display_rowcount(tag) && !force_undecorated_output)
                snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
                         tag == CMDTAG_INSERT ?
                         "%s 0 " UINT64_FORMAT : "%s " UINT64_FORMAT,
                         tagname, qc->nprocessed);
            else
                ... no rowcount ...

The point is not to change the returned tag in any way -- just to make
the method to arrive at it not involve the additional data column in the
data file, instead hardcode the behavior in EndCommand.  I don't
understand your point of pg_stats_sql having to deal with this in a
particular way.  How is that patch obtaining the command tags?  I would
hope it calls GetCommandTagName() rather than call CommandEnd, but maybe
I misunderstand where it hooks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Mar 2, 2020, at 8:12 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Feb-29, Mark Dilger wrote:
>
>> You may want to think about the embedding of InvalidOid into the EndCommand output differently from how you think
aboutthe embedding of the rowcount into the EndCommand output, but my preference is to treat these issues the same and
makea strong distinction between the commandtag and the embedded oid and/or rowcount.  It's hard to say how many future
featureswould be crippled by having the embedded InvalidOid in the commandtag, but as an example *right now* in the
works,we have a feature to count how many commands of a given type have been executed.  It stands to reason that
feature,whether accepted in its current form or refactored, would not want to show users a pg_stats table like this: 
>>
>>   cnt   command
>>   ----   -------------
>>   5    INSERT 0
>>   37  SELECT
>>
>> What the heck is the zero doing after the INSERT?  That's the hardcoded InvalidOid that you are apparently arguing
for. You could get around that by having the pg_sql_stats patch have its own separate set of command tag strings, but
whywould we intentionally design that sort of duplication into the solution? 
>
> This is what I think Tom means to use in EndCommand:
>
>            if (command_tag_display_rowcount(tag) && !force_undecorated_output)
>                snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
>                         tag == CMDTAG_INSERT ?
>                         "%s 0 " UINT64_FORMAT : "%s " UINT64_FORMAT,
>                         tagname, qc->nprocessed);
>            else
>                ... no rowcount ...
>
> The point is not to change the returned tag in any way -- just to make
> the method to arrive at it not involve the additional data column in the
> data file, instead hardcode the behavior in EndCommand.

Thanks, Álvaro, I think I get it now.  I thought Tom was arguing to have "INSERT 0" rather than "INSERT" be the
commandtag.

> I don't
> understand your point of pg_stats_sql having to deal with this in a
> particular way.  How is that patch obtaining the command tags?  I would
> hope it calls GetCommandTagName() rather than call CommandEnd, but maybe
> I misunderstand where it hooks.

My objection was based on my misunderstanding of what Tom was requesting.

I can rework the patch the way Tom suggests.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Mar-02, Mark Dilger wrote:

> > I don't
> > understand your point of pg_stats_sql having to deal with this in a
> > particular way.  How is that patch obtaining the command tags?  I would
> > hope it calls GetCommandTagName() rather than call CommandEnd, but maybe
> > I misunderstand where it hooks.
> 
> My objection was based on my misunderstanding of what Tom was requesting.
> 
> I can rework the patch the way Tom suggests.

I already did it :-)  Posting in a jiffy

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
Here's the patch I propose for commit.  I also rewrote the commit
message.

There are further refinements that can be done, but they don't need to
be in the first patch.  Notably, the event trigger code can surely do a
lot better now by translating the tag list to a bitmapset earlier.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Mar-02, Alvaro Herrera wrote:

> Here's the patch I propose for commit.  I also rewrote the commit
> message.

BTW I wonder if we should really change the definition of
EventTriggerData.  ISTM that it would be sensible to keep it the same
for now ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Mar 2, 2020, at 9:08 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Mar-02, Alvaro Herrera wrote:
>
>> Here's the patch I propose for commit.  I also rewrote the commit
>> message.
>
> BTW I wonder if we should really change the definition of
> EventTriggerData.  ISTM that it would be sensible to keep it the same
> for now ...

I think it is more natural to change event trigger code to reason natively about CommandTags rather than continuing to
reasonabout strings.  The conversion back-and-forth between the enum and the string representation serves no useful
purposethat I can see.  But I understand if you are just trying to have the patch change fewer parts of the code, and
ifyou feel more comfortable about reverting that part of the patch, as the committer, I think that's your prerogative. 

Did you want to do that yourself, or have me do it and resubmit?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Mar-02, Mark Dilger wrote:

> I think it is more natural to change event trigger code to reason
> natively about CommandTags rather than continuing to reason about
> strings.  The conversion back-and-forth between the enum and the
> string representation serves no useful purpose that I can see.  But I
> understand if you are just trying to have the patch change fewer parts
> of the code, and if you feel more comfortable about reverting that
> part of the patch, as the committer, I think that's your prerogative.

Nah -- after reading it again, that would make no sense.  With the
change, we're forcing all writers of event trigger functions in C to
adapt their functions to the new API, but that's okay -- I don't expect
there will be many, and we're doing other things to the API anyway.

I pushed it now.

I made very small changes before pushing: notably, I removed the
InitializeQueryCompletion() call from standard_ProcessUtility; instead
its callers are supposed to do it.  Updated ProcessUtility's comment to
that effect.

Also, the affected plancache.c functions (CreateCachedPlan and
CreateOneShotCachedPlan) had not had their comments updated.  Previously
they received compile-time constants, but that was important because
they were strings.  No longer.

I noticed some other changes that could perhaps be made here, but didn't
do them; for instance, in pg_stat_statement we have a comparison to
CMDTAG_COPY that seems pointless; we could just acquire the value
always.  I left it alone for now but I think the change is without side
effects (notably so because most actual DML does not go through
ProcessUtility anyway).  Also, event_trigger.c could resolve command
strings to the tag enum earlier.

There's also a lot of nonsense in the pquery.c functions, such as this,

                /*
                 * Now fetch desired portion of results.
                 */
                nprocessed = PortalRunSelect(portal, true, count, dest);

                /*
                 * If the portal result contains a command tag and the caller
                 * gave us a pointer to store it, copy it and update the
                 * rowcount.
                 */
                if (qc && portal->qc.commandTag != CMDTAG_UNKNOWN)
                {
                    CopyQueryCompletion(qc, &portal->qc);
                    qc->nprocessed = nprocessed;
                }

I think we could simplify that by passing the qc.

Similar consideration with DoCopy; instead of a 'uint64 nprocessed' we
could have a *qc to fill in and avoid this bit of silliness,

                DoCopy(pstate, (CopyStmt *) parsetree,
                       pstmt->stmt_location, pstmt->stmt_len,
                       &processed);
                if (qc)
                    SetQueryCompletion(qc, CMDTAG_COPY, processed);

I see no reason to have PortalRun() initialize the qc; ISTM that its
callers should do so.

And so on.

Nothing of that is critical.

Thanks for your patch,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Mar 2, 2020, at 1:57 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Mar-02, Mark Dilger wrote:
>
>> I think it is more natural to change event trigger code to reason
>> natively about CommandTags rather than continuing to reason about
>> strings.  The conversion back-and-forth between the enum and the
>> string representation serves no useful purpose that I can see.  But I
>> understand if you are just trying to have the patch change fewer parts
>> of the code, and if you feel more comfortable about reverting that
>> part of the patch, as the committer, I think that's your prerogative.
>
> Nah -- after reading it again, that would make no sense.  With the
> change, we're forcing all writers of event trigger functions in C to
> adapt their functions to the new API, but that's okay -- I don't expect
> there will be many, and we're doing other things to the API anyway.
>
> I pushed it now.

Thanks!  I greatly appreciate your time.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Mark Dilger
Дата:

> On Mar 2, 2020, at 1:57 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> I pushed it now.

Thanks again!  While rebasing some other work on top, I noticed one of your comments is out of date:

--- a/src/include/tcop/cmdtaglist.h
+++ b/src/include/tcop/cmdtaglist.h
@@ -23,7 +23,7 @@
  * textual name, so that we can bsearch on it; see GetCommandTagEnum().
  */

-/* symbol name, textual name, event_trigger_ok, table_rewrite_ok, rowcount, last_oid */
+/* symbol name, textual name, event_trigger_ok, table_rewrite_ok, rowcount */
 PG_CMDTAG(CMDTAG_UNKNOWN, "???", false, false, false)
 PG_CMDTAG(CMDTAG_ALTER_ACCESS_METHOD, "ALTER ACCESS METHOD", true, false, false)
 PG_CMDTAG(CMDTAG_ALTER_AGGREGATE, "ALTER AGGREGATE", true, false, false)


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Portal->commandTag as an enum

От
Alvaro Herrera
Дата:
On 2020-Mar-04, Mark Dilger wrote:

> 
> 
> > On Mar 2, 2020, at 1:57 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > 
> > I pushed it now.
> 
> Thanks again!  While rebasing some other work on top, I noticed one of your comments is out of date:
> 
> --- a/src/include/tcop/cmdtaglist.h
> +++ b/src/include/tcop/cmdtaglist.h
> @@ -23,7 +23,7 @@
>   * textual name, so that we can bsearch on it; see GetCommandTagEnum().
>   */
>  
> -/* symbol name, textual name, event_trigger_ok, table_rewrite_ok, rowcount, last_oid */
> +/* symbol name, textual name, event_trigger_ok, table_rewrite_ok, rowcount */

Oops.  Pushed, thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services