Re: Support logical replication of DDLs
От | Peter Smith |
---|---|
Тема | Re: Support logical replication of DDLs |
Дата | |
Msg-id | CAHut+PtzpuuRFrLnjkQePq296ip_0WfmQ4CtydM9JDR6gEf=Qw@mail.gmail.com обсуждение исходный текст |
Ответ на | RE: Support logical replication of DDLs ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>) |
Список | pgsql-hackers |
Hi, here are some review comments for the patch 0002-2023_04_07-2 Note: This is a WIP review. The patch is quite large and I have managed to only look at ~50% of it. I will continue reviewing this same 0002-2023_04_07-2 and send more comments at a later time. Meanwhile, here are the review comments I have so far... ====== General 1. Field/Code order I guess it makes little difference but it was a bit disconcerting that the new DDL field member is popping up in all different order everywhere. e.g. In pg_publication.h, FormData_pg_publication comes last e.g. In describe.c: it comes immediately after the "All Tables" column e.g. In pg_publication.c, GetPublication: it comes after truncated and before viaroot. IMO it is better to try to keep the same consistent order everywhere unless there is some reason not to. ~~~ 2. Inconsistent acronym case Use consistent uppercase for JSON and DDL instead of sometimes json and ddl. There are quite a few random examples in the commit message but might be worth searching the entire patch to make all comments use consistent case. ====== src/backend/replication/logical/proto.c 3. logicalrep_read_ddl +/* + * Read DDL MESSAGE from stream + */ +char * +logicalrep_read_ddl(StringInfo in, XLogRecPtr *lsn, + const char **prefix, + Size *sz) Should this just say "Read DDL from stream"? (It matches the function name better, and none of the other Read XXX say Read XXX MESSAGE) Alternatively, maybe that comment is correct, but in that case, perhaps change the function name --> logicalrep_read_ddl_message(). ~~~~ 4. logicalrep_write_ddl +/* + * Write DDL MESSAGE to stream + */ +void +logicalrep_write_ddl(StringInfo out, XLogRecPtr lsn, + const char *prefix, Size sz, const char *message) Ditto previous review comment #3 ====== src/backend/tcop/cmdtag.c 5. GetCommandTagsForDDLRepl +CommandTag * +GetCommandTagsForDDLRepl(int *ncommands) +{ + CommandTag *ddlrepl_commands = palloc0(COMMAND_TAG_NEXTTAG * sizeof(CommandTag)); + *ncommands = 0; + + for(int i = 0; i < COMMAND_TAG_NEXTTAG; i++) + { + if (tag_behavior[i].ddl_replication_ok) + ddlrepl_commands[(*ncommands)++] = (CommandTag) i; + } + + return ddlrepl_commands; +} 5a. I felt that maybe it would be better to iterate using CommandTag enums instead of int indexes. ~ 5b. I saw there is another code fragment in GetCommandTagEnum() that uses lengthof(tag_behaviour) instead of checking the COMMAND_TAG_NEXTTAG. It might be more consistent to use similar code here too. Something like: const int ntags = lengthof(tag_behavior) - 1; CommandTag *ddlrepl_commands = palloc0(ntags * sizeof(CommandTag)); *ncommands = 0; for(CommandTag tag = 0; i < nTags; tag++) if (tag_behavior[tag].ddl_replication_ok) ddlrepl_commands[(*ncommands)++] = tag; ====== src/bin/pg_dump/pg_dump.c 6. @@ -4213,7 +4224,10 @@ dumpPublication(Archive *fout, const PublicationInfo *pubinfo) first = false; } - appendPQExpBufferChar(query, '\''); + appendPQExpBufferStr(query, "'"); + + if (pubinfo->pubddl_table) + appendPQExpBufferStr(query, ", ddl = 'table'"); The change from appendPQExpBufferChar to appendPQExpBufferStr did not seem a necessary part of this patch. ~~~ 7. getPublicationEventTriggers +/* + * getPublicationEventTriggers + * get the publication event triggers that should be skipped + */ +static void +getPublicationEventTriggers(Archive *fout, SimpleStringList *skipTriggers) Given the way this function is invoked, it might be more appropriate to name it like getEventTriggersToBeSkipped(), with the comment saying that for now we just we skip the PUBLICATION DDL event triggers. ~~~ 8. getEventTriggers /* Decide whether we want to dump it */ - selectDumpableObject(&(evtinfo[i].dobj), fout); + if (simple_string_list_member(&skipTriggers, evtinfo[i].evtname)) + evtinfo[i].dobj.dump= DUMP_COMPONENT_NONE; + else + selectDumpableObject(&(evtinfo[i].dobj), fout); } + simple_string_list_destroy(&skipTriggers); + 8a. Missing whitespace before '=' ~ 8b. Scanning a list within a loop may not be efficient. I suppose this code must be assuming that there are not 1000s of publications, and therefore the skipTriggers string list will be short enough to ignore such inefficiency concerns. IMO a simpler alternative be to throw away the getPublicationEventTriggers() and the list scanning logic, and instead simply skip any event triggers where the name starts with PUB_EVENT_TRIG_PREFIX (e.g. the correct prefix, not the current code one -- see other review comment for pg_publication.h). Are there any problems doing it that way? ====== src/bin/pg_dump/t/002_pg_dump.pl 9. create_sql => 'CREATE PUBLICATION pub2 FOR ALL TABLES - WITH (publish = \'\');', + WITH (publish = \'\', ddl = \'\');', regexp => qr/^ \QCREATE PUBLICATION pub2 FOR ALL TABLES WITH (publish = '');\E 9a. I was not sure of the purpose of this test. Is it for showing that ddl='' is equivalent to not having any ddl option? ~ 9b. Missing test cases for dumping other values? e.g. ddl='table' ====== src/bin/psql/describe.c 10. listPublications printfPQExpBuffer(&buf, "SELECT pubname AS \"%s\",\n" " pg_catalog.pg_get_userbyid(pubowner) AS \"%s\",\n" - " puballtables AS \"%s\",\n" - " pubinsert AS \"%s\",\n" - " pubupdate AS \"%s\",\n" - " pubdelete AS \"%s\"", + " puballtables AS \"%s\"", gettext_noop("Name"), gettext_noop("Owner"), - gettext_noop("All tables"), + gettext_noop("All tables")); + if (pset.sversion >= 160000) + appendPQExpBuffer(&buf, + ",\n pubddl_table AS \"%s\"", + gettext_noop("Table DDLs")); + appendPQExpBuffer(&buf, + ",\n pubinsert AS \"%s\",\n" + " pubupdate AS \"%s\",\n" + " pubdelete AS \"%s\"", gettext_noop("Inserts"), gettext_noop("Updates"), gettext_noop("Deletes")) 10a. IMO the \n and ',' are strangely positioned. I thought they should be consistently at the ends of the strings e.g. " puballtables AS \"%s\",\n" e.g. " pubddl_table AS \"%s\",\n" ~ 10b. IIUC this DDL for tables is only the first of the kinds of values for this new option might take. So, maybe it is better to name the column "DDL tables" so that later when there are more kinds of DDL at least the column names will all consistently start with "DDL" ~~~ 11. listPublications appendPQExpBuffer(&buf, ",\n pubviaroot AS \"%s\"", gettext_noop("Via root")); - appendPQExpBufferStr(&buf, "\nFROM pg_catalog.pg_publication\n"); This whitespace change seems unrelated to the patch. ~~~ 12. describePublications printfPQExpBuffer(&buf, "SELECT oid, pubname,\n" " pg_catalog.pg_get_userbyid(pubowner) AS owner,\n" - " puballtables, pubinsert, pubupdate, pubdelete"); + " puballtables"); + if (has_pubddl) + appendPQExpBufferStr(&buf, + ", pubddl_table"); + appendPQExpBufferStr(&buf, + ", pubinsert, pubupdate, pubdelete"); Because the insert/update/delete are not optional columns I felt there is no reason to put the commas (,) at the beginning of these strings. It would be simpler to put them at the end as usual: e.g. " puballtables" --> "puballtables, " e.g. ", pubddl_table" --> "pubddl_table, " e.g. ", pubinsert, pubupdate, pubdelete" --> "pubinsert, pubupdate, pubdelete" ====== src/include/catalog/pg_publication.h 13. +/* Publication trigger events */ +#define PUB_TRIG_EVENT1 "ddl_command_end" +#define PUB_TRIG_EVENT2 "ddl_command_start" +#define PUB_TRIG_EVENT3 "table_rewrite" +#define PUB_TRIG_EVENT4 "table_init_write" These seemed overly generic macro names. Would names like below make their usage more readable? #define PUB_TRIG_DDL_CMD_START "ddl_command_end" #define PUB_TRIG_DDL_CMD_END "ddl_command_start" #define PUB_TRIG_TBL_REWRITE "table_rewrite" #define PUB_TRIG_TBL_INIT_WRITE "table_init_write" ~~~ 14. +/* Publication event trigger prefix */ +#define PUB_EVENT_TRIG_PREFIX "pg_deparse_trig_%s_%u" But this is not even a prefix; it is a format string! I think better macros might be like: #define PUB_EVENT_TRIG_PREFIX "pg_deparse_trig" #define PUB_EVENT_TRIG_FORMAT "pg_deparse_trig_%s_%u" ====== src/include/replication/ddlmessage.h 15. +typedef enum DeparsedCommandType +{ + DCT_SimpleCmd, + DCT_TableDropStart, + DCT_TableDropEnd, + DCT_TableAlter, + DCT_ObjectCreate, + DCT_ObjectDrop +} DeparsedCommandType; Better to be in alphabetical order? ~~~ 16. +typedef struct xl_logical_ddl_message Missing from typedefs.list? ====== src/include/replication/output_plugin.h 17. +/* + * Called for the logical decoding DDL messages. + */ +typedef void (*LogicalDecodeDDLMessageCB) (struct LogicalDecodingContext *ctx, + ReorderBufferTXN *txn, + XLogRecPtr message_lsn, + const char *prefix, + Oid relid, + DeparsedCommandType cmdtype, + Size message_size, + const char *message); + Should that comment say "Callback for" instead of "Called for"? ====== src/include/replication/reorderbuffer.h 18. ReorderBufferChangeType @@ -65,6 +67,7 @@ typedef enum ReorderBufferChangeType REORDER_BUFFER_CHANGE_INSERT, REORDER_BUFFER_CHANGE_UPDATE, REORDER_BUFFER_CHANGE_DELETE, + REORDER_BUFFER_CHANGE_DDL, REORDER_BUFFER_CHANGE_MESSAGE, In other code changes of this patch the new DDL generally seemed to come *after* the MESSAGE stuff. So probably this should too just for consistency if no other reason. ====== src/include/tcop/cmdtag.h 19. typedef enum CommandTag { #include "tcop/cmdtaglist.h" COMMAND_TAG_NEXTTAG } CommandTag; I know it is not part of this patch, but IMO it will be an improvement to rename that last enum (COMMAND_TAG_NEXTTAG) to a name like NUM_COMMAND_TAGS. This enum wasn't used much before, but now in this patch, you are using it within the new function like GetCommandTagsForDDLRepl() so keeping the current enum name COMMAND_TAG_NEXTTAG with that usage looked strange. Alternatively, leave this alone but change GetCommandTagsForDDLRepl() so that it does not even refer to this enum value. See other review comment #5b ====== src/include/tcop/cmdtaglist.h 20. -PG_CMDTAG(CMDTAG_VACUUM, "VACUUM", false, false, false) +/* symbol name, textual name, event_trigger_ok, table_rewrite_ok, rowcount, ddlreplok */ +v(CMDTAG_UNKNOWN, "???", false, false, false, false) Although these are not strictly the same as the PG_CMDTAG macro arg names, it might be better in this comment to call this "ddl_repl_ok" instead of "ddlreplok" for consistency with the rest of the comment. ====== src/test/regress/expected/publication.out 21. The \dRp+ now exposes a new column called "Table DDLS" I expected to see some tests for t/f values but I did not find any test where the expected output for this column was 't'. ====== src/tools/pgindent/typedefs.list 22. LogicalDecodeCommitPreparedCB +LogicalDecodeDDLMessageCB +LogicalDecodeStreamDDLMessageCB LogicalDecodeFilterByOriginCB The alphabetical order is not correct for LogicalDecodeStreamDDLMessageCB ~~~ 23. ReorderBufferCommitPreparedCB +ReorderBufferDDLMessageCB +ReorderBufferStreamDDLMessageCB ReorderBufferDiskChange The alphabetical order is not correct for ReorderBufferStreamDDLMessageCB ------ Kind Regards, Peter Smith. Fujitsu Australia
В списке pgsql-hackers по дате отправления:
Следующее
От: Kyotaro HoriguchiДата:
Сообщение: Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply