RE: speed up a logical replica setup
От | Hayato Kuroda (Fujitsu) |
---|---|
Тема | RE: speed up a logical replica setup |
Дата | |
Msg-id | TY3PR01MB98896B904BB259B559622773F5422@TY3PR01MB9889.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: speed up a logical replica setup ("Euler Taveira" <euler@eulerto.com>) |
Ответы |
Re: speed up a logical replica setup
(Shubham Khanna <khannashubham1197@gmail.com>)
RE: speed up a logical replica setup ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) Re: speed up a logical replica setup (Shlok Kyal <shlok.kyal.oss@gmail.com>) Re: speed up a logical replica setup ("Euler Taveira" <euler@eulerto.com>) |
Список | pgsql-hackers |
Dear Euler, Thanks for updating the patch! > I'm still working on the data structures to group options. I don't like the way it was grouped in v13-0005. There is too many levels to reach database name. The setup_subscriber() function requires the 3 data structures. > Right, your refactoring looks fewer stack. So I pause to revise my refactoring patch. > The documentation update is almost there. I will include the modifications in the next patch. > OK. I think it should be modified before native speakers will attend to the thread. > Regarding v13-0004, it seems a good UI that's why I wrote a comment about it. However, it comes with a restriction that requires a similar HBA rule for both regular and replication connections. Is it an acceptable restriction? We might paint ourselves into the corner. A reasonable proposal is not to remove this option. Instead, it should be optional. If it is not provided, primary_conninfo is used. > I didn't have such a point of view. However, it is not related whether -P exists or not. Even v14-0001 requires primary to accept both normal/replication connections. If we want to avoid it, the connection from pg_createsubscriber can be restored to replication-connection. (I felt we do not have to use replication protocol even if we change the connection mode) The motivation why -P is not needed is to ensure the consistency of nodes. pg_createsubscriber assumes that the -P option can connect to the upstream node, but no one checks it. Parsing two connection strings may be a solution but be confusing. E.g., what if some options are different? I think using a same parameter is a simplest solution. And below part contains my comments for v14. 01. ``` char temp_replslot[NAMEDATALEN] = {0}; ``` I found that no one refers the name of temporary slot. Can we remove the variable? 02. ``` CreateSubscriberOptions opt; ... memset(&opt, 0, sizeof(CreateSubscriberOptions)); /* Default settings */ opt.subscriber_dir = NULL; opt.pub_conninfo_str = NULL; opt.sub_conninfo_str = NULL; opt.database_names = (SimpleStringList) { NULL, NULL }; opt.retain = false; opt.recovery_timeout = 0; ``` Initialization by `CreateSubscriberOptions opt = {0};` seems enough. All values are set to 0x0. 03. ``` /* * Is the standby server ready for logical replication? */ static bool check_subscriber(LogicalRepInfo *dbinfo) ``` You said "target server must be a standby" in [1], but I cannot find checks for it. IIUC, there are two approaches: a) check the existence "standby.signal" in the data directory b) call an SQL function "pg_is_in_recovery" 04. ``` static char *pg_ctl_path = NULL; static char *pg_resetwal_path = NULL; ``` I still think they can be combined as "bindir". 05. ``` /* * Write recovery parameters. ... WriteRecoveryConfig(conn, opt.subscriber_dir, recoveryconfcontents); ``` WriteRecoveryConfig() writes GUC parameters to postgresql.auto.conf, but not sure it is good. These settings would remain on new subscriber even after the pg_createsubscriber. Can we avoid it? I come up with passing these parameters via pg_ctl -o option, but it requires parsing output from GenerateRecoveryConfig() (all GUCs must be allign like "-c XXX -c XXX -c XXX..."). 06. ``` static LogicalRepInfo *store_pub_sub_info(SimpleStringList dbnames, const char *pub_base_conninfo, const char *sub_base_conninfo); ... static void modify_subscriber_sysid(const char *pg_resetwal_path, CreateSubscriberOptions opt); ... static void wait_for_end_recovery(const char *conninfo, CreateSubscriberOptions opt); ``` Functions arguments should not be struct because they are passing by value. They should be a pointer. Or, for modify_subscriber_sysid and wait_for_end_recovery, we can pass a value which would be really used. 07. ``` static char *get_base_conninfo(char *conninfo, char *dbname, const char *noderole); ``` Not sure noderole should be passed here. It is used only for the logging. Can we output string before calling the function? (The parameter is not needed anymore if -P is removed) 08. The terminology is still not consistent. Some functions call the target as standby, but others call it as subscriber. 09. v14 does not work if the standby server has already been set recovery_target* options. PSA the reproducer. I considered two approaches: a) raise an ERROR when these parameter were set. check_subscriber() can do it b) overwrite these GUCs as empty strings. 10. The execution always fails if users execute --dry-run just before. Because pg_createsubscriber stops the standby anyway. Doing dry run first is quite normal use-case, so current implementation seems not user-friendly. How should we fix? Below bullets are my idea: a) avoid stopping the standby in case of dry_run: seems possible. b) accept even if the standby is stopped: seems possible. c) start the standby at the end of run: how arguments like pg_ctl -l should be specified? My top-up patches fixes some issues. v15-0001: same as v14-0001 === experimental patches === v15-0002: Use replication connections when we connects to the primary. Connections to standby is not changed because the standby/subscriber does not require such type of connection, in principle. If we can accept connecting to subscriber with replication mode, this can be simplified. v15-0003: Remove -P and use primary_conninfo instead. Same as v13-0004 v15-0004: Check whether the target is really standby. This is done by pg_is_in_recovery() v15-0005: Avoid stopping/starting standby server in dry_run mode. I.e., approach a). in #10 is used. v15-0006: Overwrite recovery parameters. I.e., aproach b). in #9 is used. [1]: https://www.postgresql.org/message-id/b315c7da-7ab1-4014-a2a9-8ab6ae26017c%40app.fastmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/global/
Вложения
- v15-0004-Check-whether-the-target-is-really-standby.patch
- v15-0001-Creates-a-new-logical-replica-from-a-standby-ser.patch
- v15-0002-Use-replication-connection-when-we-connect-to-th.patch
- v15-0003-Remove-P-and-use-primary_conninfo-instead.patch
- v15-0005-Avoid-stopping-starting-standby-server-in-dry_ru.patch
- v15-0006-Overwrite-recovery-parameters.patch
В списке pgsql-hackers по дате отправления: