Re: speed up a logical replica setup
From | Tomas Vondra |
---|---|
Subject | Re: speed up a logical replica setup |
Date | |
Msg-id | 6423dfeb-a729-45d3-b71e-7bf1b3adb0c9@enterprisedb.com |
In response to | Re: speed up a logical replica setup ("Euler Taveira" <euler@eulerto.com>) |
Responses |
RE: speed up a logical replica setup
("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Re: speed up a logical replica setup (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
Hi,

I decided to take a quick look at this patch today, to see how it works and do some simple tests. I've only started to get familiar with it, so I have only some comments / questions regarding usage, not on the code. It's quite possible I didn't understand some finer points, or maybe it was already discussed earlier in this very long thread, so please feel free to push back or point me to the past discussion.

Also, some of this is rather opinionated, but considering I hadn't seen this patch before, my opinions may easily be wrong ...

1) SGML docs

It seems the SGML docs are more about explaining how this works on the inside, rather than how to use the tool. Maybe that's intentional, but as someone who hasn't worked with pg_createsubscriber before, I found it confusing and not very helpful.

For example, the first half of the page is prerequisites + warnings. Sure, those are useful details, but the prerequisites are checked by the tool (so I can't really miss them), and the warnings go into a lot of detail about different places where things may go wrong. Sure, worth knowing and including in the docs, but maybe not right at the beginning, before I learn how to even run the tool?

Maybe that's just me, though. Also, I'm sure it's not the only part of our docs like this. Perhaps it'd be good to reorganize the content a bit to make the "how to use" stuff more prominent?

2) this is a bit vague

  ... pg_createsubscriber will check a few times if the connection has been reestablished to stream the required WAL. After a few attempts, it terminates with an error.

What does "a few times" mean, and how many is "a few attempts"? Seems worth knowing when using this tool in environments where disconnections can happen. Maybe this should be configurable?

3) publication info

For a while I was quite confused about which tables get replicated, until I realized the publication is FOR ALL TABLES. But I learned that from this thread; the docs say nothing about this. Surely that's an important detail that should be mentioned?

4) Is FOR ALL TABLES a good idea?

I'm not sure FOR ALL TABLES is a good idea. Or said differently, I'm sure it won't work for a number of use cases. I know that in large databases it's common to create "work tables" (not necessarily temporary) as part of a batch job, but there's no need to replicate those tables.

AFAIK that'd break this FOR ALL TABLES publication, because the tables will qualify for replication, but won't be present on the subscriber. Or did I miss something?

I do understand that FOR ALL TABLES is the simplest approach, and for v1 it may be an acceptable limitation, but maybe it'd be good to also support restricting which tables should be replicated (e.g. blacklist or whitelist based on table/schema name?).

BTW if I'm right and creating a table breaks the subscriber creation, maybe it'd be good to explicitly mention that in the docs.

Note: I now realize this might fall under the warning about DDL, which says this:

  Executing DDL commands on the source server while running pg_createsubscriber is not recommended. If the target server has already been converted to logical replica, the DDL commands must not be replicated so an error would occur.

But I find this confusing. Surely there are many DDL commands that have absolutely no impact on logical replication (like creating an index or view, various ALTER TABLE flavors, and so on). And running such DDL certainly does not trigger an error, right?

5) slot / publication / subscription name

I find it somewhat annoying that it's not possible to specify names for the objects created by the tool - replication slots, publications and subscriptions. If this is meant to be a replica running for a while, after a while I'll have no idea what pg_createsubscriber_569853 or pg_createsubscriber_459548_2348239 was meant for.

This is particularly annoying because renaming these objects later is either not supported at all (e.g. for replication slots), or may be quite difficult (e.g. publications).

I do realize there are challenges with custom names (say, if there are multiple databases to replicate), but can't we support some simple formatting with basic placeholders? So we could specify

  --slot-name "myslot_%d_%p"

or something like that?

BTW what will happen if we convert multiple standbys? Can't they all get the same slot name (they all have the same database OID, and I'm not sure how much entropy the PID has)?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
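
The placeholder idea from point 5 can be made concrete with a small sketch. This is not code from the patch, and `--slot-name` with placeholders is only a suggestion in the message above. The sketch assumes that `%d` stands for the database OID and `%p` for the PID of the tool (the message does not define them), and the helper names `expand_name_template` and `is_valid_slot_name` are hypothetical. It also shows the collision concern: two runs that see the same OID and the same PID would produce the same name.

```python
import re

# PostgreSQL identifiers are limited to NAMEDATALEN - 1 bytes (63 by default).
MAX_IDENTIFIER_BYTES = 63

# Replication slot names may contain only lower-case letters, digits and "_".
SLOT_NAME_RE = re.compile(r"[a-z0-9_]+")


def expand_name_template(template, db_oid, pid):
    """Expand %d, %p and %% in a name template (hypothetical helper).

    Assumption: %d is the database OID and %p is the PID of the tool.
    """
    values = {"d": str(db_oid), "p": str(pid), "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        # A "%" must be followed by one of the known placeholder letters.
        if i + 1 >= len(template) or template[i + 1] not in values:
            raise ValueError(f"bad placeholder at offset {i} in {template!r}")
        out.append(values[template[i + 1]])
        i += 2
    name = "".join(out)
    if len(name.encode("utf-8")) > MAX_IDENTIFIER_BYTES:
        raise ValueError(f"expanded name {name!r} exceeds {MAX_IDENTIFIER_BYTES} bytes")
    return name


def is_valid_slot_name(name):
    """Check the character rules for replication slot names."""
    return SLOT_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Example values only: 16384 and 4242 are made up for illustration.
    first = expand_name_template("myslot_%d_%p", 16384, 4242)
    second = expand_name_template("myslot_%d_%p", 16384, 4242)
    print(first, is_valid_slot_name(first))
    # Same OID and same PID on two standbys would give identical names.
    print("collision:", first == second)
```

A template like this makes names readable, but it does not by itself guarantee uniqueness across several converted standbys; that would need some additional input that differs per standby.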