Conflict detection and logging in logical replication
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | Conflict detection and logging in logical replication |
Date | |
Msg-id | OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Responses | RE: Conflict detection and logging in logical replication |
List | pgsql-hackers |
Hi hackers,

Cc people involved in the original thread[1].

I am starting a new thread to share and discuss the implementation of conflict detection and logging in logical replication, as well as the collection of statistics related to these conflicts.

In the original conflict resolution thread[1], we decided to split this work into multiple patches to facilitate incremental progress towards supporting conflict resolution in logical replication. This phased approach will allow us to address simpler tasks first. The overall work plan involves:

1. conflict detection (detect and log conflicts like 'insert_exists', 'update_differ', 'update_missing', and 'delete_missing')
2. implement simple built-in resolution strategies like 'apply(remote_apply)' and 'skip(keep_local)'.
3. monitor capability for conflicts and resolutions in statistics or history table.

Following the feedback received from PGconf.dev and discussions in the conflict resolution thread, features 1 and 3 are important independently. So, we start a separate thread for them.

Here are the basic designs for the detection and statistics:

- The detail of the conflict detection

We add a new parameter detect_conflict for CREATE and ALTER subscription commands. This new parameter decides whether the subscription performs conflict detection. By default, conflict detection is off for a subscription.

When conflict detection is enabled, additional logging is triggered in the following conflict scenarios:

insert_exists: Inserting a row that violates a NOT DEFERRABLE unique constraint.
update_differ: Updating a row that was previously modified by another origin.
update_missing: The tuple to be updated is missing.
delete_missing: The tuple to be deleted is missing.

For an insert_exists conflict, the log can include the origin and commit timestamp details of the conflicting key when track_commit_timestamp is enabled. An update_differ conflict can only be detected when track_commit_timestamp is enabled.
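To make the four scenarios above concrete, here is a minimal sketch in Python of how one remote change might be classified. It is a toy model, not code from the patch: `classify`, `LocalRow`, and the `origin` field are hypothetical names chosen for illustration, and the only behaviour taken from the description is the four conflict types and the dependency of update_differ on track_commit_timestamp.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Conflict(Enum):
    INSERT_EXISTS = "insert_exists"
    UPDATE_DIFFER = "update_differ"
    UPDATE_MISSING = "update_missing"
    DELETE_MISSING = "delete_missing"


@dataclass
class LocalRow:
    """Hypothetical stand-in for the local tuple found during apply."""
    origin: str  # name of the origin that last modified the row


def classify(op: str, local: Optional[LocalRow], remote_origin: str,
             track_commit_timestamp: bool) -> Optional[Conflict]:
    """Return the conflict type for one remote change, or None if there is none."""
    if op == "insert":
        # A local row with the same unique key already exists.
        return Conflict.INSERT_EXISTS if local is not None else None
    if op == "update":
        if local is None:
            return Conflict.UPDATE_MISSING
        # Assumption: the row's origin is only known with track_commit_timestamp on.
        if track_commit_timestamp and local.origin != remote_origin:
            return Conflict.UPDATE_DIFFER
        return None
    if op == "delete":
        return Conflict.DELETE_MISSING if local is None else None
    raise ValueError(f"unknown operation: {op}")
```

In this model, an update that finds a row last modified by a different origin is reported as update_differ only when `track_commit_timestamp` is true; with it off, the same change is applied without any conflict being noticed.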
Regarding insert_exists conflicts, the current design is to pass noDupErr=true in ExecInsertIndexTuples() to prevent immediate error handling on duplicate key violation. After calling ExecInsertIndexTuples(), if there was any potential conflict in the unique indexes, we report an ERROR for the insert_exists conflict along with additional information (origin, commit timestamp, key value) for the conflicting row. Another way is to conduct a pre-check for duplicate key violation before applying the INSERT operation, but this could introduce overhead for each INSERT even in the absence of conflicts. We welcome any alternative viewpoints on this matter.

- The detail of statistics collection

We add columns (insert_exists_count, update_differ_count, update_missing_count, delete_missing_count) to the view pg_stat_subscription_workers to show information about the conflicts that occur during the application of logical replication changes.

The conflicts will be tracked when the track_conflict option of the subscription is enabled. Additionally, update_differ can be detected only when track_commit_timestamp is enabled.

The patches for the above features are attached.

Suggestions and comments are highly appreciated.

[1] https://www.postgresql.org/message-id/CAA4eK1LgPyzPr_Vrvvr4syrde4hyT%3DQQnGjdRUNP-tz3eYa%3DGQ%40mail.gmail.com

Best Regards,
Hou Zhijie
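As an illustration of the insert_exists handling and the per-conflict counters described in the message above, the following Python sketch models the "insert first, check afterwards" flow. It is a toy model under stated assumptions, not PostgreSQL code: `ToyTable`, `RowInfo`, `insert_index_entry`, and `InsertExistsError` are hypothetical names, and the `no_dup_err` flag only mimics the role that noDupErr plays in ExecInsertIndexTuples().

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RowInfo:
    """Hypothetical metadata kept for a locally stored row."""
    origin: str
    commit_ts: str


class InsertExistsError(Exception):
    """Raised after the insert attempt when the unique key was already present."""


class ToyTable:
    """Toy model of a table with one unique index (not PostgreSQL code)."""

    def __init__(self):
        self.rows = {}          # unique key -> RowInfo
        self.stats = Counter()  # counter name -> number of conflicts seen

    def insert_index_entry(self, key, info, no_dup_err):
        """Try to add the key; return True if a duplicate was seen.

        With no_dup_err=False a duplicate raises immediately; with
        no_dup_err=True it is only reported back to the caller.
        """
        if key in self.rows:
            if not no_dup_err:
                raise KeyError(f"duplicate key {key!r}")
            return True
        self.rows[key] = info
        return False

    def apply_remote_insert(self, key, info):
        """Insert first, then report a conflict if a duplicate was flagged."""
        if self.insert_index_entry(key, info, no_dup_err=True):
            existing = self.rows[key]
            self.stats["insert_exists_count"] += 1
            raise InsertExistsError(
                f"insert_exists: key={key!r} origin={existing.origin} "
                f"commit_ts={existing.commit_ts}"
            )
```

The point of the design is that the non-conflicting path does no extra lookup: the duplicate is discovered as a side effect of the index insertion itself, whereas a pre-check would add a probe to every INSERT.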