Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
От | Önder Kalacı |
---|---|
Тема | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Дата | |
Msg-id | CACawEhUUHcSVhp-bODSAnYNUQHY=4gvY29DXM=yuphgrkcqqmQ@mail.gmail.com обсуждение исходный текст |
Ответ на | RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher ("shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com>) |
Ответы |
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
|
Список | pgsql-hackers |
Hi Shi yu, all
In execReplication.c:
+ TypeCacheEntry **eq = NULL; /* only used when the index is not unique */
Maybe the comment here should be changed. Now it is used when the index is not
primary key or replica identity index.
makes sense, updated
2.
+# wait until the index is created
+$node_subscriber->poll_query_until(
+ 'postgres', q{select count(*)=1 from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates one row via index";
The message doesn't seem right, should it be changed to "Timed out while
waiting for creating index test_replica_id_full_idx"?
yes, updated
3.
+# now, ingest more data and create index on column y which has higher cardinality
+# then create an index on column y so that future commands uses the index on column
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT 50, i FROM generate_series(0,3100)i;");
The comment say "create (an) index on column y" twice, maybe it can be changed
to:
now, ingest more data and create index on column y which has higher cardinality,
so that future commands will use the index on column y
fixed
4.
+# deletes 200 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6);");
+
+# wait until the index is used on the subscriber
+$node_subscriber->poll_query_until(
+ 'postgres', q{select (idx_scan = 200) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates 200 rows via index";
It would be better to call wait_for_catchup() after DELETE. (And some other
places in this file.)
Hmm, I cannot follow this easily.
Why do you think wait_for_catchup() should be called? In general, I tried to follow a pattern where we call poll_query_until() so that we are sure that all the changes are replicated via the index. And then, an additional check with `is($result, ..` such that we also verify the correctness of the data.
One alternative could be to use wait_for_catchup() and then have multiple `is($result, ..` to check both pg_stat_all_indexes and the correctness of the data.
One minor advantage I see with the current approach is that every `is($result, ..` adds one step to the test. So, if I use `is($result, ..` for pg_stat_all_indexes queries, then I'd be adding multiple steps for a single test. It felt it is more natural/common to test roughly once with `is($result, ..` on each test. Or, at least do not add additional ones for pg_stat_all_indexes checks.
Besides, the "updates" in the message should be "deletes".
fixed
5.
+# wait until the index is used on the subscriber
+$node_subscriber->poll_query_until(
+ 'postgres', q{select sum(idx_scan)=10 from pg_stat_all_indexes where indexrelname ilike 'users_table_part_%';}
+) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates partitioned table";
Maybe we should say "updates partitioned table with index" in this message.
Fixed
Attached v20.
Thanks!
Onder KALACI
Вложения
В списке pgsql-hackers по дате отправления: