RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

Поиск
Список
Период
Сортировка
От Hayato Kuroda (Fujitsu)
Тема RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Дата
Msg-id TYAPR01MB58663009FA044B8E37488EA9F536A@TYAPR01MB5866.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL  (Peter Smith <smithpb2250@gmail.com>)
Ответы Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
Dear Peter,

Thank you for reviewing! PSA new version.

> 1.
> 89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY
> on the
> subscriber, but only the BTree index could be used. This commit extends the
> limitation, now the Hash index can be also used.
> 
> ~
> 
> Before giving details about the problems of the other index types it
> might be good to summarize why these 2 types are OK.
> 
> SUGGESTION
> These 2 types of indexes are the only ones supported because only these
> - use fix strategy numbers
> - implement the "equality" strategy
> - implement the function amgettuple()

Added.

> 
> 2.
> I'm not sure why the next paragraphs are numbered 1) and 2). Is that
> necessary? It seems maybe a cut/paste hangover from the similar code
> comment.

Yeah, this was just copied from code comments. Numbers were removed.

> 3.
> 1) Other indexes do not have a fixed set of strategy numbers at all. In
> build_replindex_scan_key(), the operator which corresponds to
> "equality comparison"
> is searched by using the strategy number, but it is difficult for such indexes.
> For example, GiST index operator classes for two-dimensional geometric
> objects have
> a comparison "same", but its strategy number is different from the
> gist_int4_ops,
> which is a GiST index operator class that implements the B-tree equivalent.
> 
> ~
> 
> Don't need to say "at all"

Removed.

> 4.
> 2) Some other indexes do not implement "amgettuple".
> RelationFindReplTupleByIndex()
> assumes that tuples could be fetched one by one via
> index_getnext_slot(), but such
> indexes are not supported.
> 
> ~
> 
> 4a.
> "Some other indexes..." --> Maybe give an example here (e.g. XXX, YYY)
> of indexes that do not implement the function.

I clarified like "BRIN and GIN indexes do not implement...", which are the built-in
indexes. Actually the bloom index cannot be supported due to the same reason, but
I did not mention because it is not in core.

> 4b.
> BEFORE
> RelationFindReplTupleByIndex() assumes that tuples could be fetched
> one by one via index_getnext_slot(), but such indexes are not
> supported.
> 
> AFTER
> RelationFindReplTupleByIndex() assumes that tuples will be fetched one
> by one via index_getnext_slot(), but this currently requires the
> "amgetuple" function.


Changed.

> src/backend/executor/execReplication.c
> 
> 5.
> + * 2) Some other indexes do not implement "amgettuple".
> + * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by
> + * one via index_getnext_slot(), but such indexes are not supported. To make it
> + * use index_getbitmap() must be used, but not done yet because of the above
> + * reason.
> + */
> +int
> +get_equal_strategy_number_for_am(Oid am)
> 
> ~
> 
> Change this text to better match exactly in the commit message (see
> previous review comments above).

Copied from commit message.

> Also I am not sure it is necessary to
> say how it *might* be implemented in the future if somebody wanted to
> enhance it to work also for "amgetbitmap" function. E.g. do we need
> that sentence "To make it..."

Added, how do you think?

> 6. get_equal_strategy_number_for_am
> 
> + case GIST_AM_OID:
> + case SPGIST_AM_OID:
> + case GIN_AM_OID:
> + case BRIN_AM_OID:
> + default:
> 
> I am not sure it is necessary to spell out all these not-supported
> cases in the switch. If seems sufficient just to say 'default:'
> doesn't it?

Yeah, it's sufficient. This is a garbage for previous PoC.

> 7. get_equal_strategy_number
> 
> Two blank lines are following this function

Removed.

> 8. build_replindex_scan_key
> 
> - * This is not generic routine, it expects the idxrel to be a btree,
> non-partial
> - * and have at least one column reference (i.e. cannot consist of only
> - * expressions).
> + * This is not generic routine, it expects the idxrel to be a btree or a hash,
> + * non-partial and have at least one column reference (i.e. cannot consist of
> + * only expressions).
> 
> Take care. AFAIK this change will clash with changes Sawawa-san is
> making to the same code comment in another thread here [1].

Thanks for reminder. I thought that this change seems not needed anymore if the patch
is pushed. But anyway I kept it because this may be pushed first.

> src/backend/replication/logical/relation.c
> 
> 9. FindUsableIndexForReplicaIdentityFull
> 
>   * Returns the oid of an index that can be used by the apply worker to scan
> - * the relation. The index must be btree, non-partial, and have at least
> - * one column reference (i.e. cannot consist of only expressions). These
> + * the relation. The index must be btree or hash, non-partial, and have at
> + * least one column reference (i.e. cannot consist of only expressions). These
>   * limitations help to keep the index scan similar to PK/RI index scans.
> 
> ~
> 
> Take care. AFAIK this change will clash with changes Sawawa-san is
> making to the same code comment in another thread here [1].

Thanks for reminder. I thought that this change must be kept even if it will be
pushed. We must check the thread...

> 10.
> + /* Check whether the index is supported or not */
> + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
> + != InvalidStrategy));
> +
> + is_partial = (indexInfo->ii_Predicate != NIL);
> + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
> +
> + return is_suitable_type && !is_partial && !is_only_on_expression;
> 
> I am not sure if the function IsIndexOnlyExpression() even needed
> anymore. Isn't it sufficient just to check up-front is the leftmost
> INDEX field is a column and that covers this condition also? Actually,
> this same question is also open in the Sawada-san thread [1].
> 
> ======
> .../subscription/t/032_subscribe_use_index.pl
> 
> 11.
> AFAIK there is no test to verify that the leftmost index field must be
> a column (e.g. cannot be an expression). Yes, there are some tests for
> *only* expressions like (expr, expr, expr), but IMO there should be
> another test for an index like (expr, expr, column) which should also
> fail because the column is not the leftmost field.

I'm not sure they should be fixed in the patch. You have raised these points
at the Sawada-san's thread[1] and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch. 

[1]: https://www.postgresql.org/message-id/CAHut%2BPv3AgAnP%2BJTsPseuU-CT%2BOrSGiqzxqw4oQmWeKLkAea4Q%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andy Fan
Дата:
Сообщение: Re: The same 2PC data maybe recovered twice
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Cleaning up threading code