Re: Amcheck verification of GiST and GIN

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Amcheck verification of GiST and GIN
Дата
Msg-id c0577791-0cbd-4548-b809-ff056c7be61d@vondra.me
обсуждение исходный текст
Ответ на Amcheck verification of GiST and GIN  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: Amcheck verification of GiST and GIN
Список pgsql-hackers
On 5/29/25 13:53, Arseniy Mukhin wrote:
> On Mon, May 26, 2025 at 7:28 PM Arseniy Mukhin
> <arseniy.mukhin.dev@gmail.com> wrote:
>> On Mon, May 26, 2025 at 1:27 PM Tomas Vondra <tomas@vondra.me> wrote:
>>> Also, I've noticed that the TAP test passes even with some (most) of the
>>> verify_gin.c changes reverted. See the 0002 patch - this does not break
>>> the TAP test. Of course, that does not prove the changes are wrong and
>>> I'm not claiming that. But can we improve the TAP test to trigger this
>>> too? To show the current code (in master) misses this?
>>
>> Yes, changes in the undo patch is about posting tree check part (6, 7 points)
>> and I haven't written tests for it, because to break posting tree you need to
>> manipulate with tids which is not as easy as replace "aaaa" with "cccc" as tests
>> for entry tree do. Probably it would be much easier to use page api to
>> corrupt some
>> posting tree pages, but I don't know, is it impossible in TAP tests?
> 
> I added the test for the posting tree parent_key check. Now applying
> 'undo patch' results in a test failure.

Great, thank you.

I noticed git-am complaining about a couple whitespace issues in the
test, mostly about mixing spaces/tabs. The v4 fixes them (in a separate
part, but should be merged into 0001). It's a detail, but might be good
to try git-am on patches ;-)

> Also I realized that the test 'invalid_entry_columns_order_test' will
> fail on big endian machines,
> because varlena len encoding is different for little endian and big
> endian, so I changed the test a little bit.
> Now the test doesn't use varlena len byte in regex.

I think it'd make sense to split this into smaller patches, each fixing
a different issue. Not one patch for each of the 11 items in your
original message, that would be an overkill ...

I propose to split it like this, into three parts, each addressing a
particular type of mistake:

1) gin_check_posting_tree_parent_keys_consistency

2) gin_check_parent_keys_consistency / att comparisons

3) gin_check_parent_keys_consistency / setting ptr->parenttup (at the end)

Does this make sense to you? If yes, can you split the patch series like
this, including a commit message for each part, explaining the fix? We'd
need the commit message even with a single patch, ofc.


> I also remove the blksize hardcode and start getting it from the
> cluster configuration. But anyway some tests
> will fail with not standard block size (probably all tests where tree
> growth is expected).
> 

I think that's fine. AFAIK we don't expect tests to be 100% stable with
other block sizes. It shouldn't crash / segfault, ofc, but some tests
may be sensitive to this.


BTW I hoped to get this fix pushed this week, but that didn't happen and
I'll be away most of next week :-( Let's try to get this sorted so that
I can push it on June 16 or so.


regards

-- 
Tomas Vondra

Вложения

В списке pgsql-hackers по дате отправления: