Обсуждение: [BUG] false positive in bt_index_check in case of short 4B varlena datum

Поиск
Список
Период
Сортировка

[BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Michael Zhilin
Дата:
Hi,

Following example produces error raised from bt_index_check.

drop table if exists t;
create table t (v text);
alter table t alter column v set storage plain;
insert into t values ('x');
copy t to '/tmp/1.lst';
copy t from '/tmp/1.lst';
create index t_idx on t(v);
create extension if not exists amcheck;
select bt_index_check('t_idx', true);

postgres=# select bt_index_check('t_idx', true);
ERROR:  heap tuple (0,2) from table "t" lacks matching index tuple within index "t_idx"
HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.

As result table contains 2 logically identical tuples:
 - one contains varlena 'x' with 1B (1-byte) header (added by INSERT statement)
 - one contains varlena 'x' with 4B (4-bytes) header (added by COPY statement)
CREATE INDEX statement builds index with posting list referencing both heap tuples.
The function bt_index_check calculates fingerprints of 1B and 4B header datums,
they are different and function returns error.

The attached patch allows to avoid such kind of false positives by converting short
4B datums to 1B before fingerprinting. Also it contains test for provided case.

Thank you,
 Michael

--
Michael Zhilin
Postgres Professional
Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Lakhin
Дата:
Hi Michael,

14.12.2023 19:18, Michael Zhilin wrote:
> Hi,
>
> Following example produces error raised from bt_index_check.
>
> drop table if exists t;
> create table t (v text);
> alter table t alter column v set storage plain;
> insert into t values ('x');
> copy t to '/tmp/1.lst';
> copy t from '/tmp/1.lst';
> create index t_idx on t(v);
> create extension if not exists amcheck;
> select bt_index_check('t_idx', true);
>
> postgres=# select bt_index_check('t_idx', true);
> ERROR:  heap tuple (0,2) from table "t" lacks matching index tuple within index "t_idx"
> HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.
>
> As result table contains 2 logically identical tuples:
>  - one contains varlena 'x' with 1B (1-byte) header (added by INSERT statement)
>  - one contains varlena 'x' with 4B (4-bytes) header (added by COPY statement)
> CREATE INDEX statement builds index with posting list referencing both heap tuples.
> The function bt_index_check calculates fingerprints of 1B and 4B header datums,
> they are different and function returns error.
>
> The attached patch allows to avoid such kind of false positives by converting short
> 4B datums to 1B before fingerprinting. Also it contains test for provided case.

By changing the storage mode for a column, you can also get another error:
CREATE TABLE t(f1 text);
CREATE INDEX t_idx ON t(f1);
INSERT INTO t VALUES(repeat('1234567890', 1000));
ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain;

CREATE EXTENSION amcheck;
SELECT bt_index_check('t_idx', true);

ERROR:  index row requires 10016 bytes, maximum size is 8191

Best regards,
Alexander



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
"Andrey M. Borodin"
Дата:

> On 14 Dec 2023, at 21:18, Michael Zhilin <m.zhilin@postgrespro.ru> wrote:

I've checked that:
* bug is reproduced by the test in the patch
* bug is fixed by the patch
* fix seems idiomatic, similar to nearby code

Patch needed a rebase, so please find attached rebased version. I did not change anything.

I see that using a temp file in PG_ABS_SRCDIR is common approach. But still I want to ask, maybe can we develop some
cleverway to reproduce the bug without external file? 
Also, maybe nearby code would be slightly more readable, if normalized[i] was a local variable.
And one last question about the line:
char *data = palloc(len);
what if data is somehow corrupted here... are there enough sanity checks that we won't palloc(-1) or something like
that?
Won't we memcpy() from some other memory when len is bogus?

Besides this paranoid questions, I think that this patch is ready for committer.

Thanks!


Best regards, Andrey Borodin.



Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
"Andrey M. Borodin"
Дата:

> On 7 Jan 2024, at 23:04, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> I see that using a temp file in PG_ABS_SRCDIR is common approach. But still I want to ask, maybe can we develop some
cleverway to reproduce the bug without external file? 

BTW this stuff is causing problems in CFbot [0,1]
COPY varlena_bug TO :'filename';
+ERROR:  could not open file "/tmp/cirrus-ci-build/contrib/amcheck/results/varlena_bug.dmp" for writing: No such file
ordirectory 
+HINT:  COPY TO instructs the PostgreSQL server process to write a file. You may want a client-side facility such as
psql's\copy. 


Best regards, Andrey Borodin.

[0] https://api.cirrus-ci.com/v1/artifact/task/4880592609738752/testrun/build/testrun/amcheck/regress/regression.diffs
[1] https://github.com/x4m/postgres_g/runs/20240081462




Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Lakhin
Дата:
Hello Andrey,

07.01.2024 21:04, Andrey M. Borodin wrote:
>> On 14 Dec 2023, at 21:18, Michael Zhilin <m.zhilin@postgrespro.ru> wrote:
> I've checked that:
> * bug is reproduced by the test in the patch
> * bug is fixed by the patch
> * fix seems idiomatic, similar to nearby code
>

What is your opinion regarding similar failures, which are not addressed
by the patch? Besides the case shown above, there is another one:
CREATE TABLE tbl(i int4, t text);
ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain;
CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10);
INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g;
ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended;

CREATE EXTENSION amcheck;
SELECT bt_index_check('tbl_idx', true);

ERROR:  heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx"
HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.

Best regards,
Alexander




Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
"Andrey M. Borodin"
Дата:
Hi Alexander!

I think both cases are very interesting and deserve two separate bug items.

> On 14 Dec 2023, at 22:17, Alexander Lakhin <exclusion@gmail.com> wrote:
>
>
> By changing the storage mode for a column, you can also get another error:
> CREATE TABLE t(f1 text);
> CREATE INDEX t_idx ON t(f1);
> INSERT INTO t VALUES(repeat('1234567890', 1000));
> ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain;
>
> CREATE EXTENSION amcheck;
> SELECT bt_index_check('t_idx', true);
>
> ERROR:  index row requires 10016 bytes, maximum size is 8191
>

I think In this case we should warn user that index contains tuples with datums that are not insertable anymore. And
abortheapallindexed. 


> On 8 Jan 2024, at 00:00, Alexander Lakhin <exclusion@gmail.com> wrote:
>
> What is your opinion regarding similar failures, which are not addressed
> by the patch? Besides the case shown above, there is another one:
> CREATE TABLE tbl(i int4, t text);
> ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain;
> CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10);
> INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g;
> ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended;
>
> CREATE EXTENSION amcheck;
> SELECT bt_index_check('tbl_idx', true);
>
> ERROR:  heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx"
> HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.

IMO In this case we should handle VARATT_IS_EXTENDED in bt_normalize_tuple().

BTW CI fails of the original patch ITT are related to the fact that COPY in\out file is created in PG_ABS_SRCDIR
insteadof PG_ABS_BUILDDIR. In an off-list conversation I recommended Michael to mimic tests regress/largeobject.sql.
Though,there might be better ideas. 

Thanks!


Best regards, Andrey Borodin.


Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Michael Zhilin
Дата:
Hi,

Thank you, Andrey, for review and advice!

Here is rebased version (v2) of patch supposed to make CF bot happy.

Best regards,
  Michael

On 1/8/24 17:34, Andrey M. Borodin wrote:
Hi Alexander!

I think both cases are very interesting and deserve two separate bug items.

On 14 Dec 2023, at 22:17, Alexander Lakhin <exclusion@gmail.com> wrote:


By changing the storage mode for a column, you can also get another error:
CREATE TABLE t(f1 text);
CREATE INDEX t_idx ON t(f1);
INSERT INTO t VALUES(repeat('1234567890', 1000));
ALTER TABLE t ALTER COLUMN f1 SET STORAGE plain;

CREATE EXTENSION amcheck;
SELECT bt_index_check('t_idx', true);

ERROR:  index row requires 10016 bytes, maximum size is 8191

I think In this case we should warn user that index contains tuples with datums that are not insertable anymore. And abort heapallindexed.


On 8 Jan 2024, at 00:00, Alexander Lakhin <exclusion@gmail.com> wrote:

What is your opinion regarding similar failures, which are not addressed
by the patch? Besides the case shown above, there is another one:
CREATE TABLE tbl(i int4, t text);
ALTER TABLE tbl ALTER COLUMN t SET STORAGE plain;
CREATE INDEX tbl_idx ON tbl (t, i) WITH (fillfactor = 10);
INSERT INTO tbl SELECT g, repeat('Test', 250) FROM generate_series(1, 130) g;
ALTER TABLE tbl ALTER COLUMN t SET STORAGE extended;

CREATE EXTENSION amcheck;
SELECT bt_index_check('tbl_idx', true);

ERROR:  heap tuple (0,1) from table "tbl" lacks matching index tuple within index "tbl_idx"
HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.
IMO In this case we should handle VARATT_IS_EXTENDED in bt_normalize_tuple().

BTW CI fails of the original patch ITT are related to the fact that COPY in\out file is created in PG_ABS_SRCDIR instead of PG_ABS_BUILDDIR. In an off-list conversation I recommended Michael to mimic tests regress/largeobject.sql. Though, there might be better ideas.

Thanks!


Best regards, Andrey Borodin.


-- 
Michael Zhilin
Postgres Professional
+7(925)3366270
https://www.postgrespro.ru
Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Andrey Borodin
Дата:

> On 9 Jan 2024, at 22:59, Michael Zhilin <m.zhilin@postgrespro.ru> wrote:
>
> Here is rebased version (v2) of patch supposed to make CF bot happy.

I've marked CF item as ready for committer.

> On 1/8/24 17:34, Andrey M. Borodin wrote:
>> Hi Alexander!
>>
>> I think both cases are very interesting and deserve two separate bug items.

Alexander, do you plan to provide fixes for bugs you discovered?


Best regards, Andrey Borodin.




Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Lakhin
Дата:
Hi Andrey,

19.01.2024 21:16, Andrey Borodin wrote:
> I've marked CF item as ready for committer.
>> On 1/8/24 17:34, Andrey M. Borodin wrote:
>>> Hi Alexander!
>>>
>>> I think both cases are very interesting and deserve two separate bug items.
> Alexander, do you plan to provide fixes for bugs you discovered?

No, I don't have a concrete proposal how to fix those bugs. I'd thought
that fixing the whole class of such anomalies, not only one case, is a good
thing to do, but if it's too complicated, maybe other similar bugs could be
put aside.

Best regards,
Alexander



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
jian he
Дата:
On Wed, Jan 10, 2024 at 1:59 AM Michael Zhilin <m.zhilin@postgrespro.ru> wrote:
>
> Hi,
>
> Thank you, Andrey, for review and advice!
>
> Here is rebased version (v2) of patch supposed to make CF bot happy.

Hi

+--
+-- BUG: must support different header size of short varlena datum
+--
+
+CREATE TABLE varlena_bug (v text);
+ALTER TABLE varlena_bug ALTER column v SET storage plain;
+INSERT INTO varlena_bug VALUES ('x');
+\set filename :abs_builddir '/results/varlena_bug.dmp'
+COPY varlena_bug TO :'filename';
+COPY varlena_bug FROM :'filename';
+CREATE INDEX varlena_bug_idx on varlena_bug(v);
+SELECT bt_index_check('varlena_bug_idx', true);

you can simply replace
+\set filename :abs_builddir '/results/varlena_bug.dmp'
+COPY varlena_bug TO :'filename';
+COPY varlena_bug FROM :'filename';

with

COPY varlena_bug from stdin;
x
\.


In the comments, adding the postgres link
(https://postgr.es/m/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc@postgrespro.ru)
would be great.



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
"Andrey M. Borodin"
Дата:

> On 20 Jan 2024, at 09:00, Alexander Lakhin <exclusion@gmail.com> wrote:
>
>> Alexander, do you plan to provide fixes for bugs you discovered?
>
> No, I don't have a concrete proposal how to fix those bugs. I'd thought
> that fixing the whole class of such anomalies, not only one case, is a good
> thing to do, but if it's too complicated, maybe other similar bugs could be
> put aside.

PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think?

I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian
Heworth incorporating into the fix. 


Best regards, Andrey Borodin.


Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Michael Zhilin
Дата:
Hi,

Thank you, Jian, for nice comments!
PFA version with your recommendations.

Andrey,
I didn't yet check your patches, but at least compiler complains about added, but unused variable "
miss_oversized_tuple".

verify_nbtree.c:2898:7: warning: unused variable 'miss_oversized_tuple' [-Wunused-variable]
        bool miss_oversized_tuple = false;

So patch has been updated to fix this warning.

Attached v4, rebased version with Jian's comments & removed unused variable.

Thanks,
 Michael.

On 1/23/24 21:09, Andrey M. Borodin wrote:

On 20 Jan 2024, at 09:00, Alexander Lakhin <exclusion@gmail.com> wrote:

Alexander, do you plan to provide fixes for bugs you discovered?
No, I don't have a concrete proposal how to fix those bugs. I'd thought
that fixing the whole class of such anomalies, not only one case, is a good
thing to do, but if it's too complicated, maybe other similar bugs could be
put aside.
PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think?

I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from Jian He worth incorporating into the fix.


Best regards, Andrey Borodin.


-- 
Michael Zhilin
Postgres Professional
+7(925)3366270
https://www.postgrespro.ru
Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
jian he
Дата:
On Wed, Jan 24, 2024 at 3:24 AM Michael Zhilin <m.zhilin@postgrespro.ru> wrote:
>
> Hi,
>
> Thank you, Jian, for nice comments!
> PFA version with your recommendations.
>
> Andrey,
> I didn't yet check your patches, but at least compiler complains about added, but unused variable
"miss_oversized_tuple".
>
> verify_nbtree.c:2898:7: warning: unused variable 'miss_oversized_tuple' [-Wunused-variable]
>         bool miss_oversized_tuple = false;
>
> So patch has been updated to fix this warning.
>
> Attached v4, rebased version with Jian's comments & removed unused variable.
>

this deserve some comments, given the whole C file, a large portion of
is comments
+ data_size = MAXALIGN(heap_compute_data_size(tupleDescriptor,
+   normalized, isnull)
+ + MAXALIGN(sizeof(IndexTupleData) + sizeof(IndexAttributeBitMapData)));
+ if ((data_size & INDEX_SIZE_MASK) != data_size)
+ {
+ return NULL;
+ }

whitespace error.
git apply $PATCHES/v4-0003-amcheck-avoid-failing-on-oversized-tuples.patch
/home/jian/Downloads/patches/v4-0003-amcheck-avoid-failing-on-oversized-tuples.patch:147:
trailing whitespace.
 *
warning: 1 line adds whitespace errors.

Is this part unnecessary?
+-- directory paths are passed to us in environment variables

I'm not native English speaker, but I doubt this sentence conveys the
meaning properly.
+ if (!state->has_oversized_tuples)
+ elog(NOTICE, "Index contain tuples that cannot fit into index page,
if toasted with current toast policy");

+ * If the tuple is exampt from checking due to has_oversized_tuples
this function
+ * returns NULL.
maybe
+ * If the tuple is exempt from checking due to has_oversized_tuples,
this function
+ * returns NULL.

you changed
`bool toast_free[INDEX_MAX_KEYS];`
to
`bool need_free[INDEX_MAX_KEYS];`
maybe some comments address the changes, otherwise people would say
why change (maybe I am over thinking).



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Lakhin
Дата:
Hi Andrey,

23.01.2024 21:09, Andrey M. Borodin wrote:
> PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think?
>
> I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from
JianHe worth incorporating into the fix.
 
>

I''m confused by a NOTICE added, as it printed now even for cases, which
worked before, for example:
CREATE TABLE t(f1 text);
CREATE INDEX idx ON t(f1);
INSERT INTO t VALUES(repeat('1234567890', 1000));
SELECT bt_index_check('idx', true);
NOTICE:  Index contain tuples that cannot fit into index page, if toasted with current toast policy
  bt_index_check
----------------

(1 row)

Best regards,
Alexander



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Korotkov
Дата:
Hi!


On Fri, Jan 26, 2024 at 9:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> 23.01.2024 21:09, Andrey M. Borodin wrote:
> > PFA draft fixes for both this errors. Alexander, Michael, Jian, what do you think?
> >
> > I did not touch anything in first step - fix for original bug in this thread. However, I think that comments from
JianHe worth incorporating into the fix. 
> >
>
> I''m confused by a NOTICE added, as it printed now even for cases, which
> worked before, for example:
> CREATE TABLE t(f1 text);
> CREATE INDEX idx ON t(f1);
> INSERT INTO t VALUES(repeat('1234567890', 1000));
> SELECT bt_index_check('idx', true);
> NOTICE:  Index contain tuples that cannot fit into index page, if toasted with current toast policy
>   bt_index_check
> ----------------
>
> (1 row)


Right, the patch number 0003 looks wrong to me.  It uses
heap_compute_data_size() to compute the size of the future index
tuple.  But heap_compute_data_size() doesn't apply any compression on
the attributes.  I suggest instead we just need to have a version of
index_form_tuple() that reports an oversized tuple in a return value
rather than throwing the error.

BTW, 0001 and 0002 look good to me.  I'm going to push them if no objections.

------
Regards,
Alexander Korotkov



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Lakhin
Дата:
Hi Alexander,

20.03.2024 13:24, Alexander Korotkov wrote:
> BTW, 0001 and 0002 look good to me. I'm going to push them if no objections. 

Maybe these patches should be polished before committing:
+-- directory paths are passed to us in environment variables
looks like an irrelevant change (perhaps it was relevant in v1/v2, but
that's not so now.)

I'm also not sure about:
+-- BUG: must support different header size of short varlena datum
+-- https://postgr.es/m/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc@postgrespro.ru

AFAICS, for most similar bug fixes, the bug report referenced in a commit
message only (there is no such comment in 0002, either). I also suspect
that the comment:
* Also tuple had short varlena datums with 4B header. ...
might looks incorrect for native English speakers.

This patch also adds a couple of empty lines, which may be not needed.
@@ -2973,6 +2973,7 @@ bt_normalize_tuple(BtreeCheckState *state, IndexTuple itup)
          * index without further processing, so an external varlena header
          * should never be encountered here
          */
 > +
         if (VARATT_IS_EXTERNAL(DatumGetPointer(normalized[i])))
             ereport(ERROR,
                     (errcode(ERRCODE_INDEX_CORRUPTED),
---
         }
+       /*
...
+           need_free[i] = true;
+       }
 > +
     }

Best regards,
Alexander



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Korotkov
Дата:
On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> 20.03.2024 13:24, Alexander Korotkov wrote:
> > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections.
>
> Maybe these patches should be polished before committing:

Yes, Alexander. Sorry, I forgot to mention I'm going to polish
comments and commit messages before pushing anyway. I'll post it for
your review later today.

------
Regards,
Alexander Korotkov



Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
"Andrey M. Borodin"
Дата:

> On 20 Mar 2024, at 15:24, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Right, the patch number 0003 looks wrong to me.

Indeed, step 0003 was added to the patchset after entry marked RfC. And you are right that it’s not ready yet, and,
honestly,I do not know how to fix it properly. 
Let’s proceed with other bugs and patches in this thread, and for bug of the step 0003 we will create separate thread
later.

Thanks for working on this!


Best regards, Andrey Borodin.


Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Korotkov
Дата:
On Wed, Mar 20, 2024 at 7:00 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > 20.03.2024 13:24, Alexander Korotkov wrote:
> > > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections.
> >
> > Maybe these patches should be polished before committing:
>
> Yes, Alexander. Sorry, I forgot to mention I'm going to polish
> comments and commit messages before pushing anyway. I'll post it for
> your review later today.

There are revised versions of patches.  Alexander, please, check them
before I push.

------
Regards,
Alexander Korotkov

Вложения

Re: [BUG] false positive in bt_index_check in case of short 4B varlena datum

От
Alexander Korotkov
Дата:
On Sat, Mar 23, 2024 at 2:39 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Wed, Mar 20, 2024 at 7:00 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > On Wed, Mar 20, 2024 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > > 20.03.2024 13:24, Alexander Korotkov wrote:
> > > > BTW, 0001 and 0002 look good to me. I'm going to push them if no objections.
> > >
> > > Maybe these patches should be polished before committing:
> >
> > Yes, Alexander. Sorry, I forgot to mention I'm going to polish
> > comments and commit messages before pushing anyway. I'll post it for
> > your review later today.
>
> There are revised versions of patches.  Alexander, please, check them
> before I push.

Fixed typo s/much/match/ in the commit message spotted by Andrey Borodin.

------
Regards,
Alexander Korotkov

Вложения