Re: Optimize partial TOAST decompression

From: Binguo Bao
Subject: Re: Optimize partial TOAST decompression
Msg-id: CAL-OGkuvHceCv96H8PHk+GeDn5NxuojGhQQati7Z7uFHduZVBg@mail.gmail.com
In response to: Re: Optimize partial TOAST decompression  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: Optimize partial TOAST decompression  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers

Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote on Fri, Jul 5, 2019 at 1:46 AM:
I've done a bit of testing and benchmarking on this patch today, and
there's a bug somewhere, making it look like the data is corrupted.

What I'm seeing is this:

CREATE TABLE t (a text);

-- attached is data for one row
COPY t FROM '/tmp/t.data';


SELECT length(substr(a,1000)) from t;
psql: ERROR:  compressed data is corrupted

SELECT length(substr(a,0,1000)) from t;
 length
--------
    999
(1 row)

SELECT length(substr(a,1000)) from t;
psql: ERROR:  invalid memory alloc request size 2018785106

That's quite bizarre behavior - it does work with a prefix, but not with
a suffix. And the exact ERROR changes after the prefix query. (Of course,
on master it works in all cases.)

The backtrace (with the patch applied) looks like this:

#0  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2291
#1  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2277
#2  0x00000000004c3b08 in heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=0, slicelength=-1) at tuptoaster.c:315
#3  0x000000000085c1e5 in pg_detoast_datum_slice (datum=<optimized out>, first=<optimized out>, count=<optimized out>) at fmgr.c:1767
#4  0x0000000000833b7a in text_substring (str=133761519127512, start=0, length=<optimized out>, length_not_specified=<optimized out>) at varlena.c:956
...

I've only observed this with a very small number of rows (the data is
generated randomly with different compressibility etc.), so I'm only
attaching one row that exhibits this issue.

My guess is that toast_fetch_datum_slice() gets confused by the headers,
or something like that. FWIW the new code added to this
function does not adhere to our code style, and would deserve some
additional explanation of what it's doing/why. Same for the
heap_tuple_untoast_attr_slice, BTW.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Hi, Tomas!
Thanks for your testing and the suggestion.

That's quite bizarre behavior - it does work with a prefix, but not with
a suffix. And the exact ERROR changes after the prefix query.

I think the bug is caused by frame "#2  0x00000000004c3b08 in heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=0, slicelength=-1) at tuptoaster.c:315",
since I ignored the case where slicelength is negative. I've added some comments to heap_tuple_untoast_attr_slice covering that case.
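
To illustrate the negative-slicelength case, here is a minimal C sketch. It is not the patch's code: resolve_slice, needs_full_value and SliceBounds are hypothetical names made up for this example, and the sketch assumes that a negative slicelength means "from sliceoffset to the end of the value". It only shows why the length has to be normalized before deciding how much data a slice needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not PostgreSQL code. */
typedef struct SliceBounds
{
    int32_t offset;   /* first byte of the slice, clamped to total_len */
    int32_t length;   /* number of bytes in the slice, never negative */
} SliceBounds;

/*
 * Resolve a requested slice against a value of total_len bytes.
 * Assumption: a negative slicelength means "everything from
 * sliceoffset to the end of the value".
 */
static SliceBounds
resolve_slice(int32_t total_len, int32_t sliceoffset, int32_t slicelength)
{
    SliceBounds b;

    assert(total_len >= 0 && sliceoffset >= 0);

    /* A slice that starts past the end is empty. */
    if (sliceoffset > total_len)
        sliceoffset = total_len;

    /* Negative or oversized lengths both mean "up to the end". */
    if (slicelength < 0 || slicelength > total_len - sliceoffset)
        slicelength = total_len - sliceoffset;

    b.offset = sliceoffset;
    b.length = slicelength;
    return b;
}

/*
 * Returns 1 when the slice reaches the end of the value, so the whole
 * value is needed; returns 0 when a prefix of the value is enough.
 */
static int
needs_full_value(int32_t total_len, int32_t sliceoffset, int32_t slicelength)
{
    SliceBounds b = resolve_slice(total_len, sliceoffset, slicelength);

    return b.offset + b.length >= total_len;
}
```

With the arguments from frame #2 (sliceoffset=0, slicelength=-1), resolve_slice returns the full length of the value, so needs_full_value reports that a prefix is not enough. Using -1 directly in size arithmetic instead would produce a nonsensical amount of data to fetch.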

FWIW the new code added to this
function does not adhere to our code style, and would deserve some
additional explanation of what it's doing/why. Same for the
heap_tuple_untoast_attr_slice, BTW.

I've added more comments to explain the code's behavior.
Besides, I also renamed the macro "TOAST_COMPRESS_RAWDATA" to "TOAST_COMPRESS_DATA", since
it is used to get the TOAST-compressed data rather than the raw data.
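
As a rough illustration of the naming point, the sketch below uses a simplified stand-in layout: a small header that records the raw (decompressed) size, followed by the compressed bytes. All DEMO_/demo_ names are hypothetical and the layout is an assumption made for this example; the real definitions are in the PostgreSQL source and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, simplified header of a compressed value (illustration only). */
typedef struct demo_compress_header
{
    int32_t vl_len_;   /* total size in bytes, header included */
    int32_t rawsize;   /* size of the data once decompressed */
} demo_compress_header;

/* Header followed by a small fixed payload, enough for the demo. */
typedef struct demo_compressed_value
{
    demo_compress_header hdr;
    char payload[8];   /* compressed bytes start right after the header */
} demo_compressed_value;

#define DEMO_COMPRESS_HDRSZ ((int32_t) sizeof(demo_compress_header))

/* Size the data will have after decompression. */
#define DEMO_COMPRESS_RAWSIZE(ptr) \
    (((const demo_compress_header *) (ptr))->rawsize)

/* Pointer to the still-compressed bytes, hence "DATA" and not "RAWDATA". */
#define DEMO_COMPRESS_DATA(ptr) \
    (((const char *) (ptr)) + DEMO_COMPRESS_HDRSZ)

/* Number of compressed bytes that follow the header. */
static int32_t
demo_compressed_size(const demo_compressed_value *v)
{
    assert(v->hdr.vl_len_ >= DEMO_COMPRESS_HDRSZ);
    return v->hdr.vl_len_ - DEMO_COMPRESS_HDRSZ;
}
```

In this sketch the macro that skips the header returns bytes that still have to be decompressed, while the raw size is only a number stored in the header, which is the distinction the rename is meant to make visible.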

Best Regards, Binguo Bao.