Re: BUG #18080: to_tsvector fails for long text input

From Tom Lane
Subject Re: BUG #18080: to_tsvector fails for long text input
Date
Msg-id 1056584.1695404939@sss.pgh.pa.us
In reply to Re: BUG #18080: to_tsvector fails for long text input  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #18080: to_tsvector fails for long text input  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I wrote:
> Yeah.  My thought about blocking the error had been to limit
> prs.lenwords to MaxAllocSize/sizeof(ParsedWord) in this code.

Concretely, as attached.  This allows the given test case to
complete, since it doesn't actually create very many distinct
words.  In other cases we could expect to fail when the array
has to get enlarged, but that's just a normal implementation
limitation.
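
[Editor's sketch, not part of the original message.] The clamping idea described above can be illustrated in isolation. This is a minimal standalone sketch, not the PostgreSQL code: MAX_ALLOC_SIZE is assumed to mirror PostgreSQL's MaxAllocSize (1 gigabyte minus 1), and the Word struct and clamp_initial_len() are hypothetical stand-ins for ParsedWord and the logic in to_tsvector_byid.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors PostgreSQL's MaxAllocSize, i.e. 1 GB - 1. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Hypothetical stand-in for ParsedWord; the real struct has other fields. */
typedef struct
{
    uint16_t    len;
    char       *word;
    uint32_t    pos;
} Word;

/*
 * Clamp an initial element-count estimate so that
 * sizeof(Word) * result never exceeds MAX_ALLOC_SIZE,
 * and so that at least two elements are always requested.
 */
size_t
clamp_initial_len(size_t estimate)
{
    size_t      max_len = MAX_ALLOC_SIZE / sizeof(Word);

    if (estimate < 2)
        return 2;
    if (estimate > max_len)
        return max_len;
    return estimate;
}
```

With this shape, an oversized estimate only caps the initial allocation. An input that really does produce more distinct words than the cap would still fail later, when the array has to grow past the allocation limit.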

I looked for other places that might initialize lenwords
to not-sane values, and didn't find any.

BTW, the field order in ParsedWord is such that there's a fair
amount of wasted pad space on 64-bit builds.  I doubt we can
get away with rearranging it in released branches; but maybe
it's worth doing something about that in HEAD, to push out
the point at which you hit the 1GB limit.
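
[Editor's sketch, not part of the original message.] The effect of field order on padding can be seen with two hypothetical structs that hold the same fields in different orders. These are illustrative layouts only, not the actual ParsedWord definition, and the exact sizes depend on the platform ABI. MAX_ALLOC_SIZE is again assumed to mirror MaxAllocSize, and max_elems() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors PostgreSQL's MaxAllocSize, i.e. 1 GB - 1. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Hypothetical layout: small fields interleaved with pointer-sized ones. */
typedef struct
{
    uint16_t    len;
    uint16_t    nvariant;
    union
    {
        uint16_t    pos;
        uint16_t   *apos;
    }           pos;        /* pointer alignment forces padding before it */
    uint16_t    flags;
    char       *word;       /* and again before this pointer */
    uint32_t    alen;
} WordPadded;

/* Same fields, widest first, so the small ones share the tail. */
typedef struct
{
    union
    {
        uint16_t    pos;
        uint16_t   *apos;
    }           pos;
    char       *word;
    uint32_t    alen;
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t    flags;
} WordPacked;

/* Number of elements of the given size that fit under the allocation cap. */
size_t
max_elems(size_t elem_size)
{
    return MAX_ALLOC_SIZE / elem_size;
}
```

On a typical 64-bit ABI the second layout is smaller, so max_elems(sizeof(WordPacked)) is larger than max_elems(sizeof(WordPadded)), which is the sense in which reordering pushes out the point where the limit is hit.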

            regards, tom lane

diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
index 3b6d41f9e8..fe39d6c4b9 100644
--- a/src/backend/tsearch/to_tsany.c
+++ b/src/backend/tsearch/to_tsany.c
@@ -252,6 +252,8 @@ to_tsvector_byid(PG_FUNCTION_ARGS)
                                                  * number */
     if (prs.lenwords < 2)
         prs.lenwords = 2;
+    else if (prs.lenwords > MaxAllocSize / sizeof(ParsedWord))
+        prs.lenwords = MaxAllocSize / sizeof(ParsedWord);
     prs.curwords = 0;
     prs.pos = 0;
     prs.words = (ParsedWord *) palloc(sizeof(ParsedWord) * prs.lenwords);
