Re: Some regular-expression performance hacking

From: Tom Lane
Subject: Re: Some regular-expression performance hacking
Msg-id: 1662033.1613237745@sss.pgh.pa.us
In reply to: Re: Some regular-expression performance hacking ("Joel Jacobson" <joel@compiler.org>)
List: pgsql-hackers

"Joel Jacobson" <joel@compiler.org> writes:
> In total, I scraped the first-page of some ~50k websites,
> which produced 45M test rows to import,
> which when GROUP BY pattern and flags was reduced
> down to 235k different regex patterns,
> and 1.5M different text string subjects.

This seems like an incredibly useful test dataset.
I'd definitely like a copy.

> No is_match differences were detected, good!

Cool ...

> However, there were 23 cases where what got captured differed:

I shall take a closer look at that.

Many thanks for doing this work!

            regards, tom lane


