Re: GIN index build speed

Поиск

Список

Период

Сортировка

От	Teodor Sigaev
Тема	Re: GIN index build speed
Дата	2 декабря 2008 г. 08:37:59
Msg-id	49352C18.30702@sigaev.ru обсуждение исходный текст
Ответ на	GIN index build speed (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Список	pgsql-hackers

Дерево обсуждения

> The issue is that the GIN index build code accumulates the lexemes into 
> a binary tree, but there's nothing to keep the tree balanced. My test 
> case with almost monotonically increasing keys, happens to be a 
> worst-case scenario, and the tree degenerates into almost linked list 
> that every insertion has to grovel through.
Agree, but in most cases it works well. Because lexemes in documents aren't ordered.


> 
> The obvious fix is to use a balanced tree algorithm. I wrote a quick 
> patch to turn the tree into a splay tree. That fixed the degenerative 
> behavior, and the runtime of CREATE INDEX for the above test case fell 
> from 40s to 1.5s.
BTW, your patch helps to GIN's btree emulation. With typical scenarios of usage 
of btree emulation scalar column will be more or less ordered.

> 
> Magnus kindly gave me a dump of the full-text-search tables from 
> search.postgresql.org, for some real-world testing. Quick testing with 
> that suggests that the patch unfortunately makes the index build 5-10% 
> slower with that data set.
Do you see ways to  improve that?
> 
> We're in commitfest, not supposed to be submitting new features, so I'm 
> not going to pursue this further right now. Patch attached, however, 
> which seems to work fine.
Personally, I don't  object to improve that.
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: GIN index build speed