WIP: shared ispell dictionary
From | Pavel Stehule |
---|---|
Subject | WIP: shared ispell dictionary |
Date | |
Msg-id | 162867791003180333s1933e5b7g9208dd9a2bb681c6@mail.gmail.com |
Replies |
Re: WIP: shared ispell dictionary
Re: WIP: shared ispell dictionary |
List | pgsql-hackers |
Hello,

the attached patch adds the possibility of sharing an ispell dictionary between processes. The motivation is the slowness of the first tsearch query and the amount of memory allocated per process.

When I tested loading the ispell dictionary for the Czech language, it took about 500 ms and 48 MB. With a simple allocator it uses only 25 MB. If we remove some checks and the tolower string transformation from the loading stage, loading needs only 200 ms, but with a broken dict or affix file it can then produce wrong results.

This patch significantly reduces the load on servers that use ispell dictionaries.

I know that Tom worries about the use of shared memory. I think the worry is unnecessary here: after loading, the dictionary data are only read, never modified.

A second idea: this dictionary template could be distributed as a separate project (it needs a few changes in core, plus the simple allocator).

Usage:

a) set shared_data = 26MB (postgresql.conf)
b) restart
c) register the dictionary with the option "share=yes"

CREATE TEXT SEARCH DICTIONARY cspell
   (template=ispell, dictfile = czech, afffile=czech, stopwords=czech, share = yes);

[pavel@nemesis src]$ psql-dev3 postgres
Timing is on.
psql-dev3 (9.0devel)
Type "help" for help.

postgres=# select * from ts_debug('cs','Příliš žluťoučký kůň se napil žluté vody');
   alias   |    description    |   token   |  dictionaries   | dictionary |   lexemes
-----------+-------------------+-----------+-----------------+------------+-------------
 word      | Word, all letters | Příliš    | {cspell,simple} | cspell     | {příliš}
 blank     | Space symbols     |           | {}              |            |
 word      | Word, all letters | žluťoučký | {cspell,simple} | cspell     | {žluťoučký}
 blank     | Space symbols     |           | {}              |            |
 word      | Word, all letters | kůň       | {cspell,simple} | cspell     | {kůň}
 blank     | Space symbols     |           | {}              |            |
 asciiword | Word, all ASCII   | se        | {cspell,simple} | cspell     | {}
 blank     | Space symbols     |           | {}              |            |
 asciiword | Word, all ASCII   | napil     | {cspell,simple} | cspell     | {napít}
 blank     | Space symbols     |           | {}              |            |
 word      | Word, all letters | žluté     | {cspell,simple} | cspell     | {žlutý}
 blank     | Space symbols     |           | {}              |            |
 asciiword | Word, all ASCII   | vody      | {cspell,simple} | cspell     | {voda}
(13 rows)

Time: 8,178 ms    <<-- without the patch: 500 ms

Limits and ToDo:

a) it supports only simple regular expressions
b) it doesn't solve cache reset and shared memory deallocation

Regards
Pavel Stehule
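
For illustration only, here is a minimal sketch of what a "simple allocator" over a fixed, preallocated region might look like. It is not code from the attached patch: the names SimpleArena, arena_init, arena_alloc and arena_strdup are hypothetical, and an ordinary buffer stands in for a real shared memory segment. The sketch only hands out memory and never frees it, which fits data that are written once at load time and only read afterwards; it assumes the region's base address is itself suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alignment for every allocation; must be a power of two. */
#define ARENA_ALIGN 8

/* Hypothetical bump allocator state over one fixed region. */
typedef struct SimpleArena
{
    unsigned char *base;    /* start of the preallocated region */
    size_t         size;    /* total bytes in the region */
    size_t         used;    /* bytes consumed so far */
} SimpleArena;

void arena_init(SimpleArena *a, void *region, size_t size)
{
    assert((ARENA_ALIGN & (ARENA_ALIGN - 1)) == 0);
    a->base = region;
    a->size = size;
    a->used = 0;
}

/* Hand out nbytes from the region; returns NULL when it does not fit. */
void *arena_alloc(SimpleArena *a, size_t nbytes)
{
    /* Round the current offset up to the next aligned offset. */
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t) (ARENA_ALIGN - 1);

    if (start > a->size || nbytes > a->size - start)
        return NULL;

    a->used = start + nbytes;
    return a->base + start;
}

/* Copy a NUL-terminated string into the region. */
char *arena_strdup(SimpleArena *a, const char *s)
{
    size_t len = strlen(s) + 1;
    char  *copy = arena_alloc(a, len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

Because such an allocator keeps no per-chunk header and no free list, many small allocations (words, affixes) cost little more than their payload plus alignment padding. The trade-off is that nothing can be released individually, so deallocation has to happen for the whole region at once.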