Re: BUG #18362: unaccent rules and Old Greek text
From | Tom Lane |
---|---|
Subject | Re: BUG #18362: unaccent rules and Old Greek text |
Date | |
Msg-id | 1667235.1708905577@sss.pgh.pa.us |
In response to | Re: BUG #18362: unaccent rules and Old Greek text (Michael Paquier <michael@paquier.xyz>) |
List | pgsql-bugs |
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, Feb 26, 2024 at 12:15:57PM +1300, Thomas Munro wrote:
>> That has a normal looking sequence that we can understand (α + an
>> accent).  If I tell the script to follow such "simple" redirections, I
>> get over a thousand new mappings, including those.  See attached.
>> There is probably more correct terminology that I'm using here...

> Ah, you've beaten me to it.  Yes, that's pretty much the impression I
> was getting while looking at the set of characters in Unicode.txt.  I
> am not entirely sure if what you are doing is the best way to do it,
> but the set of characters generated in unaccent.rules makes sense
> here.

I am surprised to see that many, TBH.  There are only about 1650 lines
in our standard unaccent.rules file today.  Are we concerned about
adding so many more?  I don't think the trie lookup logic would be
slowed any, but the time to load the rules file might take a hit.

			regards, tom lane
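
To illustrate the point about lookup cost versus load cost, here is a rough Python sketch. It is not the code in contrib/unaccent; the names build_trie, lookup and unaccent, the nested-dict trie, and the synthetic rules used for timing are all assumptions made for the example. It only shows why a byte-keyed trie lookup depends on the length of the matched key rather than on the number of rules, while building the trie is linear in the number of rules.

```python
import time


def build_trie(rules):
    """Insert each (source, replacement) pair into a nested-dict trie.

    Cost is proportional to the total number of bytes in all rule keys,
    so it grows with the size of the rules file.
    """
    root = {}
    for src, repl in rules:
        node = root
        for byte in src.encode("utf-8"):
            node = node.setdefault(byte, {})
        node[None] = repl  # None marks the end of a complete rule key
    return root


def lookup(trie, data, pos=0):
    """Return (replacement, matched_length) for the longest rule key
    starting at data[pos], or (None, 0) if no rule matches.

    The loop walks at most one trie level per input byte, independent
    of how many rules were loaded.
    """
    node = trie
    best = (None, 0)
    for i in range(pos, len(data)):
        node = node.get(data[i])
        if node is None:
            break
        if None in node:
            best = (node[None], i - pos + 1)
    return best


def unaccent(trie, text):
    """Apply the rules to text, copying unmatched bytes through unchanged."""
    data = text.encode("utf-8")
    out = []
    pos = 0
    while pos < len(data):
        repl, matched = lookup(trie, data, pos)
        if matched:
            out.append(repl.encode("utf-8"))
            pos += matched
        else:
            out.append(data[pos:pos + 1])
            pos += 1
    return b"".join(out).decode("utf-8")


if __name__ == "__main__":
    # Synthetic rules only, to compare build time at two sizes;
    # these are NOT the real contents of unaccent.rules.
    for count in (1650, 2650):
        rules = [(chr(0x4E00 + i), "x") for i in range(count)]
        start = time.perf_counter()
        build_trie(rules)
        elapsed = time.perf_counter() - start
        print(f"{count} rules: built in {elapsed * 1e3:.3f} ms")
```

For example, with the three rules U+03AC -> U+03B1, U+1F70 -> U+03B1 and U+00E9 -> "e" loaded, unaccent(trie, "caf\u00e9 \u1f70\u03ac") yields "cafe" followed by a space and two plain alphas. Timings from the synthetic comparison say nothing definite about the real loader, which also has to read and parse the file.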