BUG #18362: unaccent rules and Old Greek text
From | PG Bug reporting form |
---|---|
Subject | BUG #18362: unaccent rules and Old Greek text |
Date | |
Msg-id | 18362-be6d0cfe122b6354@postgresql.org |
Responses | Re: BUG #18362: unaccent rules and Old Greek text |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 18362
Logged by: Cees van Zeeland
Email address: cees.van.zeeland@freedom.nl
PostgreSQL version: 15.6
Operating system: Windows 11
Description:

I am using a Postgres Server 15.06-1 with UTF-8. I am struggling with the unaccent extension and "Old Greek" characters. To reproduce the behaviour I encountered, try this:

1. Create a table with one text field:

    CREATE TABLE IF NOT EXISTS public.test
    (
        entry text COLLATE pg_catalog."default" NOT NULL,
        CONSTRAINT test_pkey PRIMARY KEY (entry)
    );

2. Insert the following Greek words, which carry stress accents on the vowels, or import the CSV file with the same items:

    ἀνήρ (== man)
    πέντε (== five)
    γίγας (== giant)
    γράφω (== write)
    δύο (== two)
    ἐγώ (== Ι)
    θεός (== god)

3. Create the following view for searching:

    CREATE OR REPLACE VIEW public.test_view AS
    SELECT test.entry,
           COALESCE(
               array_to_string(
                   ts_lexize('unaccent'::regdictionary,
                             replace(test.entry, 'ς'::text, 'σ'::text)),
                   ''::text),
               replace(test.entry, 'ς'::text, 'σ'::text)) AS search_entry
    FROM test
    ORDER BY test.entry;

4. Check whether it works:

    SELECT entry, search_entry FROM public.test_view;

The result shows that not all diacritics are removed.

When I search in unaccent.rules, I see around line 530 characters that look the same but are in fact different, for example Greek Small Letter Epsilon with Tonos versus Greek Small Letter Epsilon with Oxia.

I found a discussion about this subject here:
https://ibiblio.org/bgreek/forum/viewtopic.php?t=4170

So there are reasons to keep the current unaccent.rules as it is, but there are other reasons to add a few lines to it, for example after line 955, covering the seven Greek vowels with Oxia.

Please add:

    ά α
    έ ε
    ή η
    ί ι
    ό ο
    ύ υ
    ώ ω

It would solve the problem and make searching through Old Greek texts a lot easier.

Thanks for your help,
Cees van Zeeland
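
The Tonos versus Oxia distinction described in the report can be checked outside the database. The following is a minimal Python sketch, not part of the original report. It assumes that the Oxia vowels are the odd code points U+1F71 through U+1F7D in the Greek Extended block, and that unaccent.rules accepts one "source, whitespace, target" pair per line. The names `OXIA_VOWELS`, `strip_marks` and `oxia_rules` are hypothetical helpers introduced only for illustration.

```python
import unicodedata

# Assumption: the seven vowels with Oxia are the odd code points
# U+1F71 .. U+1F7D of the Greek Extended block.
OXIA_VOWELS = [chr(cp) for cp in range(0x1F71, 0x1F7E, 2)]


def strip_marks(ch):
    """Return ch without combining marks, using canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def oxia_rules():
    """Build 'source<TAB>target' lines; the tab separator is an assumption."""
    return [f"{ch}\t{strip_marks(ch)}" for ch in OXIA_VOWELS]


if __name__ == "__main__":
    for ch in OXIA_VOWELS:
        # NFC maps each Oxia vowel to its look-alike in the basic Greek block.
        nfc = unicodedata.normalize("NFC", ch)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> NFC U+{ord(nfc):04X}")
    print("\n".join(oxia_rules()))
```

The loop shows that each Oxia vowel is a separate code point from its Tonos counterpart, even though Unicode treats the two as canonically equivalent. A rule file that lists only the Tonos forms would therefore not match text stored with the Oxia forms, which is consistent with the behaviour reported above.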