match accented chars with ASCII-normalised version
From | brian |
---|---|
Subject | match accented chars with ASCII-normalised version |
Date | |
Msg-id | 47996D7F.3070801@zijn-digital.com |
Responses |
Re: match accented chars with ASCII-normalised version
Re: match accented chars with ASCII-normalised version |
List | pgsql-general |
The client for a web application I'm working on wants certain URLs to contain the full names of members ("SEO-friendly" links). Scripts would search on, say, a member directory entry based on the name of the member, rather than the row ID. I can easily join first & last names with an underscore (and split on that later) and replace spaces with +, etc. But many of the names contain multibyte characters and so the URLs would become URL-encoded, e.g.:

Adelina España -> Adelina_Espa%C3%B1a

The client won't like this (and neither will I). I can create a conversion array to replace certain characters with 'normal' ones:

Adelina_Espana

However, I then run into the problem of trying to match 'Espana' to 'España'. Searching online, I found a few ideas (soundex, intuitive fuzzy something-or-other) but mostly they seem like overkill for this application.

The best I can come up with is to add a 'link_name' column to the table that holds the 'normalised' version of the name ('Adelina_Espana', or even 'adelina_espana'). The duplication bugs me a little but the table currently stands at a whopping ~3500 names, so I'm not too concerned.

My question is: well, does this look like the way to go, considering it's just a web app (and isn't likely to ever top 10000 names)? Or is there something clever (yet not overkill) that I'm missing?

If I do go this route, I'd add an insert/update trigger to call a function (PL/Perl, I'm looking at you) that handles the conversion to link_name.

brian
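For illustration only, the conversion to link_name described above might look roughly like the sketch below. It is written in Python rather than the PL/Perl mentioned in the post, and it swaps the hand-built conversion array for Unicode decomposition from the standard `unicodedata` module. The function name `link_name` and its two-argument shape are assumptions made for this example, not anything from the actual application.

```python
import unicodedata


def link_name(first: str, last: str) -> str:
    """Build a lower-case, ASCII-only link name such as 'adelina_espana'.

    Hypothetical helper: the name and signature are assumptions for
    illustration only.
    """
    # Join first and last name with an underscore, as described in the post.
    full = f"{first}_{last}"
    # NFKD decomposition splits an accented letter into its base letter
    # plus combining marks, e.g. 'ñ' becomes 'n' followed by U+0303.
    decomposed = unicodedata.normalize("NFKD", full)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Letters with no decomposition (for example 'ø' or 'ß') are still
    # non-ASCII at this point. This sketch simply drops them; a real
    # version would probably want an explicit mapping table instead.
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    # Replace spaces inside a name with '+', then lower-case the result.
    return ascii_only.replace(" ", "+").lower()


print(link_name("Adelina", "España"))
```

Storing the result in the link_name column means an incoming URL can be matched with a plain equality test on that column, so no fuzzy matching is needed at query time. Two different members can normalise to the same link name, so a uniqueness check on the column would likely be worth considering.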