Re: Hostnames, IDNs, Punycode and Unicode Case Folding
From | Andy Colson |
---|---|
Subject | Re: Hostnames, IDNs, Punycode and Unicode Case Folding |
Date | |
Msg-id | 54A1DB0F.204@squeakycode.net |
In response to | Hostnames, IDNs, Punycode and Unicode Case Folding (Mike Cardwell <pgsql@lists.grepular.com>) |
Responses | Re: Hostnames, IDNs, Punycode and Unicode Case Folding |
List | pgsql-general |
On 12/29/2014 4:36 PM, Mike Cardwell wrote:
> I'd like to store hostnames in a postgres database and I want to fully support
> IDNs (Internationalised Domain Names)
>
> I want to be able to recover the original representation of the hostname, so I
> can't just encode it with punycode and then store the ascii result. For example,
> these two are the same hostnames thanks to unicode case folding [1]:
>
> tesst.ëxämplé.com
> teßt.ëxämplé.com
>
> They both encode in punycode to the same thing:
>
> xn--tesst.xmpl.com-cib7f2a
>
> Don't believe me, then try visiting any domain with two s's in, whilst replacing
> the s's with ß's. E.g:
>
> ericßon.com
> nißan.com
> americanexpreß.com
>
> So if I pull out "xn--tesst.xmpl.com-cib7f2a" from the database, I've no idea
> which of those two hostnames was the original representation.
>
> The trouble is, if I store the unicode representation of a hostname instead,
> then when I run queries with conditions like:
>
> WHERE hostname='nißan.com'
>

_IF_ Postgres had a punycode function, then you could use:

WHERE punycode(hostname) = punycode('nißan.com')

-Andy
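
To make the idea concrete, here is a minimal Python sketch of what such a normalising function might compute. It is not a Postgres feature: the name `punycode` simply mirrors the hypothetical function in the reply, and `same_host` is an illustrative helper. It assumes Python's standard-library `idna` codec, which implements the IDNA 2003 rules (under which ß folds to "ss"); IDNA 2008 treats ß differently, so the result depends on which standard you pick. The codec encodes each label separately, so its output has a different shape from the single `xn--` string quoted above.

```python
def punycode(hostname: str) -> str:
    """Return a normalised ASCII form of a hostname (IDNA 2003 rules).

    Hypothetical stand-in for the punycode() function imagined in the reply.
    """
    # The stdlib "idna" codec handles each dot-separated label on its own:
    # non-ASCII labels go through nameprep (case folding, NFKC) and are then
    # Punycode-encoded if they still contain non-ASCII characters.
    ascii_form = hostname.encode("idna").decode("ascii")
    # Labels that were already pure ASCII are passed through unchanged by the
    # codec, so lower-case the result to make ASCII comparison case-insensitive.
    return ascii_form.lower()


def same_host(a: str, b: str) -> bool:
    """Compare two hostnames by their normalised ASCII form."""
    return punycode(a) == punycode(b)


if __name__ == "__main__":
    print(punycode("nißan.com"))
    print(same_host("tesst.ëxämplé.com", "teßt.ëxämplé.com"))
```

With something like this, the original Unicode hostname can stay in the table untouched, and only the comparison goes through the normalised form, as in the `WHERE punycode(hostname) = punycode('nißan.com')` condition above.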