Re: Duplicate Values or Not?!
From | Martijn van Oosterhout |
---|---|
Subject | Re: Duplicate Values or Not?! |
Date | |
Msg-id | 20050917171348.GA11697@svana.org |
In reply to | Re: Duplicate Values or Not?! (Greg Stark <gsstark@mit.edu>) |
Responses | Re: Duplicate Values or Not?! |
List | pgsql-general |
On Sat, Sep 17, 2005 at 11:50:44AM -0400, Greg Stark wrote:
> Hm. Some experimentation shows that at least on glibc's locale definitions the
> strings that I thought compared equal don't actually compare equal.
> Capitalization, punctuation, white space, while they're basically ignored in
> general in non-C locales do seem to compare non-equal when they're the only
> differentiating factor.
>
> Is this guaranteed by any spec? Or is counting on this behaviour unsafe?

I don't know if it's guaranteed by spec, but it certainly seems silly
for strings to compare equal when they're not. Just because a locale
sorts ignoring case doesn't mean that "sun" and "Sun" are the same. The
only really sensible rule is that strcoll should return 0 only if strcmp
would also return zero...

If you actually use strxfrm on glibc you'll see the result comes out
approximately twice as long. The first n bytes are sort-of case-folded
versions of the original characters; the second n bytes are some kind
of class identification.

I think that all the spec guarantees is that strcoll(a,b) has the same
sign as strcmp(strxfrm(a),strxfrm(b)). If strcoll is returning zero for
two non-identical strings, they must strxfrm to the same thing, so
checking for that may be a solution.

Anyway, long term the plan is to move to a cross-platform locale
library, so hopefully broken locale libraries will be a thing of the
past...

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
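The strcoll/strxfrm relationship discussed in the message can be checked with a small C sketch. This is only an illustration under stated assumptions: the helper names `xfrm_dup`, `collation_agrees` and `compare_with_tiebreak` are hypothetical and do not come from the thread or from PostgreSQL, and the strcmp tie-break is just one possible way to apply the "return 0 only if strcmp would also return zero" rule.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd strxfrm() image of s,
 * or NULL if allocation fails. The caller frees the result. */
char *xfrm_dup(const char *s)
{
    /* With n == 0 the destination may be NULL and strxfrm
     * only reports how many bytes the transformed string needs. */
    size_t need = strxfrm(NULL, s, 0) + 1;
    char *buf = malloc(need);
    if (buf != NULL)
        strxfrm(buf, s, need);
    return buf;
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical check: 1 if strcoll(a, b) and strcmp() on the
 * strxfrm() images agree in sign, 0 if they disagree,
 * -1 if memory could not be allocated. */
int collation_agrees(const char *a, const char *b)
{
    char *xa = xfrm_dup(a);
    char *xb = xfrm_dup(b);
    int result = -1;

    if (xa != NULL && xb != NULL)
        result = (sign(strcoll(a, b)) == sign(strcmp(xa, xb)));
    free(xa);
    free(xb);
    return result;
}

/* Assumed workaround, not taken from the thread: order by the
 * locale first, and break strcoll() ties with strcmp() so that
 * only byte-identical strings ever compare equal. */
int compare_with_tiebreak(const char *a, const char *b)
{
    int c = strcoll(a, b);
    return c != 0 ? c : strcmp(a, b);
}
```

A caller would normally run `setlocale(LC_COLLATE, "")` first so that the environment's locale is in effect; without that call the program stays in the "C" locale, where strcoll behaves like strcmp. Comparing `strlen` of the `xfrm_dup` result with the length of the input is one way to observe the roughly doubled size mentioned for glibc.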