Re: Curious sorting puzzle
От | Tom Lane |
---|---|
Тема | Re: Curious sorting puzzle |
Дата | |
Msg-id | 17522.1149714973@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Curious sorting puzzle (Ivan Voras <ivoras@fer.hr>) |
Ответы |
Re: Curious sorting puzzle
|
Список | pgsql-performance |
Ivan Voras <ivoras@fer.hr> writes: > The situation is this: we're using a varchar column to store > alphanumeric codes which are by themselves 7-bit clean. But we are > operating under a locale which has its own special collation rules, and > is also utf-8 encoded. Recently we've discovered a serious "d'oh!"-type > bug which we tracked down to the fact that when we sort by this column > the collation respects locale sorting rules, which is messing up other > parts of the application. > The question is: what is the most efficient way to solve this problem > (the required operation is to sort data using binary "collation" - i.e. > compare byte by byte)? Since this field gets queried a lot it must have > an index. Some of the possible solutions we thought of are: replacing > the varchar type with numeric and do magical transcoding (bad, needs > changes thoughout the application) and inserting spaces after every > character (not as bad, but still requires modifying both the application > and the data). An ideal solution would be to have a > "not-locale-affected-varchar" field type :) If you're just storing ASCII then I think bytea might work for this. Do you need any actual text operations (like concatenation), or this just a store-and-retrieve field? If you need text ops too then probably the best answer is to make your own datatype. It's not that hard --- look at the citext datatype (on pgfoundry IIRC, or else gborg) for a closely related example. regards, tom lane
В списке pgsql-performance по дате отправления: