Re: AW: Re: [BUGS] Turkish locale bug
From | Tom Lane |
---|---|
Subject | Re: AW: Re: [BUGS] Turkish locale bug |
Date | |
Msg-id | 13047.982689823@sss.pgh.pa.us |
In response to | AW: Re: [BUGS] Turkish locale bug (Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>) |
List | pgsql-hackers |
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
>> Now I'm confused. Are you saying that we *should* treat identifier case
>> under ASCII rules only? That seems like a step backwards to me, but
>> then I don't use any non-US locale myself...

> I think we need to treat anything that is not quoted as US_ASCII,
> iirc this is how Informix behaves. Users wanting locale aware identifiers
> would need to double quote those, thus avoiding non ASCII case conversions
> altogether.

I dug into the SQL99 spec, and I find it appears to have different rules
for identifier folding than for keyword recognition. Section 5.2 syntax
rules 1-12 make it perfectly clear that they have an expansive idea of
what characters are allowed in identifiers (most of Unicode, it looks
like ;-)). They also define the case-normalized form of an identifier
in terms of Unicode case translations (rule 21). But they then say

    28) For the purposes of identifying <key word>s, any <simple Latin
        lower case letter> contained in a candidate <key word> shall
        be effectively treated as the corresponding <simple Latin
        upper case letter>.

It appears to me that to implement the SQL99 rules correctly in a
non-C locale, we need to do casefolding twice. First, casefold only
'A'..'Z' and test to see if we have a keyword. If not, do the
casefolding again using isupper/tolower to produce the normalized form
of the identifier.

This would solve Sezai's problem without adding a special case for
Turkish, and it doesn't seem unreasonably slow. Anyone object to it?

regards, tom lane
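
The two-pass folding proposed above could look roughly like the following. This is only a minimal sketch, not the actual scanner code: the keyword table, is_keyword() and fold_identifier() are hypothetical names made up for illustration, and it assumes a single-byte encoding and folding to lower case.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table; a real scanner would use its own lookup. */
static const char *const keywords[] = {"insert", "select", "table"};

static bool
is_keyword(const char *word)
{
    for (size_t i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
    {
        if (strcmp(word, keywords[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Fold an unquoted token from src into dst (capacity dstsize bytes,
 * including the terminating NUL; longer input is truncated).
 * Returns true if the token is a keyword, false if it is an identifier.
 */
static bool
fold_identifier(const char *src, char *dst, size_t dstsize)
{
    size_t len = strlen(src);

    assert(dstsize > 0);
    if (len >= dstsize)
        len = dstsize - 1;

    /* Pass 1: fold only 'A'..'Z', regardless of the current locale. */
    for (size_t i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) src[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        dst[i] = (char) ch;
    }
    dst[len] = '\0';

    if (is_keyword(dst))
        return true;

    /* Pass 2: not a keyword, so redo the folding with locale-aware rules. */
    for (size_t i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) src[i];

        if (isupper(ch))
            ch = (unsigned char) tolower(ch);
        dst[i] = (char) ch;
    }
    return false;
}
```

Because the keyword test only ever sees the ASCII-folded form, a token such as INSERT should still match even in a locale where tolower('I') is not 'i', which is presumably what goes wrong in the Turkish case. Identifiers that are not keywords still get the locale's own folding in the second pass.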