Lexing with different charsets

From Dennis Bjorklund
Subject Lexing with different charsets
Date
Msg-id Pine.LNX.4.44.0404131843530.4551-100000@zigo.dhs.org
Responses Re: Lexing with different charsets  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Lexing with different charsets  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Re: Lexing with different charsets  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers

I've spent some more time reading specs today. Together with Peter E's
explanation (thanks!) I think I've got a fairly good understanding of the
parts talking about locales now.

My next question is about lexing. The spec says that one can use strings 
of different charsets in the queries, like:
 ... WHERE field1 = _latin1'FooBar' and field2 = _utf8'Åäö'

I can see that the lexer either needs to be taught about all the
different charsets or this is not going to work very well.

What if one wants to include a string in UTF-16 in the query? The lexer
cannot handle that without understanding UTF-16. The query itself can also
be in different charsets. If it's in UTF-8, for example, then we cannot
embed latin1 strings and still have a valid UTF-8 query. With the above we
cannot think of the query as being in a single charset anymore. That's
strange but okay I guess.
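
To make the two problems above concrete, here is a minimal Python sketch.
It is only an illustration, not PostgreSQL's lexer: the helper names
(build_query, is_valid_utf8, naive_literal) are hypothetical, and the
_utf16 introducer is assumed by analogy with _latin1 and _utf8. It shows
that a UTF-8 query with an embedded latin1 literal no longer validates as
UTF-8, and that a lexer scanning bytes for the closing quote cuts a UTF-16
literal short.

```python
def build_query(parts):
    """Concatenate (text, encoding) fragments into one raw byte string."""
    return b"".join(text.encode(enc) for text, enc in parts)


def is_valid_utf8(raw):
    """Return True if the whole byte string decodes as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def naive_literal(raw, start):
    """Byte-oriented scan, as an ASCII-only lexer might do it: return the
    bytes between the quote at `start` and the next 0x27 byte."""
    assert raw[start:start + 1] == b"'"
    end = raw.index(b"'", start + 1)
    return raw[start + 1:end]


# Case 1: a UTF-8 query carrying a latin1-encoded literal.
mixed = build_query([
    ("WHERE field1 = _latin1'", "utf-8"),
    ("Åäö", "latin-1"),          # bytes C5 E4 F6, not valid UTF-8
    ("' AND field2 = _utf8'", "utf-8"),
    ("Åäö", "utf-8"),
    ("'", "utf-8"),
])
print(is_valid_utf8(mixed))      # False

# Case 2: a UTF-16 literal. U+0127 (ħ) is encoded as the bytes 27 01 in
# UTF-16LE, so its first byte looks like a closing quote to a byte scanner.
utf16 = build_query([
    ("_utf16'", "utf-8"),        # hypothetical introducer
    ("aħb", "utf-16-le"),
    ("'", "utf-8"),
])
lit = naive_literal(utf16, utf16.index(b"'"))
print(lit)                                # b'a\x00'
print(lit == "aħb".encode("utf-16-le"))   # False: literal was truncated
```

A lexer that knows the charset named by the introducer could decode the
literal before looking for the closing quote, which is the "teach the
lexer about all the charsets" option mentioned above.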

The new wire protocol allows us to send data separately from the query,
which is nice, but the standard talks about strings as above, so it's not
a solution to the problem.

Maybe I should have addressed this to Peter directly :-)

-- 
/Dennis Björklund


