Re: EOL characters and multibyte encodings
From | Tom Lane |
---|---|
Subject | Re: EOL characters and multibyte encodings |
Date | |
Msg-id | 8064.1182465526@sss.pgh.pa.us |
In response to | EOL characters and multibyte encodings (Joe Conway <mail@joeconway.com>) |
Responses | Re: EOL characters and multibyte encodings |
List | pgsql-hackers |
Joe Conway <mail@joeconway.com> writes:
> I finally was able to get PL/R to compile and run on Windows recently.
> This has led to people using a Windows-based client (typically PgAdmin
> III) to create PL/R functions. Immediately I started to receive reports
> of failures that turned out to be due to the carriage return (\r) used
> in standard Win32 EOLs (\r\n). It seems that the R parser only accepts
> newlines (\n), even on Win32 (confirmed on r-devel list with a core
> developer).

> My first thought on fixing this issue was to simply replace all
> instances of '\r' in pg_proc.prosrc with '\n' prior to sending it to the
> R parser. As far as I know, any instances of '\r' embedded in a
> syntactically valid R statement must be escaped (i.e. literally the
> characters "\" and "r"), so that should not be a problem. But I am
> concerned about how this potentially plays against multibyte characters.
> Is it safe to do this, or do I need to use a mb-aware replace algorithm?

It's safe, because you'll be dealing with prosrc inside the backend,
therefore using a backend-legal encoding, and those don't have any ASCII
aliasing problems (all bytes of an MB character must have high bit set).

However, I dislike doing it exactly that way because line numbers in the
R script will all get doubled. Unless R never reports errors in terms of
line numbers, you'd be better off to either delete the \r characters or
replace them with spaces.

regards, tom lane
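The two alternatives suggested above (blank out or delete each \r) could be sketched in C roughly as follows. This is only an illustration, not code from PL/R or the backend: the function names blank_carriage_returns and strip_carriage_returns are hypothetical, and the sketch assumes the function source is available as a writable, NUL-terminated buffer in a backend-legal encoding, so that a byte equal to '\r' can never be part of a multibyte character.

```c
/*
 * Hypothetical helpers for illustration only (not taken from PL/R).
 * Both scan byte by byte; that is only safe under the assumption that the
 * buffer is in a backend-legal encoding, where every byte of a multibyte
 * character has its high bit set and so cannot equal '\r' (0x0D).
 */

/* Overwrite every carriage return with a space; length is unchanged. */
void
blank_carriage_returns(char *src)
{
    char *p;

    for (p = src; *p != '\0'; p++)
    {
        if (*p == '\r')
            *p = ' ';
    }
}

/* Delete every carriage return, compacting the buffer in place. */
void
strip_carriage_returns(char *src)
{
    char *in = src;
    char *out = src;

    while (*in != '\0')
    {
        if (*in != '\r')
            *out++ = *in;   /* keep every byte that is not a CR */
        in++;
    }
    *out = '\0';            /* re-terminate the shortened string */
}
```

Either variant leaves exactly one \n per source line, so any line numbers reported by the R parser still match the lines the user typed. Both should be applied to a private copy of the source text, not to the stored prosrc value itself.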