Re: benchmarking Flex practices

From Tom Lane
Subject Re: benchmarking Flex practices
Msg-id 30156.1574782349@sss.pgh.pa.us
In reply to Re: benchmarking Flex practices  (John Naylor <john.naylor@2ndquadrant.com>)
Responses Re: benchmarking Flex practices  (John Naylor <john.naylor@2ndquadrant.com>)
List pgsql-hackers
John Naylor <john.naylor@2ndquadrant.com> writes:
> It seems something is not quite right in v9 with the error position reporting:

>  SELECT U&'wrong: +0061' UESCAPE '+';
>  ERROR:  invalid Unicode escape character at or near "'+'"
>  LINE 1: SELECT U&'wrong: +0061' UESCAPE '+';
> -                                        ^
> +                               ^

> The caret is not pointing to the third token, or the second for that
> matter.

Interesting.  For me it points at the third token with or without
your fix ... some flex version discrepancy maybe?  Anyway, I have
no objection to your fix; it's probably cleaner than what I had.

>> * I did not do more with ecpg than get it to compile, using the
>> same hacks as in your v7.  It still fails its regression tests,
>> but now the reason is that what we've done in parser/parser.c
>> needs to be transposed into the identical functionality in
>> ecpg/preproc/parser.c.  Or at least some kind of functionality
>> there.  A problem with this approach is that it presumes we can
>> reduce a UIDENT sequence to a plain IDENT, but to do so we need
>> assumptions about the target encoding, and I'm not sure that
>> ecpg should make any such assumptions.  Maybe ecpg should just
>> reject all cases that produce non-ASCII identifiers?  (Probably
>> it could be made to do something smarter with more work, but
>> it's not clear to me that it's worth the trouble.)

> Hmm, I thought we only allowed Unicode escapes in the first place if
> the server encoding was UTF-8. Or did you mean something else?

Well, yeah, but the problem here is that ecpg would have to assume
that the client encoding that its output program will be executed
with is UTF-8.  That seems pretty action-at-a-distance-y.
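
To make the encoding dependency concrete, here is a minimal sketch of
de-escaping the body of a U&'...' literal.  It is not the code in
parser/parser.c or in the patch; the names hex_val, escape_char_ok,
utf8_encode and unicode_deescape_utf8 are hypothetical, and surrogate
pairs are simply rejected to keep it short.  Turning the hex digits
into a code point needs no encoding knowledge, but the utf8_encode
call is the step that hard-wires UTF-8 as the target encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* UESCAPE may not be a hex digit, '+', a quote, or whitespace. */
static int escape_char_ok(char esc)
{
    return hex_val(esc) < 0 && esc != '+' && esc != '\'' && esc != '"' &&
           esc != '\0' && !isspace((unsigned char) esc);
}

/* Encode cp as UTF-8 into buf (4 bytes); returns length, 0 if invalid. */
static size_t utf8_encode(unsigned long cp, char *buf)
{
    if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;               /* NUL, out of range, or lone surrogate */
    if (cp < 0x80)
    {
        buf[0] = (char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        buf[0] = (char) (0xC0 | (cp >> 6));
        buf[1] = (char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        buf[0] = (char) (0xE0 | (cp >> 12));
        buf[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (char) (0x80 | (cp & 0x3F));
        return 3;
    }
    buf[0] = (char) (0xF0 | (cp >> 18));
    buf[1] = (char) (0x80 | ((cp >> 12) & 0x3F));
    buf[2] = (char) (0x80 | ((cp >> 6) & 0x3F));
    buf[3] = (char) (0x80 | (cp & 0x3F));
    return 4;
}

/*
 * De-escape "in" using escape character "esc", writing a NUL-terminated
 * UTF-8 string to "out".  Handles esc esc, esc XXXX and esc +XXXXXX.
 * Returns 0 on success, -1 on any error.
 */
static int unicode_deescape_utf8(const char *in, char esc,
                                 char *out, size_t outsz)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0 || !escape_char_ok(esc))
        return -1;

    while (*in != '\0')
    {
        unsigned long cp = 0;
        char buf[4];
        size_t n;
        int ndigits, i;

        if (*in != esc || in[1] == esc)
        {
            /* ordinary byte, or a doubled escape meaning one literal esc */
            if (o + 1 >= outsz)
                return -1;
            out[o++] = *in;
            in += (*in == esc) ? 2 : 1;
            continue;
        }
        ndigits = (in[1] == '+') ? 6 : 4;
        in += (in[1] == '+') ? 2 : 1;
        for (i = 0; i < ndigits; i++)
        {
            int v = hex_val(in[i]);     /* also stops at a premature NUL */

            if (v < 0)
                return -1;
            cp = (cp << 4) | (unsigned long) v;
        }
        in += ndigits;
        n = utf8_encode(cp, buf);       /* the target-encoding assumption */
        if (n == 0 || o + n >= outsz)
            return -1;
        memcpy(out + o, buf, n);
        o += n;
    }
    out[o] = '\0';
    return 0;
}
```

With this sketch, the body d!0061t!+000061 and escape character '!'
comes out as "data", while '+' is rejected as an escape character,
matching the error in the example upthread.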

I haven't looked closely at what ecpg does with the processed
identifiers.  If it just spits them out as-is, a possible solution
is to not do anything about de-escaping, but pass the sequence
U&"..." (plus UESCAPE ... if any), just like that, on to the grammar
as the value of the IDENT token.

BTW, in the back of my mind here is Chapman's point that it'd be
a large step forward in usability if we allowed Unicode escapes
when the backend encoding is *not* UTF-8.  I think I see how to
get there once this patch is done, so I definitely would not like
to introduce some comparable restriction in ecpg.

            regards, tom lane


