Re: Character classes
From | Tom Lane |
---|---|
Subject | Re: Character classes |
Date | |
Msg-id | 24386.1558375597@sss.pgh.pa.us |
In response to | Character classes (PG Doc comments form <noreply@postgresql.org>) |
Responses | Re: Character classes |
List | pgsql-docs |
PG Doc comments form <noreply@postgresql.org> writes:
> On https://www.postgresql.org/docs/11/functions-matching.html paragraph
> 9.7.3.2. Bracket Expressions says "Standard character class names are:
> alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper,
> xdigit". The class "ascii" exists, but is not mentioned (probably a
> combination of some of the other classes). Are there any other classes?

Hm, fair question. I think the text means to say that these are the
character class names required by the POSIX regexp spec, which is
accurate. A look into our src/backend/regex/regc_locale.c will show
you that we also implement "ascii", and no others. That probably
ought to be documented.

> Do they work only for ASCII characters (e.g. '\u00A0' is not picked up
> by '[:blank:]')?

The POSIX ones are implemented by calling the C library, so it's
whatever the ctype.h and wctype.h functions think is appropriate for
your LC_CTYPE setting.

The 20-year-old reference in our text to ctype(3) seems rather
unhelpful today; in the first place, there's no such man page on
my Linux systems, and in the second place, wctype(3) is more important
if it exists, and in the third place what a reader actually wants to
know is that this is controlled by the LC_CTYPE server parameter.
It'd likely be better to dump the man-page reference altogether
and instead point readers to our "Locale Support" chapter.

			regards, tom lane
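
To illustrate the point that classification is delegated to the C library and therefore follows LC_CTYPE, here is a minimal standalone sketch. It is not PostgreSQL code and does not reproduce what regc_locale.c does; the helper names `is_blank_in_locale` and `report_nbsp` are hypothetical, and which locale names are installed (and how U+00A0 is classified in each) is an assumption that varies by platform.

```c
#include <locale.h>
#include <stdio.h>
#include <wctype.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL source).
 * Switches the process-wide LC_CTYPE to locale_name and asks the C
 * library whether wc is a "blank" character.
 * Returns 1 if blank, 0 if not, -1 if the locale could not be set.
 * Note: the LC_CTYPE setting is left changed on return.
 */
int is_blank_in_locale(const char *locale_name, wint_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return iswblank(wc) ? 1 : 0;
}

/*
 * Hypothetical helper: print how the C library classifies U+00A0
 * (no-break space) under the given locale. The answer is whatever the
 * platform's locale data says, so no particular result is assumed here.
 */
void report_nbsp(const char *locale_name)
{
    int r = is_blank_in_locale(locale_name, (wint_t) 0x00A0);

    if (r < 0)
        printf("%s: locale not available\n", locale_name);
    else
        printf("%s: iswblank(U+00A0) = %d\n", locale_name, r);
}
```

Calling `report_nbsp("C")` and then `report_nbsp` with a UTF-8 locale name installed on the system (for example `"en_US.UTF-8"`, if present) shows whether the two settings disagree on that platform. The ordinary ASCII space and tab are classified as blank in the "C" locale.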