Re: [BUGS] BUG #14628: regex description in online documentation misleadingly/wrong

From David G. Johnston
Subject Re: [BUGS] BUG #14628: regex description in online documentation misleadingly/wrong
Date
Msg-id CAKFQuwawUKoqCMTj1AUcr0tfQdzOekL5VwTXT5M9zHNfN-RWZQ@mail.gmail.com
In response to [BUGS] BUG #14628: regex description in online documentation misleadingly/wrong  (t.glaser@tarent.de)
Responses Re: [BUGS] BUG #14628: regex description in online documentation misleadingly/wrong  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Thu, Apr 20, 2017 at 8:25 AM, <t.glaser@tarent.de> wrote:
The following bug has been logged on the website:

Bug reference:      14628
Logged by:          Thorsten Glaser
Email address:      t.glaser@tarent.de
PostgreSQL version: 9.6.1
Operating system:   GNU/Linux
Description:

https://www.postgresql.org/docs/9.6/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
clearly says that ~ matches a POSIX regular expression.

This is only somewhat true: this does match:


Based on what you wrote below I'd maybe (though leaning toward not) modify the chapter title to "POSIX (ARE) Regular Expressions"

I would then likely add two more sentences before Table 9-14 (before the existing intro sentence).

POSIX regular expressions come in multiple flavors, of which PostgreSQL uses ARE by default.  Further information on these flavors is presented in the first subsection, "Regular Expression Details", below.  What follows is an overview of the general mechanics involved with any regular expression.


The cause is likely this statement, burrowed way down in another chapter:
“Note: PostgreSQL always initially presumes that a regular expression
follows the ARE rules.”


While this could perhaps be improved, the above characterization seems overblown.  9.7.3.1 is a sub-section of 9.7.3, so "[buried] way down" isn't accurate.  We chose to provide the high-level conceptual overview of regular expressions first and then delve into ARE/BRE/ERE; that ordering has caused few or no complaints from the typical reader, for whom the defaults are adequate and who just wants to get things working in the simple case.

 
And indeed, it’s an ARE!

tarent=> SELECT 'a\b' ~ '(?e)^[a\b]*$';
 ?column?
----------
 t
(1 row)


I find this extremely misleading (it also does not state whether it matches
BRE or ERE by default, just “POSIX re”),

You missed the big bubble note in 9.7.3.1: "PostgreSQL always initially presumes that a regular expression follows the ARE rules".
 
especially as it’s extremely
important to know precisely what RE syntax you’re targeting when escaping a
user-provided string into part of a RE (you have to precisely know where to
escape and where to not escape, for example),

I'd say that is advanced usage, and as you were able to find the needed documentation in 9.7.3.1, I'm not sure there is anything to fix based upon this.
which is why I personally
always use POSIX standard RE (normally BRE).
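
For the escaping concern raised above, a minimal sketch of two options under the default ARE rules. The `***=` director and the `(?q)` embedded option are described in the "Regular Expression Metasyntax" part of the manual; the character list in the last query is an assumption for illustration and is not taken from the documentation. All of this assumes standard_conforming_strings is on (the default), so backslashes in the string literals are literal.

```sql
-- Whole pattern treated as a literal string: the "***=" director.
-- The "." is an ordinary character here, so only the first query
-- is expected to match.
SELECT 'a.b' ~ '***=a.b';
SELECT 'axb' ~ '***=a.b';

-- The embedded option (?q) has the same effect on the rest of the pattern.
SELECT 'axb' ~ '(?q)a.b';

-- Embedding a user-provided string inside a larger ARE: put a backslash
-- in front of every character that is special in an ARE.
-- The bracket expression below is an assumed list, not an official one.
SELECT regexp_replace('a.b', '([!$()*+.:<=>?[\\\]^{|}-])', '\\\1', 'g');
```

The literal forms only help when the whole pattern is user input; the `regexp_replace` approach is for the case where the user string is one piece of a larger expression.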

So basically you feel it's necessary for us to redundantly emphasize the fact that we default to ARE because it's different from your default choice and, as you imply but do not support, from the choice of the majority of other regular expression implementations.  If one wants to understand the regular expression implementation, one reads 9.7.3; in all other places we can just call them regular expressions.  Now, as I note below, if you have specific areas that you think need to be fixed, please point them out.


Please indicate in *all* places in the documentation dealing with regular
expressions that it’s about ARE and link ARE to the section in the manual
explaining it -
https://www.postgresql.org/docs/9.6/static/functions-matching.html#POSIX-SYNTAX-DETAILS
- in all of those places. Also, make clear at the beginning of that section
how to force standard POSIX RE (i.e. BRE and ERE).
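
As a rough illustration of forcing a flavor, the embedded options from the "Regular Expression Metasyntax" part of the manual can be placed at the very start of a pattern. This sketch reuses the query from this report and assumes standard_conforming_strings is on (the default), so each string literal contains a real backslash. The comments describe the documented rules; only the (?e) result is output actually shown in this thread.

```sql
-- Default: the pattern is parsed as an ARE. Inside a bracket expression,
-- \b is then an escape (backspace), not a literal backslash plus "b".
SELECT 'a\b' ~ '^[a\b]*$';

-- (?e): the rest of the pattern is parsed as an ERE, where a backslash
-- inside brackets is an ordinary character. This is the query from the
-- report, which returned t.
SELECT 'a\b' ~ '(?e)^[a\b]*$';

-- (?b): the rest of the pattern is parsed as a BRE.
SELECT 'a\b' ~ '(?b)^[a\b]*$';

-- "***:" states the default explicitly: the rest is an ARE.
SELECT 'a\b' ~ '***:^[a\b]*$';
```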

You seem to have a very firm grasp of the topic, so you might consider offering some concrete suggestions and/or a patch.  I've not seen an actual factual omission or error in all of this, and while I firmly believe that documentation can always be improved, and that the Tcl implementation that we use has its quirks, I don't foresee the requested surgery happening from scratch based upon this report.  I've suggested a fairly easy clarification at the top of the chapter (9.7.3) to at least bring immediate awareness of the flavor issue.  Does that work for you?

David J.
