Thread: regexp_replace 'g' flag

regexp_replace 'g' flag

From: Bruce Momjian
Date:
Why doesn't the 'g' flag appear in this table?

    http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

I see text here that says:

    http://www.postgresql.org/docs/9.2/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
    Other supported flags are described in Table 9-19.

Seems 'g' should appear there too.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: regexp_replace 'g' flag

From: Bruce Momjian
Date:
On Thu, Sep  5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:
> Why doesn't the 'g' flag appear in this table?
>
>     http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE
>
> I see text here that says:
>
>     http://www.postgresql.org/docs/9.2/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
>     Other supported flags are described in Table 9-19.
>
> Seems 'g' should appear there too.

Is it because the table has generic pattern modifiers and 'g' is
relevant only for regexp_replace?  I assume so.



Re: regexp_replace 'g' flag

From: Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Sep  5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:
>> Why doesn't the 'g' flag appear in this table?
>> http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

> Is it because the table has generic pattern modifiers and 'g' is
> relevant only for regexp_replace?  I assume so.

The table is specifically about ARE options, and 'g' is *not* one of
those.  Adding 'g' to the table would be wrong.

It does seem to me to be a bit confusing that the text description of
substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
the table.  You could make a case for phrasing along the line of
"substring() supports the 'g' flag that specifies ..., as well as all the
flags listed in Table 9-19".  On the other hand, 'i' is the most useful of
the flags listed in the table by several country miles, and it doesn't
seem quite right to make people go off and consult the table to find out
about it.

Not sure whether there's any real improvement that can be made here,
but I suppose it'd be nice if the text descriptions of substring() and
regexp_replace() handled this matter in the same way ...

            regards, tom lane


Re: regexp_replace 'g' flag

From: David Johnston
Date:
Sorry if you get this twice but I use Nabble and didn't subscribe to the list
so my originals got put into the verification queue.  I've subscribed now
and am re-posting hoping it will go through clean.

See my self-quote comment and my direct comment at the end.



David Johnston wrote
>
> Tom Lane-2 wrote
>> Bruce Momjian <bruce@> writes:
>>> On Thu, Sep  5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:
>>>> Why doesn't the 'g' flag appear in this table?
>>>> http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE
>>
>>> Is it because the table has generic pattern modifiers and 'g' is
>>> relevant only for regexp_replace?  I assume so.
>>
>> The table is specifically about ARE options, and 'g' is *not* one of
>> those.  Adding 'g' to the table would be wrong.
>>
>> It does seem to me to be a bit confusing that the text description of
>> substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
>> the table.  You could make a case for phrasing along the line of
>> "substring() supports the 'g' flag that specifies ..., as well as all the
>> flags listed in Table 9-19".  On the other hand, 'i' is the most useful of
>> the flags listed in the table by several country miles, and it doesn't
>> seem quite right to make people go off and consult the table to find out
>> about it.
>>
>> Not sure whether there's any real improvement that can be made here,
>> but I suppose it'd be nice if the text descriptions of substring() and
>> regexp_replace() handled this matter in the same way ...
>>
>>             regards, tom lane
>
> substring(text from pattern) returns a scalar text which corresponds to
> either the entire first match found or the sub-portion of the first match
> corresponding to the first (and only the first, if there is more than one)
> matching group in the expression.  It cannot act globally and so cannot
> accept/use a "g" flag even if there were some way to provide it.
>
> regexp_replace indeed handles a "g" flag because while it too returns a
> scalar text it returns the entire source string post-modification as
> opposed to only a subset thereof and the modification itself makes use of
> the "g" flag to decide whether to replace one or ALL occurrences.
>
> I cannot find where "the text description of substring() mentions 'i' and
> 'g' explicitly"; could you maybe copy-paste a direct quote and also note
> the exact section of the page you are looking at?
>
> David J.
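
The contrast drawn in the quoted text can be sketched with a couple of
queries.  This is only an illustration: the input strings are made up for
the example and do not come from the thread.  It shows the two-argument
substring(text from pattern) form, which has nowhere to pass a flag.

```sql
-- No capture group: the whole first match is returned.
SELECT substring('foobar' from 'o.b');     -- oob

-- With a capture group: only the first parenthesized part is returned.
SELECT substring('foobar' from 'o(.)b');   -- o
```

Either way the result is a single text value taken from the first match,
which is why a "replace/return all" option has nothing to act on here.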

A little bit rambly but hopefully instructive...

"Embedded" is the key word here.  In other implementations an embedded
modifier alters the interpretation of the pattern between a "start" and an
"end" modifier expression; in PostgreSQL there is only a "start", no end,
and so the embedded modifier affects the entire pattern.  While it is
possible to turn on/off case insensitivity, .-newline, and some other
options, the "g" (global) option can only apply to the pattern as a whole
and conceptually belongs to the executor of the pattern as opposed to the
pattern itself.

The "g" option is relevant to both "regexp_replace" and "regexp_matches".
In the latter case using the "g" modifier allows more than one row to be
returned from the SRF.  In both cases the entire pattern is being applied to
the input text, and the "g" modifier tells the matching algorithm not only to
affirm that there is at least one match but to identify all sections of the
source text that match the entire pattern.
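
As a rough illustration of the paragraph above (the input string and
pattern are chosen for the example, not taken from the thread), the same
pattern can be run with and without the 'g' flag in both functions:

```sql
-- Without 'g', only the first match is replaced.
SELECT regexp_replace('foobarbaz', 'b..', 'X');        -- fooXbaz

-- With 'g', every match is replaced.
SELECT regexp_replace('foobarbaz', 'b..', 'X', 'g');   -- fooXX

-- regexp_matches is set-returning: without 'g' it yields at most one row.
SELECT regexp_matches('foobarbaz', 'b..');             -- one row: {bar}

-- With 'g' it yields one row per match.
SELECT regexp_matches('foobarbaz', 'b..', 'g');        -- two rows: {bar}, {baz}
```

In all four calls the pattern itself is identical; only the flags
argument, which is handed to the function and not written into the
pattern, changes how many matches are acted on.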

PostgreSQL is somewhat more limited in using these embedded options than
other implementations since, IIRC (and going by my quick scan of the linked
documents just now), you can only begin the pattern with them and so they
apply to the entire pattern too.  Basically they provide a way to include
flags in the pattern when dealing with operator-based invocation.  In other
implementations it is possible to write something like:

'(?i)this section is case insensitive(?-i)this section is case sensitive'

namely toggling these on/off within a pattern.

Since the "g"lobal flag only makes sense in function-call invocations, it is
neither needed nor useful to have it embedded within the expression itself;
i.e., operator-based invocations only deal with true/false evaluations,
which ask whether there is at least one match or none.
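
A small sketch of that difference, assuming the standard ~ and ~* match
operators and made-up input strings: an operator has no flags argument, so
an option such as 'i' has to be embedded at the start of the pattern, while
a function call can take it (and 'g') separately.

```sql
-- Operator invocation: the result is just a boolean.
SELECT 'FOO' ~ 'foo';          -- false (case-sensitive match)
SELECT 'FOO' ~ '(?i)foo';      -- true  (embedded option at pattern start)
SELECT 'FOO' ~* 'foo';         -- true  (case-insensitive operator)

-- Function invocation: flags are passed as a separate argument,
-- so 'g' can be combined with options such as 'i'.
SELECT regexp_replace('FOO foo', 'foo', 'X', 'gi');   -- X X
```

Since the operators only report whether a match exists, there is nothing
for a "find all matches" option to change in their result.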

David J.






Re: regexp_replace 'g' flag

From: Bruce Momjian
Date:
On Thu, Sep  5, 2013 at 09:59:13PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Sep  5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:
> >> Why doesn't the 'g' flag appear in this table?
> >> http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE
>
> > Is it because the table has generic pattern modifiers and 'g' is
> > relevant only for regexp_replace?  I assume so.
>
> The table is specifically about ARE options, and 'g' is *not* one of
> those.  Adding 'g' to the table would be wrong.
>
> It does seem to me to be a bit confusing that the text description of
> substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
> the table.  You could make a case for phrasing along the line of
> "substring() supports the 'g' flag that specifies ..., as well as all the
> flags listed in Table 9-19".  On the other hand, 'i' is the most useful of
> the flags listed in the table by several country miles, and it doesn't
> seem quite right to make people go off and consult the table to find out
> about it.
>
> Not sure whether there's any real improvement that can be made here,
> but I suppose it'd be nice if the text descriptions of substring() and
> regexp_replace() handled this matter in the same way ...

I went ahead and just explicitly documented that 'g' is not in the
table.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
