Обсуждение: Timestamp conversion can't use index

Поиск
Список
Период
Сортировка

Timestamp conversion can't use index

От
Bruce Momjian
Дата:
Someone reported to me that they can't get their queries to use indexes.
It turns out this is because timestamp() has pg_proc.proiscachable set
to false in many cases.  Date() also has this in some cases.

I realized timestamp() can be called with 'CURRENT_TIMESTAMP', which of
course is not cachable, but when called with a real date, it seems it
would be cachable.  However, I seem to remember that the timezone
setting can effect the output, and therefore it isn't cachable, or
something like that.

While the actual conversion call it very minor, there is code in
backend/optimizer/utils/clauses::simplify_op_or_func() that has:
   if (!proiscachable)       return NULL;

This prevents index usage for non-cachable functions, as shown below. 

The first only does only a date() conversion, the second adds an
interval, which results in a timestamp() conversion.  Notice this uses a
sequential scan.  The final one avoids timestamp but just adding '1' to
the date value:
    test=> EXPLAIN SELECT * FROM test WHERE x = DATE('2001-01-01');    NOTICE:  QUERY PLAN:        Index Scan USING
i_testON test  (cost=0.00..3.01 ROWS=1 width=208)        EXPLAIN    test=> EXPLAIN SELECT * FROM test WHERE x =
DATE('2001-01-01')+    INTERVAL '1 DAY';    NOTICE:  QUERY PLAN:        Seq Scan ON test  (cost=0.00..26.00 ROWS=5
width=208)       EXPLAIN    test=> EXPLAIN SELECT * FROM test WHERE x = DATE('2001-01-01') + 1;    NOTICE:  QUERY PLAN:
      Index Scan USING i_test ON test  (cost=0.00..3.01 ROWS=1 width=208)        EXPLAIN
 
Can someone explain the rational between which timestamp/date calls are
cachable and which are not, and whether the cachablility really relates
to index usage or is this just a problem with our having only one
cachable setting for each function?  I would like to understand this so
I can formulate a TODO item to document it.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Timestamp conversion can't use index

От
Bruce Momjian
Дата:
> Someone reported to me that they can't get their queries to use indexes.
> It turns out this is because timestamp() has pg_proc.proiscachable set
> to false in many cases.  Date() also has this in some cases.

Please let me add a reference to this email from Tom Lane:
http://fts.postgresql.org/db/mw/msg.html?mid=1041918

It specifically states:[More complete] reasonable [cachable] definitions would be:1. noncachable: must be called every
time;not guaranteed to return sameresult for same parameters even within a query.  random(), timeofday(),nextval() are
examples.2.fully cachable: function guarantees same result for same parametersno matter when invoked.  This setting
allowsa call with constantparameters to be constant-folded on sight.3. query cachable: function guarantees same result
forsame parameterswithin a single query, or more precisely within a singleCommandCounterIncrement interval.  This
correspondsto the actualbehavior of functions that execute SELECTs, and it's sufficiently strongto allow the function
resultto be used in an indexscan, which is whatwe really care about.
 

Item #2 clearly mentions constant folding, I assume by the optimizer. 
What has me confused is why constant folding is needed to perform index
lookups.  Can't the executor call the function and then do the index
lookup?  Is this just a failing in our executor?  Is there a reason
#1-type noncachable functions can't use indexes?  Is the timezone
related here?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Timestamp conversion can't use index

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What has me confused is why constant folding is needed to perform index
> lookups.

You are confused because those aren't related.

The entire notion of an indexscan is predicated on the assumption that
you are comparing all elements of the index to the same comparison
value.  Thus for example "x = random()" is not indexable.  To use an
indexscan the query planner must be able to determine that the right
hand side will not change over the course of the scan.

Constant-folding requires a stronger assumption: that the result the
function gives when evaluated by the query planner will be the same
result we'd get later (perhaps much later) at execution time.

Since we only have one kind of noncachable function at the moment,
these two restrictions are conflated ... but there should be more than
one kind of noncachable function.
        regards, tom lane


Re: Timestamp conversion can't use index

От
Thomas Lockhart
Дата:
> > Someone reported to me that they can't get their queries to use indexes.
> > It turns out this is because timestamp() has pg_proc.proiscachable set
> > to false in many cases.  Date() also has this in some cases.
> Please let me add a reference to this email from Tom Lane:

The functions marked as non-cachable are those that are converting from
data types (such as text for which the input may need to be evaluated
for (at least) that transaction.

What kind of queries against constants are they doing that can't use
SQL-standard syntax to avoid a conversion from another data type?

timestamp('stringy time')

may not be good, but I would think that

timestamp 'timey time'

should let the optimizer use indices just fine. It *could* do some more
constant folding if we had a distinction between functions with
indeterminate side effects (e.g. random()) as opposed to those who just
need to be evaluated once per transaction (say, date/time conversion
functions needing the time zone evaluated).
                    - Thomas


Re: Timestamp conversion can't use index

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@fourpalms.org> writes:
> timestamp('stringy time')
> may not be good, but I would think that
> timestamp 'timey time'
> should let the optimizer use indices just fine.

Yup.  Possibly this should be noted in the FAQ?

Actually,timestamp('stringy time')
doesn't work at all anymore in 7.2, unless you doublequote the name:

regression=# select timestamp('now');
ERROR:  parser: parse error at or near "'"
regression=# select "timestamp"('now');        timestamp
----------------------------2001-12-26 12:18:07.008337
(1 row)

Another interesting factoid is that "timestamp"('now') does indeed
produce a constant in 7.2, not a runtime evaluation of text_timestamp.
text_timestamp is still considered noncachable, but the expression is
considered to represent timestamp 'now' and not a call of text_timestamp,
presumably because of this change:

2001-10-04 18:06  tgl
* doc/src/sgml/typeconv.sgml, src/backend/commands/indexcmds.c,src/backend/parser/parse_func.c,
src/include/parser/parse_func.h:Considerinterpreting a function call as a trivial(binary-compatible) type coercion
afterfailing to find an exactmatch in pg_proc, but before considering interpretations thatinvolve a function call with
oneor more argument type coercions. This avoids surprises wherein what looks like a type coercion isinterpreted as
coercingto some third type and then to thedestination type, as in Dave Blasby's bug report of 3-Oct-01.  Seesubsequent
discussionin pghackers.
 

So there's more here than meets the eye, but the syntax change from
7.1 to 7.2 is definitely going to warrant a FAQ entry, IMHO.
        regards, tom lane


Re: Timestamp conversion can't use index

От
Bruce Momjian
Дата:
Based on Tom's comments and this email, I am adding this to the TODO
list:

* Add new pg_proc cachable settings to specify whether function can be evaluated only once or once per query


---------------------------------------------------------------------------

> > Someone reported to me that they can't get their queries to use indexes.
> > It turns out this is because timestamp() has pg_proc.proiscachable set
> > to false in many cases.  Date() also has this in some cases.
> 
> Please let me add a reference to this email from Tom Lane:
> 
>     http://fts.postgresql.org/db/mw/msg.html?mid=1041918
> 
> It specifically states:
>     
>     [More complete] reasonable [cachable] definitions would be:
>     
>     1. noncachable: must be called every time; not guaranteed to return same
>     result for same parameters even within a query.  random(), timeofday(),
>     nextval() are examples.
>     
>     2. fully cachable: function guarantees same result for same parameters
>     no matter when invoked.  This setting allows a call with constant
>     parameters to be constant-folded on sight.
>     
>     3. query cachable: function guarantees same result for same parameters
>     within a single query, or more precisely within a single
>     CommandCounterIncrement interval.  This corresponds to the actual
>     behavior of functions that execute SELECTs, and it's sufficiently strong
>     to allow the function result to be used in an indexscan, which is what
>     we really care about.
> 
> Item #2 clearly mentions constant folding, I assume by the optimizer. 
> What has me confused is why constant folding is needed to perform index
> lookups.  Can't the executor call the function and then do the index
> lookup?  Is this just a failing in our executor?  Is there a reason
> #1-type noncachable functions can't use indexes?  Is the timezone
> related here?
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Timestamp conversion can't use index

От
Bruce Momjian
Дата:
> > > Someone reported to me that they can't get their queries to use indexes.
> > > It turns out this is because timestamp() has pg_proc.proiscachable set
> > > to false in many cases.  Date() also has this in some cases.
> > Please let me add a reference to this email from Tom Lane:
> 
> The functions marked as non-cachable are those that are converting from
> data types (such as text for which the input may need to be evaluated
> for (at least) that transaction.
> 
> What kind of queries against constants are they doing that can't use
> SQL-standard syntax to avoid a conversion from another data type?


They are doing trying to add one day to a date field:
       test=> EXPLAIN SELECT * FROM test WHERE x = DATE('2001-01-01');       NOTICE:  QUERY PLAN:              Index
ScanUSING i_test ON test  (cost=0.00..3.01 ROWS=1 width=208)              EXPLAIN       test=> EXPLAIN SELECT * FROM
testWHERE x = DATE('2001-01-01') +       INTERVAL '1 DAY';       NOTICE:  QUERY PLAN:              Seq Scan ON test
(cost=0.00..26.00ROWS=5 width=208)
 
       ^^^^^^^^
       EXPLAIN       test=> EXPLAIN SELECT * FROM test WHERE x = DATE('2001-01-01') + 1;       NOTICE:  QUERY PLAN:
        Index Scan USING i_test ON test  (cost=0.00..3.01 ROWS=1 width=208)              EXPLAIN
 


Seems it is an operator that returns a timestamp.

> 
> timestamp('stringy time')
> 
> may not be good, but I would think that
> 
> timestamp 'timey time'
> 
> should let the optimizer use indices just fine. It *could* do some more
> constant folding if we had a distinction between functions with
> indeterminate side effects (e.g. random()) as opposed to those who just
> need to be evaluated once per transaction (say, date/time conversion
> functions needing the time zone evaluated).

I have added this to the TODO list.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Timestamp conversion can't use index

От
Bruce Momjian
Дата:
> Thomas Lockhart <lockhart@fourpalms.org> writes:
> > timestamp('stringy time')
> > may not be good, but I would think that
> > timestamp 'timey time'
> > should let the optimizer use indices just fine.
> 
> Yup.  Possibly this should be noted in the FAQ?
> 
> Actually,
>     timestamp('stringy time')
> doesn't work at all anymore in 7.2, unless you doublequote the name:
> 
> regression=# select timestamp('now');
> ERROR:  parser: parse error at or near "'"
> regression=# select "timestamp"('now');
>          timestamp
> ----------------------------
>  2001-12-26 12:18:07.008337
> (1 row)

I have updated HISTORY and release.sgml Migration sections:
    * The timestamp() function is no longer available.  Use timestamp      "string" instead, or CAST.      
> 
> Another interesting factoid is that "timestamp"('now') does indeed
> produce a constant in 7.2, not a runtime evaluation of text_timestamp.
> text_timestamp is still considered noncachable, but the expression is
> considered to represent timestamp 'now' and not a call of text_timestamp,
> presumably because of this change:
> 
> 2001-10-04 18:06  tgl
> 
>     * doc/src/sgml/typeconv.sgml, src/backend/commands/indexcmds.c,
>     src/backend/parser/parse_func.c, src/include/parser/parse_func.h:
>     Consider interpreting a function call as a trivial
>     (binary-compatible) type coercion after failing to find an exact
>     match in pg_proc, but before considering interpretations that
>     involve a function call with one or more argument type coercions. 
>     This avoids surprises wherein what looks like a type coercion is
>     interpreted as coercing to some third type and then to the
>     destination type, as in Dave Blasby's bug report of 3-Oct-01.  See
>     subsequent discussion in pghackers.
> 
> So there's more here than meets the eye, but the syntax change from
> 7.1 to 7.2 is definitely going to warrant a FAQ entry, IMHO.

Added to same files:
datatype(const,...) function calls now evaluated earlier   

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Timestamp conversion can't use index

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Added to same files:
>     datatype(const,...) function calls now evaluated earlier   

This is quite wrong, since (a) the change only applies to single-
argument function calls (so, no "..."), (b) the call is not
evaluated "earlier", but "differently", and (c) it doesn't only
apply to constant arguments.

Not sure that I can come up with a one-liner definition of this change,
but the above definitely doesn't do the job.

We already have
  Modify type coersion logic to attempt binary-compatible functions first (Tom)

and I'm not sure there is a better one-liner for it.
        regards, tom lane


Re: Timestamp conversion can't use index

От
Bruce Momjian
Дата:
OK, new text:
Some datatype() function calls now evaluated differently   

At least it is a warning.

---------------------------------------------------------------------------

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Added to same files:
> >     datatype(const,...) function calls now evaluated earlier   
> 
> This is quite wrong, since (a) the change only applies to single-
> argument function calls (so, no "..."), (b) the call is not
> evaluated "earlier", but "differently", and (c) it doesn't only
> apply to constant arguments.
> 
> Not sure that I can come up with a one-liner definition of this change,
> but the above definitely doesn't do the job.
> 
> We already have
> 
>    Modify type coersion logic to attempt binary-compatible functions first (Tom)
> 
> and I'm not sure there is a better one-liner for it.
> 
>             regards, tom lane
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026