Обсуждение: Problems with genetic optimizer
Hi all,
I generated a table with a column type 'box', inserted some values and
created an index (ops_box) on that column, run a vacuum and looked with
the command EXPLAIN if the select will use the index.
And the result was very astonishing:
if I had inserted 120000 elements postgres will _NOT_ use the index, if
I had inserted 100000 elements postgres will use the index!
When I set the sequential search off ('set enable_seqscan off;')
postgres is using the index in both cases.
Has someone any idea what to do?
Peter
--
Bezirksfinanzdirektion Muenchen
Vermessungsabteilung
.........................................................
Peter Keller : Tel: (+49) 089-2190-2594
Vermessungsrat : Fax: (+49) 089-2190-2459
Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de
80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes:
> I generated a table with a column type 'box', inserted some values and
> created an index (ops_box) on that column, run a vacuum and looked with
> the command EXPLAIN if the select will use the index.
> And the result was very astonishing:
> if I had inserted 120000 elements postgres will _NOT_ use the index, if
> I had inserted 100000 elements postgres will use the index!
Hmm. The system's knowledge of selectivities for R-tree indexes is
essentially nil; perhaps someone will be motivated to improve that
someday. In the meantime, the entirely bogus numbers returned by
src/backend/utils/adt/geo_selfuncs.c are supposed to be small enough
to ensure that R-trees are used if available. What was your test
query exactly, and what do you get from EXPLAIN with and without
forcing enable_seqscan off?
regards, tom lane
Hi
>
> Hmm. The system's knowledge of selectivities for R-tree indexes is
> essentially nil; perhaps someone will be motivated to improve that
> someday. In the meantime, the entirely bogus numbers returned by
> src/backend/utils/adt/geo_selfuncs.c are supposed to be small enough
> to ensure that R-trees are used if available. What was your test
> query exactly, and what do you get from EXPLAIN with and without
> forcing enable_seqscan off?
>
> regards, tom lane
Ok, I created a table with only one column (box), inserted 120000
elements, created an index and run a vacuum:
convert=# select count(*) from box_tmp;
count
--------
120000
(1 row)
convert=# explain select * from box_tmp where ebre &&
box('(470758.555,354028.145),(470758.525,354028.115)'::box);
NOTICE: QUERY PLAN:
Seq Scan on box_tmp (cost=0.00..2500.00 rows=2400 width=32)
EXPLAIN
convert=# set enable_seqscan = off;
SET VARIABLE
convert=# explain select * from box_tmp where ebre &&
box('(470758.555,354028.145),(470758.525,354028.115)'::box);
NOTICE: QUERY PLAN:
Index Scan using idx on box_tmp (cost=0.00..2503.28 rows=2400 width=32)
EXPLAIN
I'm running PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc
egcs-2.91.66
The result of the query if I set the sequential search on is this:
convert=# select * from box_tmp where ebre &&
box('(470758.555,354028.145),(470758.525,354028.115)'::box);
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#
Thanks for helping,
Peter
>
> Peter Keller <peter.keller@bvv.bayern.de> writes:
> > I generated a table with a column type 'box', inserted some values and
> > created an index (ops_box) on that column, run a vacuum and looked with
> > the command EXPLAIN if the select will use the index.
> > And the result was very astonishing:
> > if I had inserted 120000 elements postgres will _NOT_ use the index, if
> > I had inserted 100000 elements postgres will use the index!
--
Bezirksfinanzdirektion Muenchen
Vermessungsabteilung
.........................................................
Peter Keller : Tel: (+49) 089-2190-2594
Vermessungsrat : Fax: (+49) 089-2190-2459
Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de
80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes:
> Ok, I created a table with only one column (box), inserted 120000
> elements, created an index and run a vacuum:
> convert=# explain select * from box_tmp where ebre &&
> box('(470758.555,354028.145),(470758.525,354028.115)'::box);
> NOTICE: QUERY PLAN:
> Seq Scan on box_tmp (cost=0.00..2500.00 rows=2400 width=32)
> convert=# set enable_seqscan = off;
> SET VARIABLE
> convert=# explain select * from box_tmp where ebre &&
> box('(470758.555,354028.145),(470758.525,354028.115)'::box);
> NOTICE: QUERY PLAN:
> Index Scan using idx on box_tmp (cost=0.00..2503.28 rows=2400 width=32)
Oy vey, only three points difference in estimated costs. What is
the actual measured runtime of each approach?
> The result of the query if I set the sequential search on is this:
> convert=# select * from box_tmp where ebre &&
> box('(470758.555,354028.145),(470758.525,354028.115)'::box);
> pqReadData() -- backend closed the channel unexpectedly.
Urk. That's not supposed to happen. There should be a core file
left in your PGDATA/base/dbname/ directory --- can you provide a
backtrace from that file using gdb?
regards, tom lane
> > The result of the query if I set the sequential search on is this:
>
> > convert=# select * from box_tmp where ebre &&
> > box('(470758.555,354028.145),(470758.525,354028.115)'::box);
> > pqReadData() -- backend closed the channel unexpectedly.
>
> Urk. That's not supposed to happen. There should be a core file
> left in your PGDATA/base/dbname/ directory --- can you provide a
> backtrace from that file using gdb?
>
> regards, tom lane
No, I'm sorry, but there is no core file. I tried to debug the query and
this are the relults:
/opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5
/opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5
/opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5
/opt/local/DWH/bin/postmaster: BackendStartup: pid 679 user postgres db
convert
socket 5
/opt/local/DWH/bin/postmaster child[679]: starting with
(/opt/local/DWH/bin/post
gres -d2 -v131072 -p convert )
FindExec: found "/opt/local/DWH/bin/postgres" using argv[0]
started: host=localhost user=postgres database=convert
InitPostgres
StartTransactionCommand
query: SELECT usesuper FROM pg_user WHERE usename = 'postgres'
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: select * from box_tmp where ebre &&
('(470758.555,354028.145),(470758.525
,354028.115)'::box);
ProcessQuery
/opt/local/DWH/bin/postmaster: reaping dead processes...
/opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status
11
Server process (pid 679) exited with status 11 at Wed Oct 25 09:04:25
2000
Terminating any active server processes...
Server processes were terminated at Wed Oct 25 09:04:25 2000
Reinitializing shared memory and semaphores
shmem_exit(0)
binding ShmemCreate(key=52e325, size=1104896)
/opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5
/opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5
/opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5
The Data Base System is starting up
/opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5
DEBUG: Data Base System is starting up at Wed Oct 25 09:04:25 2000
DEBUG: Data Base System was interrupted being in production at Wed Oct
25 09:03
:52 2000
DEBUG: Data Base System is in production state at Wed Oct 25 09:04:25
2000
proc_exit(0)
shmem_exit(0)
exit(0)
/opt/local/DWH/bin/postmaster: reaping dead processes...
Don't know if this is helping you
Peter
--
Bezirksfinanzdirektion Muenchen
Vermessungsabteilung
.........................................................
Peter Keller : Tel: (+49) 089-2190-2594
Vermessungsrat : Fax: (+49) 089-2190-2459
Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de
80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes:
> No, I'm sorry, but there is no core file.
You're probably running one of those setups where the postmaster is
started with a ulimit setting that prevents core dumps. You might
want to look into changing that for future debugging purposes.
> query: select * from box_tmp where ebre &&
> ('(470758.555,354028.145),(470758.525
> ,354028.115)'::box);
> ProcessQuery
> /opt/local/DWH/bin/postmaster: reaping dead processes...
> /opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status
> 11
Now that I think about it, are there any NULL entries in box_tmp.ebre?
The box_overlap function, like practically all of the geometric
operators :-(, doesn't defend itself against NULL inputs in 7.0 and
earlier releases. This is fixed for 7.1 but not in a way that could
readily be back-patched into 7.0.*. If there are just specific
operators you need to use with NULL data, you could patch them yourself
in src/backend/utils/adt/geo_ops.c; for instance box_overlap would need
to start out with something like
if (box1 == NULL || box2 == NULL)
return false;
Alternatively, write your queries to avoid invoking && on a NULL, eg
select * from box_tbl where
case when ebre is not null then
ebre && '(470758.555,354028.145),(470758.525,354028.115)'
else false end;
This last is not only ugly but non-indexable, so it's only useful as
a very short-term hack...
regards, tom lane
> No, I'm sorry, but there is no core file.
>
> You're probably running one of those setups where the postmaster is
> started with a ulimit setting that prevents core dumps. You might
> want to look into changing that for future debugging purposes.
Yes, you are right, I changed it in /etc/profile.
>
> > query: select * from box_tmp where ebre &&
> > ('(470758.555,354028.145),(470758.525
> > ,354028.115)'::box);
> > ProcessQuery
> > /opt/local/DWH/bin/postmaster: reaping dead processes...
> > /opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status
> > 11
>
> Now that I think about it, are there any NULL entries in box_tmp.ebre?
> The box_overlap function, like practically all of the geometric
> operators :-(, doesn't defend itself against NULL inputs in 7.0 and
> earlier releases.
YES!! I deleted the NULL entries. Now it works. Thanks!!!
Peter
--
Bezirksfinanzdirektion Muenchen
Vermessungsabteilung
.........................................................
Peter Keller : Tel: (+49) 089-2190-2594
Vermessungsrat : Fax: (+49) 089-2190-2459
Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de
80538 Muenchen : web: http://www.bayern.de/vermessung