Обсуждение: GiST rtree logic is not right
It occurred to me to wonder whether contrib/rtree_gist fixes the rtree bug documented here: http://archives.postgresql.org/pgsql-general/2004-03/msg01143.php The answer is unfortunately "no". In the regression database, install rtree_gist and do: regression=# create table gist_emp4000 as select * from slow_emp4000; SELECT regression=# create index grect2ind ON gist_emp4000 USING gist (home_base); CREATE INDEX regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2144 (1 row) The correct answer is regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)';count ------- 2214 (1 row) Now this is noticeably better than the rtree implementation, which finds only 1363 rows, but broken is still broken :-( This is doubtless not as high priority as the concurrency stuff you are working on, but it'd be good to fix anyway. I was thinking of proposing that we move rtree_gist into the core --- but the case for it would be stronger if it worked correctly ... regards, tom lane
I'll look at problem after GiST concurrency. Fixing rtree_gist is bug a fix, not a new feature, so I'm not limited by 1 July. > This is doubtless not as high priority as the concurrency stuff you > are working on, but it'd be good to fix anyway. I was thinking of > proposing that we move rtree_gist into the core --- but the case for > it would be stronger if it worked correctly ... > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> I'll look at problem after GiST concurrency. Fixing > rtree_gist is bug a fix, not a new feature, so I'm not > limited by 1 July. Wont fixing rtree(_gist) require initdb, since the behaviour of the operators will change? ... John
Hmm,
something is broken in HEAD. I recall it worked in STABLE, here is what I have
Welcome to psql 8.0.3, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands
\gor terminate with semicolon to execute query \q to quit
regression=# create index grect2ind ON gist_emp4000 USING gist (home_base);
CREATE INDEX
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404)
,(35546,5360)'; count
------- 2214
(1 row)
btw, Teodor, I noticed gist_print from gevel package doesn't works in HEAD
regression=# select * from gist_print('grect2ind') as t(level int, a box) where level =1;
ERROR: function return row and query-specified return row do not match
DETAIL: Returned row contains 3 attributes, but query expects 2.
Oleg
On Wed, 22 Jun 2005, Tom Lane wrote:
> It occurred to me to wonder whether contrib/rtree_gist fixes the rtree
> bug documented here:
> http://archives.postgresql.org/pgsql-general/2004-03/msg01143.php
>
> The answer is unfortunately "no". In the regression database,
> install rtree_gist and do:
>
> regression=# create table gist_emp4000 as select * from slow_emp4000;
> SELECT
> regression=# create index grect2ind ON gist_emp4000 USING gist (home_base);
> CREATE INDEX
> regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
> count
> -------
> 2144
> (1 row)
>
> The correct answer is
>
> regression=# select count(*) from slow_emp4000 where home_base << '(35565,5404),(35546,5360)';
> count
> -------
> 2214
> (1 row)
>
> Now this is noticeably better than the rtree implementation, which finds
> only 1363 rows, but broken is still broken :-(
>
> This is doubtless not as high priority as the concurrency stuff you
> are working on, but it'd be good to fix anyway. I was thinking of
> proposing that we move rtree_gist into the core --- but the case for
> it would be stronger if it worked correctly ...
>
> regards, tom lane
>
Regards, Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
> something is broken in HEAD. I recall it worked in STABLE, here is what I have
No, actually it is broken in 8.0 too. The difference is that 8.0
defaults to not using an indexscan for this query. In 8.0 I see:
regression=# set enable_seqscan TO 1;
SET
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
---------------------------------------------------------------------Aggregate (cost=65.53..65.53 rows=1 width=0) ->
SeqScan on gist_emp4000 (cost=0.00..64.75 rows=310 width=0) Filter: (home_base <<
'(35565,5404),(35546,5360)'::box)
(3 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count
------- 2214
(1 row)
regression=# set enable_seqscan TO 0;
SET
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
----------------------------------------------------------------------------------------Aggregate (cost=112.96..112.96
rows=1width=0) -> Index Scan using grect2ind on gist_emp4000 (cost=0.00..112.18 rows=310 width=0) Index Cond:
(home_base<< '(35565,5404),(35546,5360)'::box)
(3 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count
------- 2144
(1 row)
In CVS tip the default plan is:
regression=# explain select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';
QUERY PLAN
--------------------------------------------------------------------------------Aggregate (cost=35.74..35.74 rows=1
width=0) -> Bitmap Heap Scan on gist_emp4000 (cost=5.08..34.96 rows=310 width=0) Recheck Cond: (home_base <<
'(35565,5404),(35546,5360)'::box) -> Bitmap Index Scan on grect2ind (cost=0.00..5.08 rows=310 width=0)
Index Cond: (home_base << '(35565,5404),(35546,5360)'::box)
(5 rows)
regression=# select count(*) from gist_emp4000 where home_base << '(35565,5404),(35546,5360)';count
------- 2144
(1 row)
regards, tom lane
"John Hansen" <john@geeknet.com.au> writes:
>> I'll look at problem after GiST concurrency. Fixing
>> rtree_gist is bug a fix, not a new feature, so I'm not
>> limited by 1 July.
> Wont fixing rtree(_gist) require initdb, since the behaviour of the
> operators will change?
Possibly, but we never guarantee no initdb until final release anyway.
Teodor is right that this is a bug fix and so can be postponed on its
own terms. But moving rtree_gist into the core looks like a feature
change to me, so if that's going to happen it has to happen before
1 July. It would be a lot easier to sell that if it gave the right
answers ;-)
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > It would be a lot easier to sell that if it gave the right answers ;-) Of course given that the real rtree index gives the wrong answers perhaps moving rtree_gist into core (after fixing this) is just a bug fix :) -- greg