Обсуждение: Index on points
Hello. If I have a table like this create table fleet ( ship_id integer, location point); and fill it with a lot of ships and their locations and then want to create an index on this to speed up operations on finding ships within a certain region (let's say its a rectangular region), how do I do this? I tried: CREATE INDEX my_index ON fleet USING gist ( box(location,location)); ? and that command worked, but I have no idea why and what I have just done. The docs are a little to advanced. How should one do it?
On Thu, 2010-09-23 at 12:45 +0200, A B wrote: > Hello. > > If I have a table like this > > create table fleet ( ship_id integer, location point); I recommend taking a look into PostGIS: http://postgis.org Regards, Jeff Davis
But how do I do it without Postgis?
Right now I have a table
fleet (id bigserial primary key, location point);
and I have filled it with random data and then tried selecting
explain analyze select count(1) from fleet where location <@
box(point(300,300),point(600,600));
to gather runningtime data, and then I have created and index (or I
think I have atleast)
CREATE INDEX fleet_location ON fleet USING GIST ( box(location,location) );
but I still get almost exaclty the same run time of the query
explain analyze select count(1) from fleet where location <@
box(point(300,300),point(600,600));
Aggregate (cost=100597.89..100597.90 rows=1 width=0) (actual
time=706.604..706.605 rows=1 loops=1)
-> Seq Scan on fleet (cost=0.00..94347.90 rows=2499996 width=0)
(actual time=0.252..701.624 rows=4533 loops=1)
Filter: (location <@ '(600,600),(300,300)'::box)
Total runtime: 706.675 ms
I have then tried to avoid the seq. scan by
set enable_seqscan=off;
set seq_page_cost=4000; (which would make it more expensive to scan,
wouldn't it?)
and the result is the same
Aggregate (cost=10127460749.89..10127460749.90 rows=1 width=0)
(actual time=799.077..799.078 rows=1 loops=1)
-> Seq Scan on fleet (cost=10000000000.00..10127454499.90
rows=2499996 width=0) (actual time=0.221..792.374 rows=4533 loops=1)
Filter: (location <@ '(600,600),(300,300)'::box)
Total runtime: 799.117 ms
So how do I create an index that gets used?
(I've run the queries a thousand times to make sure the total runtime
is consistent, and it is)
2010/9/23 Jeff Davis <pgsql@j-davis.com>:
> On Thu, 2010-09-23 at 12:45 +0200, A B wrote:
>> Hello.
>>
>> If I have a table like this
>>
>> create table fleet ( ship_id integer, location point);
>
> I recommend taking a look into PostGIS: http://postgis.org
>
> Regards,
> Jeff Davis
>
>
On 23/09/10 11:45, A B wrote:
> Hello.
>
> If I have a table like this
>
> create table fleet ( ship_id integer, location point);
>
> and fill it with a lot of ships and their locations and then want to
> create an index on this to speed up operations on finding ships within
> a certain region (let's say its a rectangular region), how do I do
> this?
>
> I tried:
>
> CREATE INDEX my_index ON fleet USING gist ( box(location,location)); ?
That's the idea, but you'll need to be careful about how you're
searching against it. Remember, the index is on a box based on the
location, not the point location itself.
CREATE TABLE fleet (ship int, locn point);
INSERT INTO fleet SELECT (x*1000 + y), point(x,y)
FROM generate_series(0,999) x, generate_series(0,999) y;
CREATE INDEX fleet_locn_idx ON fleet USING gist( box(locn,locn) );
ANALYSE fleet;
EXPLAIN ANALYSE SELECT count(*) FROM fleet
WHERE box(locn,locn) <@ box '(10,10),(20,20)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2654.84..2654.85 rows=1 width=0) (actual
time=4.611..4.612 rows=1 loops=1)
-> Bitmap Heap Scan on fleet (cost=44.34..2652.33 rows=1000
width=0) (actual time=4.344..4.491 rows=121 loops=1)
Recheck Cond: (box(locn, locn) <@ '(20,20),(10,10)'::box)
-> Bitmap Index Scan on fleet_locn_idx (cost=0.00..44.09
rows=1000 width=0) (actual time=4.311..4.311 rows=121 loops=1)
Index Cond: (box(locn, locn) <@ '(20,20),(10,10)'::box)
Total runtime: 4.694 ms
(6 rows)
DROP INDEX fleet_locn_idx;
EXPLAIN ANALYSE SELECT count(*) FROM fleet WHERE box(locn,locn) <@ box
'(10,10),(20,20)';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Aggregate (cost=20885.50..20885.51 rows=1 width=0) (actual
time=551.756..551.757 rows=1 loops=1)
-> Seq Scan on fleet (cost=0.00..20883.00 rows=1000 width=0)
(actual time=5.142..551.624 rows=121 loops=1)
Filter: (box(locn, locn) <@ '(20,20),(10,10)'::box)
Total runtime: 551.831 ms
(4 rows)
--
Richard Huxton
Archonet Ltd
On Sat, 2010-09-25 at 09:18 +0200, A B wrote:
> fleet (id bigserial primary key, location point);
>
...
> CREATE INDEX fleet_location ON fleet USING GIST ( box(location,location) );
>
> but I still get almost exaclty the same run time of the query
>
>
> explain analyze select count(1) from fleet where location <@
> box(point(300,300),point(600,600));
>
> Aggregate (cost=100597.89..100597.90 rows=1 width=0) (actual
> time=706.604..706.605 rows=1 loops=1)
> -> Seq Scan on fleet (cost=0.00..94347.90 rows=2499996 width=0)
> (actual time=0.252..701.624 rows=4533 loops=1)
> Filter: (location <@ '(600,600),(300,300)'::box)
> Total runtime: 706.675 ms
>
...
> So how do I create an index that gets used?
> (I've run the queries a thousand times to make sure the total runtime
> is consistent, and it is)
Looks like there's no entry in the box_ops opclass for point <@ box, but
there is an entry for box <@ box.
So, try:
explain analyze
select count(1) from fleet
where box(location,location) <@ box(point(300,300),point(600,600));
There's no reason that there couldn't be a point <@ box operator in the
opclass, but nobody really uses these geometric types that come with
core postgres (at least, not that I can tell). PostGIS is a dedicated
project that has removed most of the justification for trying to improve
the built-in geometric types. However, keep in mind that PostGIS is
under a different license (GPL, I think).
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> There's no reason that there couldn't be a point <@ box operator in the
> opclass, but nobody really uses these geometric types that come with
> core postgres (at least, not that I can tell).
Actually, as of 9.0 there is a point_ops opclass for GIST, with these
indexable operators:
>^(point,point)
<<(point,point)
>>(point,point)
<^(point,point)
~=(point,point)
<@(point,box)
<@(point,polygon)
<@(point,circle)
I agree that for any more than light-duty geometric work, you ought
to look at PostGIS.
regards, tom lane
2010/9/25 Tom Lane <tgl@sss.pgh.pa.us>: > Jeff Davis <pgsql@j-davis.com> writes: >> There's no reason that there couldn't be a point <@ box operator in the >> opclass, but nobody really uses these geometric types that come with >> core postgres (at least, not that I can tell). > > Actually, as of 9.0 there is a point_ops opclass for GIST, with these > indexable operators: > > >^(point,point) > <<(point,point) > >>(point,point) > <^(point,point) > ~=(point,point) > <@(point,box) > <@(point,polygon) > <@(point,circle) > > I agree that for any more than light-duty geometric work, you ought > to look at PostGIS. > > regards, tom lane Thank you Jeff for your reply, that solved the problem. Tom, would you like to elaborate on that PostGIS should be used for other than "light-duty" geometric work? Is it speed, accuracy or features that is the difference? For this project I think <@(point,box) is sufficient. What would it take to motivate a switch to PostGIS for that? Best wishes.
Sorry, Gmail made med confused, my biggest "thank you" was to Richard Huxton, who showed me code that worked. 2010/9/26 A B <gentosaker@gmail.com>: > 2010/9/25 Tom Lane <tgl@sss.pgh.pa.us>: >> Jeff Davis <pgsql@j-davis.com> writes: >>> There's no reason that there couldn't be a point <@ box operator in the >>> opclass, but nobody really uses these geometric types that come with >>> core postgres (at least, not that I can tell). >> >> Actually, as of 9.0 there is a point_ops opclass for GIST, with these >> indexable operators: >> >> >^(point,point) >> <<(point,point) >> >>(point,point) >> <^(point,point) >> ~=(point,point) >> <@(point,box) >> <@(point,polygon) >> <@(point,circle) >> >> I agree that for any more than light-duty geometric work, you ought >> to look at PostGIS. >> >> regards, tom lane > > Thank you Jeff for your reply, that solved the problem. > > Tom, would you like to elaborate on that PostGIS should be used for > other than "light-duty" geometric work? > Is it speed, accuracy or features that is the difference? > For this project I think <@(point,box) is sufficient. What would it > take to motivate a switch to PostGIS for that? > > Best wishes. >