Re: [GENERAL] Large data and slow queries
From | vinny |
---|---|
Subject | Re: [GENERAL] Large data and slow queries |
Date | |
Msg-id | 2d52de6778a7e907db41267d8e96d373@xs4all.nl |
In reply to | Re: [GENERAL] Large data and slow queries (Samuel Williams <space.ship.traveller@gmail.com>) |
Responses | Re: [GENERAL] Large data and slow queries |
List | pgsql-general |
On 2017-04-19 07:04, Samuel Williams wrote:
> Thanks John. Yes, you are absolutely right, you want the index to be
> bottom heavy so you can cull as much as possible at the top. I'm
> familiar with that; I once implemented a brute-force sudoku solver
> that works on the same principle.
>
> I've been working on this all afternoon. By reducing the longitude and
> latitude columns to float4, in my test cases, I found about a 50%
> improvement in performance. It may also use less space. So part of the
> problem was my choice of data type. We've computed that float4 has a
> worst-case precision of about 1.6 m, which we are okay with for
> analytics data.
>
> Another option we may consider is using a (signed) integer - e.g.
> longitude = 180*(v/2^31) and latitude = 180*(v/2^31) - as this has a
> uniform error across all points, but it's a bit more cumbersome to
> handle. Is there a rational datatype in Postgres which works like
> this?
>
> On 19 April 2017 at 16:42, John R Pierce <pierce@hogranch.com> wrote:
>> On 4/18/2017 9:01 PM, Samuel Williams wrote:
>>>
>>> We want the following kinds of query to be fast:
>>>
>>> SELECT ... AND (latitude > -37.03079375089291 AND latitude <
>>> -36.67086424910709 AND longitude > 174.6307139779924 AND longitude <
>>> 175.0805140220076);
>>
>> I wonder if GIST would work better if you use the native POINT type,
>> and compared it like
>>
>> mypoint <@ BOX
>> '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709))'
>>
>> with a gist index on mypoint...
>>
>> but, it all hinges on which clauses in your query are most selective,
>> that's where you want an index.
>>
>> --
>> john r pierce, recycling bits in santa cruz
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general

Did that 50% performance gain come from the datatype alone, or from the fact that the index became smaller?

Given the number of records, my first thought was either partitioning or partial indexes. The fewer rows there are in the index, the quicker it is to check, and it's not a lot of work to create separate indexes for latitude/longitude ranges or dates.
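As a rough sketch of what I mean by partial indexes (and of John's point/GiST idea), something along these lines. The table and column names (`events`, `latitude`, `longitude`, `created_at`) are assumptions, since the actual schema wasn't posted:

```sql
-- Sketch only: table and column names are assumed, not from the thread.
CREATE TABLE events (
    id         bigserial PRIMARY KEY,
    latitude   float4,
    longitude  float4,
    created_at timestamptz NOT NULL
);

-- Partial index: index only a slice of the rows (here, one year of data),
-- so each index stays small and cheap to scan.
CREATE INDEX events_2017_lat_long_idx
    ON events (latitude, longitude)
    WHERE created_at >= TIMESTAMPTZ '2017-01-01'
      AND created_at <  TIMESTAMPTZ '2018-01-01';

-- The GiST variant John suggested: index a point expression and let the
-- bounding-box containment test use the index directly.
CREATE INDEX events_location_gist_idx
    ON events USING gist (point(longitude, latitude));

SELECT count(*)
FROM events
WHERE point(longitude, latitude) <@
      box '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709))';
```

Whether the planner actually picks either index depends on which clauses are most selective, as John said, so check the plans with EXPLAIN ANALYZE on your real data.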