Re: AXLE Plans for 9.5 and 9.6
От | Hannu Krosing |
---|---|
Тема | Re: AXLE Plans for 9.5 and 9.6 |
Дата | |
Msg-id | 535660E0.4020003@krosing.net обсуждение исходный текст |
Ответ на | Re: AXLE Plans for 9.5 and 9.6 (Simon Riggs <simon@2ndQuadrant.com>) |
Список | pgsql-hackers |
On 04/22/2014 02:04 PM, Simon Riggs wrote: > On 22 April 2014 00:24, Josh Berkus <josh@agliodbs.com> wrote: >> On 04/21/2014 03:41 PM, Simon Riggs wrote: >>> Storage Efficiency >>> * Compression >>> * Column Orientation >> You might look at turning this: >> >> http://citusdata.github.io/cstore_fdw/ >> >> ... into a more integrated part of Postgres. > Of course I'm aware of that work - credit to them. Certainly, many > people feel that it is now time to do as you suggest and include > column store features within PostgreSQL. > > As to turning it into a more integrated part of Postgres, we have a > few problems there > > 1. cstore_fdw code has an incompatible licence > > 2. I don't think FDWs are the right place for complex new > architectures such as column store, massively parallel processing or > sharding. I agree that FDW is not an end-all solution for all these, but it is a reasonable starting point and it just might be that the extra things needed could be added to our FDW API instead of sewing it directly into backend guts. I recently tried to implement sharding at FDW level and the main problem I ran into was a missing join type for efficiently using it for certain queries. The specific use case was queries of form select l.*, r* from remotetable r join localtable l on l.key1 = r.id and l.n = N; PostgreSQL offered only two options: 1) full scan on remote table 2) single id=$ selects neither of which are what is actually needed, as firs performs badly if there are more than a few rows in remote table and 2nd performs badly if l.n = N returns more than a few rows when I manually rewrote the query to select l.*, r* from remotetable r where r.id = ANY(ARRAY(select key1 from localtable where n = N)) join localtable l on l.key1 = r.id and l.n = N; it run really well. Unfortunately this is not something that postgreSQL considers by itself while optimising. BTW, this kind of optimisation should also be a win for really large IN queries if we could have an indexed IN whic would not start each lookup from the index root, but rather would sort the IN contents and do an index merge vis skipping from current position. Cheers
В списке pgsql-hackers по дате отправления: