Re: trouble converting several serial queries into a parallel query
От | Chris Mair |
---|---|
Тема | Re: trouble converting several serial queries into a parallel query |
Дата | |
Msg-id | 2fd2157f47b42b07b09330a2a1847946@smtp.hushmail.com обсуждение исходный текст |
Ответ на | trouble converting several serial queries into a parallel query (Jonathan Vanasco <postgres@2xlp.com>) |
Список | pgsql-general |
> I have a very simple query that is giving me some issues due to the size of the database and the number of requests I maketo it in order to compile the report I need: > > A dumbed down version of the table and query: > > CREATE TABLE a_to_b ( > id_a INT NOT NULL REFERENCES table_a(id), > id_b INT NOT NULL REFERENCES table_b(id), > PRIMARY KEY (id_a, id_b) > ); > SELECT id_a, id_b FROM a_2_b WHERE id_a = 1 LIMIT 5; > > The problem is that the table has a few million records and I need to query it 30+ times in a row. > > I'd like to improve this with a parallel search using `IN()` > > SELECT id_a, id_b FROM a_2_b WHERE id_a = IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26.27,28,29,30); > > That technique has generally fixed a lot of bottlenecks for us. > > However I can't wrap my head around structuring it so that I can apply a limit based on the column -- so that I only get5 records per id_a. > > The table has columns that I would use for ordering in the future, but I'm fine with just getting random values right now. > > Can anyone offer some suggestions? Thanks in advance. Hi, I had exactly the same problem some time ago and came up with this: select * from ( select *, rank() over (partition by id_a order by id_b) as r from a_to_b where id_a in (1, 2) ) as subsel where r <= 5; Note the ordering is already there (by id_b), you can pick other columns of course). It looks a bit complicated, though. If anybody knows a more straitforward way I'd be glad to hear it :) Bye, Chris.
В списке pgsql-general по дате отправления: