parallel query evaluation
От | Oliver Seidel |
---|---|
Тема | parallel query evaluation |
Дата | |
Msg-id | 214a89820818a3da44bcadb4ab829c14@os10000.net обсуждение исходный текст |
Ответы |
Re: parallel query evaluation
|
Список | pgsql-performance |
Hi, I have create table x ( att bigint, val bigint, hash varchar(30) ); with 693million rows. The query create table y as select att, val, count(*) as cnt from x group by att, val; ran for more than 2000 minutes and used 14g memory on an 8g physical RAM machine -- eventually I stopped it. Doing create table y ( att bigint, val bigint, cnt int ); and something a bit like: for i in `seq 0 255` | xargs -n 1 -P 6 psql -c "insert into y select att, val, count(*) from x where att%256=$1 group by att, val" test runs 6 out of 256 in 10 minutes -- meaning the whole problem can be done in just under 3 hours. Question 1: do you see any reason why the second method would yield a different result from the first method? Question 2: is that method generalisabl so that it could be included in the base system without manual shell glue? Thanks, Oliver
В списке pgsql-performance по дате отправления: