How to do this in Postgres
От | Holger Klawitter |
---|---|
Тема | How to do this in Postgres |
Дата | |
Msg-id | 383BAE18.B7F4D49C@klawitter.de обсуждение исходный текст |
Список | pgsql-general |
Thanks for all the hints I've got! What I've learned from this question is, that not everything which seems to be a database problem should be solved using databases. (select * from data where <<prio is minimal per id pair>>) The problem wasn't that there is no statement to get this data, the problem was the large number of rows. Everything involving a join of data with itself took plainly too long. Nested selects only seem to be a syntactical escape. However, there is one solution (PostgreSQL specific) using the feature SELECT DISTINCT ON extending the SQL standard. This select retrieves only the first occurrence obeying the implied order. However, one has to be distinct on one *single* column. (Feature request: SELECT DISTINCT ON (id1,id2) ... ): alter table data add column combined text; update data set combined = ( id1 || '|' ) || id2; select distinct on combined id1,id2,<<data>> from data order by prio But, the update already takes more than 30 min. The following perl script (paraphrased) solved the same problem in 40 sec (not counting "copy into" which I had to do anyhow): while( <> ) { my( $id1, $id2, $prio, $data ) = split( /\t/, $_ ); my( $key ) = "$id1\t$id2"; my( $entry ) = $table{$key}; if( $entry eq "" ) { $table{$key} = "$prio\t$data"; } else { my( $oldprio, $junk ) = split( /\t/, $entry ); if( $prio < $oldprio ) { $table{$key} = "$prio\t$data"; } } } foreach $key (keys %table) { printf "%s\t%s", $key, $table{$key}; } Regards, Holger Klawitter -- Holger Klawitter +49 (0)251 484 0637 holger@klawitter.de http://www.klawitter.de/
В списке pgsql-general по дате отправления: