Re: [SQL] optimizing 2-table join w/millions of rows
From: Tom Lane
Subject: Re: [SQL] optimizing 2-table join w/millions of rows
Date:
Msg-id: 18394.911837459@sss.pgh.pa.us
Responses: Re: [SQL] optimizing 2-table join w/millions of rows
List: pgsql-sql
Michael Olivier <molivier@yahoo.com> writes:
>> How about:
>> SELECT U.acctname
>> FROM usertest U, bgndtest B
>> WHERE B.acctname = U.acctname
>>   AND B.part_needed=3 AND B.loc_needed=5
>>   AND U.part=2 AND U.loc=3;

> Yes, that looks equivalent. My problem is this is too slow an
> operation as I'm benchmarking it right now. And if I add less-than or
> greater-than comparisons, the performance goes _way_ down from there.
> How can I get the best performance out of this kind of operation?

The main thing is to make sure you have appropriate indexes and that
the query optimizer is making reasonable use of them. My guess here is
that you want b-tree indexes on B.acctname and U.acctname, which the
optimizer can use to drive a merge join. (Use EXPLAIN with the query
to see what plan the optimizer will actually use; don't forget that if
you haven't VACUUMed the tables lately, the optimizer may be working
from bogus statistics.)

A different possibility is to do the primary search on the part
number, ie rewrite to

SELECT U.acctname
FROM usertest U, bgndtest B
WHERE B.part_needed=3 AND U.part=2
  AND B.loc_needed=5
  AND B.acctname = U.acctname
  AND U.loc=3;

Here, indexes on the part number would be stressed. You'd have to
experiment to see which is faster on your data. (As far as I know, the
Postgres optimizer will always evaluate your AND clauses left to
right, which means that the key to performance is to make the leftmost
clause reject as large a fraction of candidate rows as possible.)

It's possible that indexes on some of the secondary fields would help,
depending on what the statistics of your data are. But every index you
add slows down insert/update performance, so keep an eye on that when
you're deciding how many indexes are worth having.

> Is there any way to force postgres to hold certain tables in memory
> all the time? As I said, cost of memory isn't an issue, but
> performance is.

I don't think you can lock a specific table into memory. What you can
do is increase the size of the shared buffer area used by the backends
(-B switch during postmaster startup, IIRC). This should give you
pretty much the same result, if you're always hitting the same table,
and it's much more flexible if your access patterns change over time.

			regards, tom lane
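
A minimal sketch of the index/VACUUM/EXPLAIN steps described above,
using the table and column names from the query; the index names are
illustrative, and whether these indexes actually pay off depends on
your data:

  -- b-tree indexes on the join keys, so the optimizer can drive a merge join
  CREATE INDEX bgndtest_acctname_idx ON bgndtest (acctname);
  CREATE INDEX usertest_acctname_idx ON usertest (acctname);

  -- optionally, indexes on the part-number columns for the reordered query
  CREATE INDEX bgndtest_part_needed_idx ON bgndtest (part_needed);
  CREATE INDEX usertest_part_idx ON usertest (part);

  -- refresh the planner's statistics so it isn't working from bogus numbers
  VACUUM ANALYZE bgndtest;
  VACUUM ANALYZE usertest;

  -- see which plan the optimizer actually picks
  EXPLAIN
  SELECT U.acctname
  FROM usertest U, bgndtest B
  WHERE B.acctname = U.acctname
    AND B.part_needed=3 AND B.loc_needed=5
    AND U.part=2 AND U.loc=3;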