Re: Critical performance problems on large databases
От | Stephan Szabo |
---|---|
Тема | Re: Critical performance problems on large databases |
Дата | |
Msg-id | 20020410154437.W20046-100000@megazone23.bigpanda.com обсуждение исходный текст |
Ответ на | Critical performance problems on large databases (Gunther Schadow <gunther@aurora.regenstrief.org>) |
Список | pgsql-general |
On Wed, 10 Apr 2002, Gunther Schadow wrote: > Off the bat, this indicates to me that there is something > sub-obtimal about PostgreSQL handling simple queries. From > a database that should perform well in online user transactions > one would want the query processing to be streamed as much > as possible, i.e., since in a SELECT * FROM Bigtable; there is > no work needed other than to retrieve the tuples out of > physical storage, the response should be immediate and resource > usage low. There should not be large buffer allocations. > > Conversely it looks as if PostgreSQL will always read a sizeable > piece (if not all?) of the result set into some buffer area before > returning a single row. This would explain the slow startup on > the SELECT * FROM Bigtable; query as well as the fact that > COUNT(smallcolumn) behaves much faster than COUNT(*). IIRC, the entire result set is sent across in the select * from bigtable case, possibly to allow random access to the result set? Not sure really. The usual way to deal with these cases is to use limit/offset or a cursor to fetch pieces of the data as you want them (ie: begin; DECLARE foo CURSOR FOR SELECT * FROM Bigtable; FETCH 100 from foo; FETCH 100 from foo; ... end; ) > Again, count should be streamed as well such as to use no > significant memory resources other than the counter. Apparently > a COUNT(*) in postgres is executed as > > SELECT * FROM Bigtable INTO $somebuffer > COUNT(tuples in $somebuffer) I believe the thing that takes a long time here is that it has to do a sequential scan of Bigtable, which for large tables is rather time consuming. I generally haven't seen large growth of backends on moderate sized tables for such queries, although due to the sequential scan I try to avoid count() whenever possible.
В списке pgsql-general по дате отправления: