Re: "Big O" notation for postgres?
От | H. Hall |
---|---|
Тема | Re: "Big O" notation for postgres? |
Дата | |
Msg-id | 483AD082.8020104@reedyriver.com обсуждение исходный текст |
Ответ на | Re: "Big O" notation for postgres? (Gregory Stark <stark@enterprisedb.com>) |
Список | pgsql-performance |
Gregory Stark wrote: > "Richard Huxton" <dev@archonet.com> writes: > > >> Jonah H. Harris wrote: >> >>> On Wed, May 21, 2008 at 10:10 AM, H. Hall wrote: >>> >>>> Does anyone know if there is a source that provides "Big O" notation for >>>> postgres's aggregate functions and operations? For example is count(*) = >>>> O(1) or O(n)? >>>> >>> I don't know of any document containing the complexity of each >>> aggregate, but it's sometimes left as a comment in the souce code. >>> >> Recent max() and min() can be O(n) or O(1) depending on the where-clause and >> presence of an index too, just to muddy the waters. >> When I first read the above from Jonah I just assumed some Postgres magic was producing O(1). After seeing this again, I believe that Postgres must be doing something like the following for max and min: Max: ORDER BY colName DESC LIMIT 1 Min: ORDER BY coName ASC LIMIT 1 Thus Jonah's caveat about using an index. If postgres is using an index as in the above then the Max and Min functions would both be O(log N) , this is log base2 of N, which is the time it takes to search a balanced binary tree, not O(1) which implies a constant time to perform, regardless of the size of the dataset N. > > Hm, true. But excluding those special cases all Postgres aggregate functions > will be O(n) unless they're doing something very odd. None of the built-in > functions (except min/max as noted) do anything odd like that. > > The reason way is because of the basic design of Postgres aggregate functions. > They are fed every tuple one by one and have to keep their state in a single > variable. Most of the aggregate functions like count(*) etc just keep a static > non-growing state and the state transition function is a simple arithmetic > operation which is O(1). So the resulting operation is O(n). > > Actually one exception would be something like > > CREATE AGGREGATE array_agg(anyelement) (SFUNC = array_append, STYPE = anyarray, INITCOND='{}'); > > Since the state variable has to keep accumulating more and more stuff the > array_append becomes more and more expensive (it has to generate a new array > so it has to copy the existing stuff). So actually it woul dbe O(n^2). > > The only builtin aggregate which looks like it falls in this category would be > xmlagg() > > Functions with O(N^2) scale very badly of course. It would be very handy if the Planner could kick out Big O with its estimates. This would allow one to quickly tell how a query scales with a large number of rows. Thanks, HH -- H. Hall ReedyRiver Group LLC http://www.reedyriver.com
В списке pgsql-performance по дате отправления: