Re: BUG #15307: Low numerical precision of (Co-) Variance
From | Dean Rasheed |
---|---|
Subject | Re: BUG #15307: Low numerical precision of (Co-) Variance |
Date | |
Msg-id | CAEZATCX3F5D-cTAJSEW0gqyO6m6TpS=cq9Q-WOzygHvhzmy5Qw@mail.gmail.com |
In reply to | Re: BUG #15307: Low numerical precision of (Co-) Variance (Erich Schubert <erich@debian.org>) |
Responses | Re: BUG #15307: Low numerical precision of (Co-) Variance |
List | pgsql-bugs |
On 7 August 2018 at 12:58, Erich Schubert <erich@debian.org> wrote:
> I am surprised that this is possible to do with arbitrary precision
> here, as for example with 8 data points, it will eventually involve a
> division by 7, which cannot be represented even with arbitrary (finite)
> precision floating point. But maybe just this division comes late enough
> (after the difference) that it just works before being converted to a
> float for the division.

Right, the division comes at the end in that case.

> Instead of Welford, I recommend to use the Youngs-Cramer approach. It is
> almost the same; roughly it is aggregating sum-X and sum((X-meanX)²)
> directly (whereas Welford updates mean-X, and sum((X-meanX)^2)). So the
> first aggregate is actually unchanged to the current approach.
> In my experiments, this was surprisingly much faster when the data was a
> double[]; explainable by CPU pipelining (see the Figure in the paper I
> linked - the division comes late, and the next step does not depend on
> the division, so pipelining can already process the next double without
> waiting for the division to finish).
>
> Youngs & Cramer original publication:
> https://www.jstor.org/stable/pdf/1267176.pdf

OK, thanks for the reference. That was very interesting. It was actually the
Knuth variant of the Welford algorithm that I was playing with, but I have
now tried out the Youngs-Cramer algorithm as well.

In terms of performance, the YC algorithm was maybe 2-3% slower than Welford,
which was roughly on a par with the current schoolbook algorithm in
PostgreSQL, although there was some variability in those results depending on
the exact query in question. One thing to bear in mind is that the kind of
micro-optimisation that makes a huge difference when processing in-memory
data in tight loops, as in your examples, is not nearly as important in the
PostgreSQL environment, where there is a significant per-tuple overhead in
fetching data, depending on the enclosing SQL query. So the odd cycle here
and there from a division isn't going to matter much in this context.

Both algorithms eliminate the main loss-of-precision problem that the current
schoolbook algorithm suffers from, which I think has to be the primary aim
here. The YC paper suggests that the results from their algorithm might be
slightly more accurate (although maybe not enough that we really care). What
I do like about the YC algorithm is that it guarantees the same result from
avg(), whereas the Welford algorithm would change avg() slightly, possibly
making it very slightly less accurate. So I concur: the YC algorithm is
probably preferable. For the record, attached are both versions that I tried.

> If Postgres could use parallel aggregation, let me know. I can assist
> with the proper aggregation of several partitions (both distributed, or
> multithreaded). Using multiple partitions can also slightly improve
> precision; but it is mostly interesting to process multiple chunks in
> parallel.

PostgreSQL isn't multithreaded, but it will parallelise large aggregates by
distributing the computation over a small number of worker backend processes
and then combining the results. I did a quick back-of-the-envelope
calculation to work out how to combine the results, and in testing it seemed
to work correctly, but it would be worth having a second pair of eyes
validate it.

I'll add this to the next commitfest for consideration.

Regards,
Dean
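For readers following the thread, here is a minimal standalone sketch of the Youngs-Cramer transition step and of a pairwise combine step of the kind discussed above. It is not the attached patch: the `YCState` struct, the function names, and the test data are illustrative only, and the combine step uses the standard pairwise identity for merging partial sums of squared deviations, which may not match the patch's exact formulation.

```c
#include <stdio.h>

/* Transition state: count, sum(x), and running sum of squared deviations. */
typedef struct YCState
{
	double	N;		/* number of values accumulated so far */
	double	Sx;		/* sum of x */
	double	Sxx;	/* sum((x - mean)^2), maintained incrementally */
} YCState;

/* Youngs-Cramer transition: fold one new value into the state. */
static void
yc_accum(YCState *st, double x)
{
	st->N += 1.0;
	st->Sx += x;
	if (st->N > 1.0)
	{
		double	tmp = x * st->N - st->Sx;

		st->Sxx += tmp * tmp / (st->N * (st->N - 1.0));
	}
}

/* Combine two partial states, e.g. from two parallel workers. */
static void
yc_combine(YCState *dst, const YCState *src)
{
	if (src->N == 0.0)
		return;
	if (dst->N == 0.0)
	{
		*dst = *src;
		return;
	}
	{
		double	N = dst->N + src->N;
		double	tmp = dst->Sx / dst->N - src->Sx / src->N;

		dst->Sxx += src->Sxx + dst->N * src->N * tmp * tmp / N;
		dst->Sx += src->Sx;
		dst->N = N;
	}
}

/* Sample variance, as var_samp() would report it. */
static double
yc_var_samp(const YCState *st)
{
	return (st->N > 1.0) ? st->Sxx / (st->N - 1.0) : 0.0;
}

int
main(void)
{
	/* Values with a large common offset, where the schoolbook formula
	 * sum(x^2) - sum(x)^2/n loses most of its significant digits. */
	double	data[] = {1e9 + 1.0, 1e9 + 2.0, 1e9 + 3.0, 1e9 + 4.0};
	YCState a = {0.0, 0.0, 0.0};
	YCState b = {0.0, 0.0, 0.0};
	int		i;

	/* Split the input across two "workers", then combine the results. */
	for (i = 0; i < 2; i++)
		yc_accum(&a, data[i]);
	for (i = 2; i < 4; i++)
		yc_accum(&b, data[i]);
	yc_combine(&a, &b);

	printf("var_samp = %g\n", yc_var_samp(&a));		/* exact answer is 5/3 */
	return 0;
}
```

Compiled and run, this prints var_samp = 1.66667, the exact sample variance of the offsets {1, 2, 3, 4}; a direct sum(x^2) - sum(x)^2/n computation in double precision would essentially destroy that result at this offset.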
Attachments