Re: Default setting for enable_hashagg_disk
From | Alvaro Herrera |
---|---|
Subject | Re: Default setting for enable_hashagg_disk |
Date | |
Msg-id | 20200625224422.GA9653@alvherre.pgsql |
In reply to | Re: Default setting for enable_hashagg_disk (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Default setting for enable_hashagg_disk
|
List | pgsql-hackers |
On 2020-Jun-25, Andres Freund wrote:

> > What are people doing for those cases already? Do we have any
> > real-world queries that are a problem in PG 13 for this?
>
> I don't know about real world, but it's pretty easy to come up with
> examples.
>
> query:
> SELECT a, array_agg(b) FROM (SELECT generate_series(1, 10000)) a(a), (SELECT generate_series(1, 10000)) b(b) GROUP BY a HAVING array_length(array_agg(b), 1) = 0;
>
> work_mem = 4MB
>
> 12      18470.012 ms
> HEAD    44635.210 ms
>
> HEAD causes ~2.8GB of file IO, 12 doesn't cause any. If you're IO
> bandwidth constrained, this could be quite bad.

... however, you can pretty much get the previous performance back by
increasing work_mem. I just tried your example here, and I get 32
seconds of runtime for work_mem 4MB, and 13.5 seconds for work_mem 1GB
(this one spills about 800 MB); if I increase that again to 1.7GB I get
no spilling and 9 seconds of runtime. (For comparison, 12 takes 15.7
seconds regardless of work_mem).

My point here is that maybe we don't need to offer a GUC to explicitly
turn spilling off; it seems sufficient to let users change work_mem so
that spilling will naturally not occur. Why do we need more?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
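
For anyone who wants to repeat the comparison, here is a minimal sketch of a session that runs the query from the thread under two work_mem settings. It is only an illustration: the session-level SET, the EXPLAIN options, and the choice of 4MB and 1GB are assumptions taken from the numbers quoted above, and the planner is not guaranteed to pick a hash aggregate on every build or data set. On a PG 13 build, the HashAggregate node in the EXPLAIN ANALYZE output should report memory and disk usage, which is where the spill would show up.

```sql
-- Sketch only: session-level settings, values taken from the thread.
-- Timings and spill volume will differ by machine and build.

-- Small work_mem: on PG 13 the hash aggregate is expected to spill.
SET work_mem = '4MB';
EXPLAIN (ANALYZE, COSTS OFF)
SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;

-- Larger work_mem: less (or no) spilling, depending on how much
-- memory the hash table actually needs.
SET work_mem = '1GB';
EXPLAIN (ANALYZE, COSTS OFF)
SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;

-- Return to the configured default for the rest of the session.
RESET work_mem;
```

Because SET only affects the current session, this does not change the server-wide default; it just shows how the same query behaves as work_mem grows.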