Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
From | Robert Haas |
---|---|
Subject | Re: 9.5: Better memory accounting, towards memory-bounded HashAgg |
Date | |
Msg-id | CA+Tgmobnu7XEn1gRdXnFo37P79bF=qLt46=37ajP3Cro9dBRaA@mail.gmail.com |
In response to | 9.5: Better memory accounting, towards memory-bounded HashAgg (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: 9.5: Better memory accounting, towards memory-bounded HashAgg |
List | pgsql-hackers |
On Sat, Aug 2, 2014 at 4:40 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> Attached is a patch that explicitly tracks allocated memory (the blocks,
> not the chunks) for each memory context, as well as its children.
>
> This is a prerequisite for memory-bounded HashAgg, which I intend to
> submit for the next CF. Hashjoin tracks the tuple sizes that it adds to
> the hash table, which is a good estimate for Hashjoin. But I don't think
> it's as easy for Hashagg, for which we need to track transition values,
> etc. (also, for HashAgg, I expect that the overhead will be more
> significant than for Hashjoin). If we track the space used by the memory
> contexts directly, it's easier and more accurate.
>
> I did some simple pgbench select-only tests, and I didn't see any TPS
> difference.

I was curious whether a performance difference would show up when
sorting, so I tried it out. I set up a test with pgbench -i 300. I
then repeatedly restarted the database, and after each restart, did
this:

time psql -c 'set trace_sort=on; reindex index pgbench_accounts_pkey;'

I alternated runs between master and master with this patch, and got
the following results:

master:
LOG: internal sort ended, 1723933 KB used: CPU 2.58s/11.54u sec elapsed 16.88 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.50s/12.37u sec elapsed 17.60 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.14s/11.28u sec elapsed 16.11 sec

memory-accounting:
LOG: internal sort ended, 1723933 KB used: CPU 2.57s/11.97u sec elapsed 17.39 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.30s/12.57u sec elapsed 17.68 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.54s/11.99u sec elapsed 17.25 sec

Comparing the median times, that's about a 3% regression. For this
particular case, we might be able to recapture that by replacing the
bespoke memory-tracking logic in tuplesort.c with use of this new
facility.
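
The median comparison above can be reproduced with a small script. This is only a sketch: the log lines are copied from the message, while the names `elapsed_times` and `median_regression_pct` are hypothetical helpers introduced here for illustration and are not part of PostgreSQL or the patch.

```python
import re
from statistics import median

# trace_sort log lines copied verbatim from the message above.
MASTER = """\
LOG: internal sort ended, 1723933 KB used: CPU 2.58s/11.54u sec elapsed 16.88 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.50s/12.37u sec elapsed 17.60 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.14s/11.28u sec elapsed 16.11 sec
"""

PATCHED = """\
LOG: internal sort ended, 1723933 KB used: CPU 2.57s/11.97u sec elapsed 17.39 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.30s/12.57u sec elapsed 17.68 sec
LOG: internal sort ended, 1723933 KB used: CPU 2.54s/11.99u sec elapsed 17.25 sec
"""

# Matches the wall-clock figure at the end of each log line.
ELAPSED = re.compile(r"elapsed ([0-9.]+) sec")


def elapsed_times(log_text):
    """Return the elapsed seconds found in the log text, in order (hypothetical helper)."""
    return [float(m.group(1)) for m in ELAPSED.finditer(log_text)]


def median_regression_pct(baseline_log, patched_log):
    """Percentage change of the median elapsed time, patched relative to baseline."""
    base = median(elapsed_times(baseline_log))
    patched = median(elapsed_times(patched_log))
    return (patched - base) / base * 100.0


if __name__ == "__main__":
    # Medians are 16.88 s (master) and 17.39 s (patched).
    print(f"{median_regression_pct(MASTER, PATCHED):.1f}%")  # prints 3.0%
```

With only three runs per branch, the medians are simply the middle values, so the figure is sensitive to run-to-run noise; more runs would be needed to pin the regression down.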

I'm not sure whether there are other cases that we might also want to
test; I think stuff that runs all on the server side is likely to show
up problems more clearly than pgbench. Maybe a PL/pgsql loop that
does something allocation-intensive on each iteration, for example,
like parsing a big JSON document.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
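
One way the suggested server-side test might look is sketched below. It is an illustration under assumptions, not a test anyone in the thread ran: the script generates a PL/pgSQL DO block that casts a large text value to `json` on every loop iteration. The function name `build_do_block`, the document shape, and the default sizes are all hypothetical, and whether this workload actually stresses the allocator enough to expose a difference is untested.

```python
import json


def build_do_block(n_items=10000, iterations=1000):
    """Return the text of a PL/pgSQL DO block that parses a large JSON document in a loop.

    The document shape and sizes are arbitrary assumptions; any large,
    valid JSON document would serve the same purpose.
    """
    doc = json.dumps([{"id": i, "name": f"item-{i}"} for i in range(n_items)])
    return (
        "DO $body$\n"
        "DECLARE\n"
        # A separate dollar-quote tag keeps the JSON's double quotes intact.
        f"    doc text := $doc${doc}$doc$;\n"
        "    parsed json;\n"
        "BEGIN\n"
        f"    FOR i IN 1..{iterations} LOOP\n"
        "        -- The cast runs the JSON parser on each iteration.\n"
        "        parsed := doc::json;\n"
        "    END LOOP;\n"
        "END\n"
        "$body$;\n"
    )


if __name__ == "__main__":
    # Write the SQL to stdout so it can be piped into psql.
    print(build_do_block())
```

Saved under a name such as `gen_json_loop.py` (hypothetical), the output could be piped into `psql` under `time`, alternating between master and the patched build in the same way as the reindex test.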