Re: A potential memory leak on Merge Join when Sort node is not below Materialize node
От | David Rowley |
---|---|
Тема | Re: A potential memory leak on Merge Join when Sort node is not below Materialize node |
Дата | |
Msg-id | CAApHDvpxQsTX0rYceTQYeSqUZzpyFQkFVKOV=UUB68=NLcJAFg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: A potential memory leak on Merge Join when Sort node is not below Materialize node (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: A potential memory leak on Merge Join when Sort node is not below Materialize node
|
Список | pgsql-hackers |
On Thu, 29 Sept 2022 at 12:07, Peter Geoghegan <pg@bowt.ie> wrote: > Also potentially relevant: the 2017 commit fa117ee4 anticipated adding > a "copy" argument to tuplesort_getdatum() (the same commit added such > a "copy" argument to tuplesort_gettupleslot()). I see that that still > hasn't happened to tuplesort_getdatum() all these years later. Might > be a good idea to do it in the next year or two, though. > > If David is interested in pursuing this now then I certainly won't object. Just while this is fresh in my head, I wrote some code to make this happen. My preference would be not to add the "copy" param to the existing function and instead just add a new function to prevent additional branching. The attached puts back the datum sort in nodeSort.c for byref types and adjusts process_ordered_aggregate_single() to make use of this function. I did a quick benchmark to see if this help DISTINCT aggregate any: create table t1 (a varchar(32) not null, b varchar(32) not null); insert into t1 select md5((x%10)::text),md5((x%10)::text) from generate_Series(1,1000000)x; vacuum freeze t1; create index on t1(a); With a work_mem of 256MBs I get: query = select max(distinct a), max(distinct b) from t1; Master: latency average = 313.197 ms Patched: latency average = 304.335 ms So not a very impressive speedup there (about 3%) Some excerpts from perf top show: Master: 1.40% postgres [.] palloc 1.13% postgres [.] tuplesort_getdatum 0.77% postgres [.] datumCopy Patched: 0.91% postgres [.] tuplesort_getdatum_nocopy 0.65% postgres [.] palloc I stared for a while at the mode_final() function and thought maybe we could use the nocopy variant there. I just didn't quite pluck up the motivation to write any code to see if it could be made faster. David
Вложения
В списке pgsql-hackers по дате отправления: