Re: inheritance: planning time vs children number vs column number
От | Marc Cousin |
---|---|
Тема | Re: inheritance: planning time vs children number vs column number |
Дата | |
Msg-id | 201103010842.42128.cousinmarc@gmail.com обсуждение исходный текст |
Ответ на | Re: inheritance: planning time vs children number vs column number (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: inheritance: planning time vs children number vs column number
|
Список | pgsql-performance |
Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit : > Marc Cousin <cousinmarc@gmail.com> writes: > > The Monday 28 February 2011 16:35:37, Tom Lane wrote : > >> Could we see a concrete example demonstrating that? I agree with Heikki > >> that it's not obvious what you are testing that would have such > >> behavior. I can think of places that would have O(N^2) behavior in the > >> length of the targetlist, but it seems unlikely that they'd come to > >> dominate runtime at a mere 1000 columns. > > > > I feel a little silly not having provided a test case from the startق� > > > > A script doing a complete test is attached to this email. > > I did some oprofile analysis of this test case. It's spending > essentially all its time in SearchCatCache, on failed searches of > pg_statistic. The cache accumulates negative entries for each probed > column, and then the searches take time proportional to the number of > entries, so indeed there is an O(N^2) behavior --- but N is the number > of columns times number of tables in your test case, not just the number > of columns. > > The cache is a hash table, so ideally the search time would be more or > less constant as the table grows, but to make that happen we'd need to > reallocate with more buckets as the table grows, and catcache.c doesn't > do that yet. We've seen a few cases that make that look worth doing, > but they tend to be pretty extreme, like this one. > > It's worth pointing out that the only reason this effect is dominating > the runtime is that you don't have any statistics for these toy test > tables. If you did, the cycles spent using those entries would dwarf > the lookup costs, I think. So it's hard to get excited about doing > anything based on this test case --- it's likely the bottleneck would be > somewhere else entirely if you'd bothered to load up some data. > > regards, tom lane Yes, for the same test case, with a bit of data in every partition and statistics up to date, planning time goes from 20 seconds to 125ms for the 600 children/1000 columns case. Which is of course more than acceptable. Now I've got to check it's the same problem on the real environment. I think it has quite a few empty partitions, so no statistics for them… Thanks a lot. Marc
В списке pgsql-performance по дате отправления: