Re: DBT-3 with SF=20 got failed
From | Tomas Vondra |
---|---|
Subject | Re: DBT-3 with SF=20 got failed |
Date | |
Msg-id | 5604278D.4030003@2ndquadrant.com |
In reply to | Re: DBT-3 with SF=20 got failed (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: DBT-3 with SF=20 got failed |
List | pgsql-hackers |
On 09/24/2015 05:09 PM, Robert Haas wrote:
> On Thu, Sep 24, 2015 at 9:49 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> So while it does not introduce behavior change in this particular
>> case (because it fails, as you point out), it introduces a behavior
>> change in general - it simply triggers behavior that does not
>> happen below the limit. Would we accept the change if the proposed
>> limit was 256MB, for example?
>
> So, I'm a huge fan of arbitrary limits.
>
> That's probably the single thing I'll say this year that sounds most
> like a troll, but it isn't. I really, honestly believe that.
> Doubling things is very sensible when they are small, but at some
> point it ceases to be sensible. The fact that we can't set a
> black-and-white threshold as to when we've crossed over that line
> doesn't mean that there is no line. We can't imagine that the
> occasional 32GB allocation when 4GB would have been optimal is no
> more problematic than the occasional 32MB allocation when 4MB would
> have been optimal. Where exactly to put the divider is subjective,
> but "what palloc will take" is not an obviously unreasonable
> barometer.

There are two machines - one with 32GB of RAM and work_mem=2GB, the
other one with 256GB of RAM and work_mem=16GB. The machines are hosting
about the same data, just scaled accordingly (~8x more data on the
large machine).

Let's assume there's a significant over-estimate - we expect to get
about 10x the actual number of tuples, and the hash table is expected
to almost exactly fill work_mem. Using the 1:3 ratio (as in the query
at the beginning of this thread) we'll use ~512MB and ~4GB for the
buckets, and the rest is for entries.

Thanks to the 10x over-estimate, ~64MB and ~512MB would be enough for
the buckets, so we're wasting ~448MB (~1.3% of RAM) on the small
machine and ~3.5GB (~1.3% of RAM) on the large machine. The relative
waste is the same in both cases, yet a limit based on what palloc will
take only kicks in on the large one. How does it make any sense to
address one case and not the other?

> Of course, if we can postpone sizing the hash table until after the
> input size is known, as you suggest, then that would be better still
> (but not back-patch material).

This dynamic resize is 9.5-only anyway.

regards

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
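The back-of-the-envelope numbers in the example above can be reproduced with a small standalone sketch. This is illustrative only, not the actual nodeHash.c code; the 8-byte bucket pointer, the power-of-two rounding of the bucket count, and one tuple per bucket (NTUP_PER_BUCKET = 1 in 9.5) are assumptions for the purpose of the arithmetic:

```c
/*
 * Rough illustration of the bucket-array waste described above.
 * Assumptions (not the real PostgreSQL code): buckets are an array of
 * 8-byte pointers, the bucket count is rounded up to a power of two,
 * one tuple is expected per bucket, and the planner's row estimate is
 * 10x the real row count.
 */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#define BUCKET_PTR_SIZE 8       /* size of one bucket slot (a pointer) */

/* round up to the next power of two, as the bucket count must be */
static uint64_t
next_pow2(uint64_t n)
{
    uint64_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* bytes needed for the bucket array, one bucket per expected tuple */
static uint64_t
bucket_bytes(uint64_t ntuples)
{
    return next_pow2(ntuples) * BUCKET_PTR_SIZE;
}

int
main(void)
{
    /* hash table sized to fill work_mem = 2GB, buckets : entries = 1 : 3 */
    uint64_t work_mem = 2ULL * 1024 * 1024 * 1024;
    uint64_t est_tuples = (work_mem / 4) / BUCKET_PTR_SIZE; /* ~64M rows */
    uint64_t real_tuples = est_tuples / 10;                 /* 10x over-estimate */

    uint64_t est_buckets = bucket_bytes(est_tuples);
    uint64_t real_buckets = bucket_bytes(real_tuples);

    printf("allocated for buckets: %" PRIu64 " MB\n", est_buckets >> 20);
    printf("actually needed:       %" PRIu64 " MB\n", real_buckets >> 20);
    printf("wasted:                %" PRIu64 " MB\n",
           (est_buckets - real_buckets) >> 20);
    return 0;
}
```

With work_mem = 2GB this prints 512MB allocated, 64MB needed, 448MB wasted; setting work_mem to 16GB scales everything by 8x, which is where the ~4GB / ~512MB / ~3.5GB figures for the large machine come from.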