Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
От | David Geier |
---|---|
Тема | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
Дата | |
Msg-id | 6a3f41fd-7103-28e2-dfa0-a7f1e6b89329@gmail.com обсуждение исходный текст |
Ответ на | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? (Andres Freund <andres@anarazel.de>) |
Список | pgsql-hackers |
Hi > > I think at least some should be converted to just accumulate in an > instr_time... I think that's for a later patch though? > Yep, at least quite similar. OK. I coded it up in the latest version of the patch. > Depending on how low we want to keep the error, I don't think we can: > > If I set the allowed deviation to 10**-9, we end up requiring a shift by 29 > for common ghz ranges. Clearly 33bits isn't an interesting range. > > But even if you accept a higher error - we don't have *that* much range > available. Assuming an uint64, the range is ~584 years. If we want 10 years > range, we end up > > math.log(((2**64)-1) / (10 * 365 * 60 * 60 * 24 * 10**9), 2) > ~= 5.87 > > So 5 bits available that we could "use" for multiply/shift. For something like > 2.5ghz, that'd be ~2% error, clearly not acceptable. And even just a year of > range, ends up allowing a failure of 30796s = 8min over a year, still too > high. Thanks for doing the math. Agreed. The latest patch detects overflow and correctly handles it. > But I don't think it's really an issue - normally that branch will never be > taken (at least within the memory of the branch predictor), which on modern > CPUs means it'll just be predicted as not taken. So as long as we tell the > compiler what's the likely branch, it should be fine. At least as long as the > branch compares with a hardcoded number. Yeah. The overflow detection just compares two int64. The "overflow threshold" is pre-computed. > - the restriction just to linux, that'll make testing harder for some, and > ends up encoding too much OS dependency > - I think we need both the barrier and non-barrier variant, otherwise I > suspect we'll end up with inccuracies we don't want > - needs lots more documentation about why certain cpuid registers are used > - cpu microarch dependencies - isn't there, e.g., the case that the scale on > nehalem has to be different than on later architectures? > - lack of facility to evaluate how well the different time sources work Makes sense. I carried that list over to my latest e-mail which also includes the patch to have some sort of summary of where we are in a single place. -- David Geier (ServiceNow)
В списке pgsql-hackers по дате отправления: