Re: [HACKERS] strange behavior of UPDATE
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] strange behavior of UPDATE |
Дата | |
Msg-id | 24700.937963003@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [HACKERS] strange behavior of UPDATE (Bruce Momjian <maillist@candle.pha.pa.us>) |
Список | pgsql-hackers |
Bruce Momjian <maillist@candle.pha.pa.us> writes: > Tom, I assume this is done, right? Only partly. I changed _bt_binsrch to use a pure binary search, but didn't yet get around to fixing _bt_findsplitloc (it's more complicated :-(). Vadim seemed to think that a better solution would be to fix the comparison rules so that no two entries are ever considered to have equal keys anyway. So I put my change on the back burner ... but if his change doesn't happen, mine ought to get done. >> I wrote: >>>> then a possible explanation for the problem would be that our >>>> btree index code isn't very fast when there are large numbers of >>>> identical keys. >> >> Ah-hah, a lucky guess! I went back and looked at the profile stats >> I'd extracted from Edmund's "update" example. This Linux box has >> the same semi-functional gprof as someone else was using a while >> ago --- the timings are bogus, but the call counts seem right. >> And what I find are entries like this: >> >> 0.00 0.00 284977/284977 _bt_binsrch [3174] >> [3177] 0.0 0.00 0.00 284977 _bt_firsteq [3177] >> 0.00 0.00 21784948/24713758 _bt_compare [3169] >> >> 0.00 0.00 426/35632 _bt_split [53] >> 0.00 0.00 35206/35632 _bt_insertonpg [45] >> [3185] 0.0 0.00 0.00 35632 _bt_findsplitloc [3185] >> 0.00 0.00 5093972/8907411 _bt_itemcmp [3171] >> >> In other words, _bt_firsteq is averaging almost 100 comparisons per >> call, _bt_findsplitloc well over that. Both of these routines are >> evidently designed on the assumption that there will be relatively >> few duplicate keys --- they reduce to linear scans when there are >> many duplicates. >> >> _bt_firsteq shouldn't exist at all; the only routine that calls it >> is _bt_binsrch, which does a fast binary search of the index page. >> _bt_binsrch should be fixed so that the binary search logic does the >> right thing for equal-key cases, rather than degrading to a linear >> search. I am less confident that I understand _bt_findsplitloc's place >> in the great scheme of things, but it could certainly be revised to use >> a binary search rather than linear scan. >> >> This benchmark is clearly overemphasizing the equal-key case, but >> I think it ought to be fixed independently of whether we want to >> look good on a dubious benchmark ... equal keys are not uncommon in >> real-world scenarios after all. >> >> Next question is do we want to risk twiddling this code so soon before >> 6.5, or wait till after?
В списке pgsql-hackers по дате отправления: