Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)
From | Greg Smith |
---|---|
Subject | Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement) |
Date | |
Msg-id | 51E2E61F.5020807@2ndQuadrant.com |
In reply to | Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement) (Fabien COELHO <coelho@cri.ensmp.fr>) |
Responses | Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement) |
List | pgsql-hackers |
On 7/13/13 12:13 PM, Fabien COELHO wrote:
> My 0.02€: if it means adding complexity to the pgbench code, I think
> that it is not worth it. The point of pgbench is to look at a steady
> state, not to end in the most graceful possible way as far as measures
> are concerned.

That's how some people use pgbench. I'm just as likely to use it to characterize system latency. If there's a source of latency that's specific to the pgbench code, I want that out of there even if it's hard. But we don't have to argue about that because it isn't. The attached new patch seems to fix the latency spikes at the end, with -2 lines of new code! With that resolved I did a final pass across the rate limit code too, attached as a v14 and ready for a committer. I don't really care what order these two changes are committed, there's no hard dependency, but I would like to see them both go in eventually.

No functional code was changed from your v13 except for tweaking the output. The main thing I did was expand/edit comments and rename a few variables to try and make this easier to read. If you have any objections to my cosmetic changes feel free to post an update. I've put a good bit of time into trying to simplify this further, thinking it can't really be this hard. But this seems to be the minimum complexity that still works given the mess of the pgbench state machine. Every change I try now breaks something.

To wrap up the test results motivating my little pgbench-delay-finish patch, the throttled cases that were always giving >100ms of latency clustered at the end here now look like this:

average rate limit lag: 0.181 ms (max 53.108 ms)
tps = 10088.727398 (including connections establishing)
tps = 10105.775864 (excluding connections establishing)

There are still some of these cases where latency spikes, but they're not as big and they're randomly distributed throughout the run. The problem I had with the ones at the end is how they tended to happen a few times in a row. I kept seeing multiple of these ~50ms lulls adding up to a huge one, because the source of the lag kept triggering at every connection close.

pgbench was already cleaning up all of its connections at the end, after all the transactions were finished. It looks safe to me to just rely on that for calling PQfinish in the normal case. And calls to client_done already label themselves ok or abort, the code just didn't do anything with that state before. I tried adding some more complicated state tracking, but that adds complexity while doing the exact same thing as the simple implementation I did.

The only part of your code change I reverted was altering the latency log transaction timestamps to read like "1373821907.65702" instead of "1373821907 65702". Both formats were considered when I added that feature, and I completely understand why you'd like to change it. One problem is that doing so introduces a new class of float parsing and rounding issues to consumers of that data. I'd rather not revisit that without a better reason to break the output format. Parsing tools like my pgbench-tools already struggle trying to support multiple versions of pgbench, and I don't think there's enough benefit to the float format to bother breaking them today.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
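
Editorial illustration (not part of the original message): the concern about the timestamp format can be sketched from a log consumer's side. The function names below are hypothetical and are not taken from pgbench or pgbench-tools; the sketch assumes the two-field form is whole seconds followed by microseconds. It shows that the two-field form parses with integer arithmetic only, while a dotted form needs a decimal parser and matches the two-field value only if the fraction is zero-padded to six digits.

```python
from decimal import Decimal


def parse_split_timestamp(sec_field: str, usec_field: str) -> int:
    """Two-field form, e.g. "1373821907 65702" (assumed: seconds, microseconds).

    Returns microseconds since the epoch using integer arithmetic only,
    so no rounding can occur.
    """
    return int(sec_field) * 1_000_000 + int(usec_field)


def parse_dotted_timestamp(text: str) -> int:
    """Dotted form, e.g. "1373821907.065702".

    Decimal avoids binary floating point; a consumer that calls float()
    instead has to choose a rounding rule when converting back to
    integer microseconds.
    """
    return int(Decimal(text) * 1_000_000)


split = parse_split_timestamp("1373821907", "65702")
padded = parse_dotted_timestamp("1373821907.065702")
unpadded = parse_dotted_timestamp("1373821907.65702")

print(split == padded)    # same instant when the fraction is zero-padded
print(split == unpadded)  # differs: ".65702" reads as 657020 microseconds
```

A consumer that must accept both layouts needs to detect which one a given log uses before parsing, which is the kind of version handling the message says parsing tools already struggle with.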