Re: PATCH: pgbench - random sampling of transaction written into log
From | Robert Haas |
---|---|
Subject | Re: PATCH: pgbench - random sampling of transaction written into log |
Date | |
Msg-id | CA+TgmoYENHaLJoWDvMwusTxeUatp2Fp3Hd7h-tRj2Jc5X5u-qw@mail.gmail.com |
In reply to | Re: PATCH: pgbench - random sampling of transaction written into log (Tomas Vondra <tv@fuzzy.cz>) |
Responses | Re: PATCH: pgbench - random sampling of transaction written into log |
List | pgsql-hackers |
On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> Attached is an improved patch, with a call to rand() replaced with
> getrand().
>
> I was thinking about the counter but I'm not really sure how to handle
> cases like "39%" - I'm not sure a plain (counter % 100 < 37) is not a
> good sampling, because it always keeps continuous sequences of
> transactions. Maybe there's a clever way to use a counter, but let's
> stick to a getrand() unless we can prove it's causing issues. Especially
> considering that a lot of data won't be written at all with low
> sampling rates.

I like this patch, and I think sticking with a random number is a good
idea. But I have two suggestions.

Number one, I think the sampling rate should be stored as a float, not
an integer, because I can easily imagine wanting a sampling rate that
is not an integer percentage - especially, one that is less than one
percent, like half a percent or a tenth of a percent.

Also, I suggest that the command-line option should be a long option
rather than a single character option. That will be more mnemonic and
avoid using up too many single letter options, of which there is a
limited supply. So to sample every hundredth result, you could do
something like this:

    pgbench --latency-sample-rate 0.01

Another option I personally think would be useful is an option to
record only those latencies that are above some minimum bound, like
this:

    pgbench --latency-only-if-more-than $MICROSECONDS

The problem with recording all the latencies is that it tends to have
a material impact on throughput. Your patch should address that for
the case where you just want to characterize the latency, but it would
also be nice to have a way of recording the outliers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
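
For illustration only, here is a minimal C sketch of the two log filters discussed above: random sampling with a fractional rate, and a latency threshold for outliers. It is not the code from the patch. The names uniform01, sample_hit, exceeds_threshold and count_sampled are hypothetical, and a small self-contained generator stands in for pgbench's getrand() so that the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a random source (NOT pgbench's getrand()):
 * a 64-bit linear congruential step whose top 53 bits are scaled to a
 * double in [0, 1).
 */
double uniform01(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double) (*state >> 11) / 9007199254740992.0;   /* divide by 2^53 */
}

/*
 * Random sampling with a fractional rate, e.g. 0.01 for roughly one
 * transaction in a hundred, or 0.001 for one in a thousand. Each
 * transaction is decided independently, so the log does not keep
 * contiguous runs the way (counter % 100 < N) would.
 */
bool sample_hit(double rate, uint64_t *state)
{
    assert(rate >= 0.0 && rate <= 1.0);
    return uniform01(state) < rate;
}

/* Outlier filter: true only when the latency is strictly above the bound. */
bool exceeds_threshold(int64_t latency_us, int64_t min_latency_us)
{
    return latency_us > min_latency_us;
}

/* Helper for experimenting: how many of n transactions would be logged. */
long count_sampled(long n, double rate, uint64_t seed)
{
    uint64_t state = seed;
    long     hits = 0;

    for (long i = 0; i < n; i++)
    {
        if (sample_hit(rate, &state))
            hits++;
    }
    return hits;
}
```

Storing the rate as a double is what makes sub-percent rates such as 0.005 expressible. A rate of 0.0 logs nothing and a rate of 1.0 logs everything, because the generated value is always below 1.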