Discussion: track_io_timing default setting

track_io_timing default setting

From: Jeff Janes
Date:
Can we change the default setting of track_io_timing to on?

I see a lot of questions, such as over at stackoverflow or dba.stackexchange.com, where people ask for help with plans that would be much more useful if this were on.  Maybe they just don't know better, or maybe they can't turn it on because they are not a superuser.

I can't imagine a lot of people who care much about its performance impact will be running the latest version of PostgreSQL on ancient/weird systems that have slow clock access. (And the few who do can just turn it off for their system).

For systems with fast user-space clock access, I've never seen turning this setting on make a noticeable dent in performance.  Maybe I just never tested enough in the most adverse scenario (which I guess would be a huge FS cache, a small shared_buffers, and a high CPU count with constant churning of pages that hit the FS cache but miss shared buffers--not a system I have handy to do a lot of tests with.)

Cheers,

Jeff

RE: track_io_timing default setting

From: Jakub Wartak
Date:
> Can we change the default setting of track_io_timing to on?

+1 for better observability by default.

> I can't imagine a lot of people who care much about its performance impact will be running the latest version of PostgreSQL on ancient/weird systems that have slow clock access. (And the few who do can just turn it off for their system).
> For systems with fast user-space clock access, I've never seen turning this setting on make a noticeable dent in performance. Maybe I just never tested enough in the most adverse scenario (which I guess would be a huge FS cache, a small shared_buffers, and a high CPU count with constant churning of pages that hit the FS cache but miss shared buffers--not a system I have handy to do a lot of tests with.)

Coincidentally, I have some quick notes on measuring the impact of changing the "clocksource" on Linux 5.10.x (real syscall vs vDSO optimization) with PgSQL 13.x, as input to the discussion. The thing is that the slow "xen" implementation (at least on AWS i3, Amazon Linux 2) is the default, because time read from the faster TSC/RDTSC sources can apparently drift backwards, e.g. during a potential(?) VM live migration. I haven't seen a better way to see what happens under the hood than strace and/or measuring a huge number of calls. Of course this only shows the impact on PostgreSQL as a whole (with track_io_timing=on), not just the difference between track_io_timing=on and off. IMHO better knowledge (in EXPLAIN ANALYZE, autovacuum) is worth more than this potential degradation when using slow clocksources.
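
For reference, the testclock program itself is not included here; a minimal sketch of what such a micro-benchmark might look like (purely illustrative - just gettimeofday() in a tight loop, so that "time" shows whether the calls stay in user space via the vDSO or fall back to real syscalls) would be:

/* testclock.c - hypothetical reconstruction, not the original program */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int
main(int argc, char **argv)
{
    long long       iterations = 1000000000LL;     /* adjust as needed */
    struct timeval  tv;

    if (argc > 1)
        iterations = atoll(argv[1]);

    for (long long i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);    /* vDSO fast path or a real syscall,
                                     * depending on the clocksource */

    /* print the last reading so the loop cannot be optimized away */
    printf("last tv_sec=%ld tv_usec=%ld after %lld calls\n",
           (long) tv.tv_sec, (long) tv.tv_usec, iterations);
    return 0;
}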

With /sys/bus/clocksource/devices/clocksource0/current_clocksource=xen (default on most AWS instances; ins.pgb = a simple insert into a table with only a PK, filled from a sequence):
# time ./testclock # 10e9 calls of gettimeofday()
real    0m58.999s
user    0m35.796s
sys     0m23.204s

//pgbench
    transaction type: ins.pgb
    scaling factor: 1
    query mode: simple
    number of clients: 8
    number of threads: 2
    duration: 100 s
    number of transactions actually processed: 5511485
    latency average = 0.137 ms
    latency stddev = 0.034 ms
    tps = 55114.743913 (including connections establishing)
    tps = 55115.999449 (excluding connections establishing)

With /sys/bus/clocksource/devices/clocksource0/current_clocksource=tsc:
# time ./testclock # 10e9 calls of gettimeofday()
real    0m2.415s
user    0m2.415s
sys     0m0.000s # XXX: notice, userland only workload, no %sys part

//pgbench:
    transaction type: ins.pgb
    scaling factor: 1
    query mode: simple
    number of clients: 8
    number of threads: 2
    duration: 100 s
    number of transactions actually processed: 6190406
    latency average = 0.123 ms
    latency stddev = 0.035 ms
    tps = 61903.863938 (including connections establishing)
    tps = 61905.261175 (excluding connections establishing)

In addition, what could be done here - if that XXX note holds true on more platforms - is to measure the cost of many gettimeofday() calls via rusage() during startup, and log a warning suggesting a check of the OS clock implementation if it takes relatively too long and/or the %sys part is > 0. I don't know what to suggest for the potential time going backwards, but setting track_io_timing=on doesn't feel like it is going to make stuff crash, so again I think it is a good idea.
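
A minimal sketch of that startup probe (my own illustration, not an actual PostgreSQL patch; the 100 ns threshold is an arbitrary placeholder) could look like this:

/* clockprobe.c - hypothetical sketch of the rusage()-based startup check */
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

#define PROBE_CALLS 1000000

static double
tv_diff_usec(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_usec - a.tv_usec);
}

int
main(void)
{
    struct rusage   before, after;
    struct timeval  start, stop, tv;

    getrusage(RUSAGE_SELF, &before);
    gettimeofday(&start, NULL);

    for (int i = 0; i < PROBE_CALLS; i++)
        gettimeofday(&tv, NULL);

    gettimeofday(&stop, NULL);
    getrusage(RUSAGE_SELF, &after);

    double per_call_ns = tv_diff_usec(start, stop) * 1000.0 / PROBE_CALLS;
    double sys_usec = tv_diff_usec(before.ru_stime, after.ru_stime);

    printf("per-call cost: %.1f ns, system CPU time spent: %.0f us\n",
           per_call_ns, sys_usec);

    /* on a vDSO fast path the %sys part should be (close to) zero */
    if (per_call_ns > 100.0 || sys_usec > 0.0)
        printf("WARNING: clock reads look expensive; consider checking "
               "the OS clocksource implementation\n");
    return 0;
}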

-Jakub Wartak.



Re: track_io_timing default setting

From: Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> Can we change the default setting of track_io_timing to on?

That adds a very significant amount of overhead on some platforms
(gettimeofday is not cheap if it requires a kernel call).  And I
doubt the claim that the average Postgres user needs this, and
doubt even more that they need it on all the time.
So I'm -1 on the idea.

            regards, tom lane



Re: track_io_timing default setting

From: Laurenz Albe
Date:
On Fri, 2021-12-10 at 10:20 -0500, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > Can we change the default setting of track_io_timing to on?
> 
> That adds a very significant amount of overhead on some platforms
> (gettimeofday is not cheap if it requires a kernel call).  And I
> doubt the claim that the average Postgres user needs this, and
> doubt even more that they need it on all the time.
> So I'm -1 on the idea.

I set "track_io_timing" to "on" all the time, same as "log_lock_waits",
so I'd want them both on by default.

Yours,
Laurenz Albe




Re: track_io_timing default setting

From: Tomas Vondra
Date:
On 12/10/21 17:22, Laurenz Albe wrote:
> On Fri, 2021-12-10 at 10:20 -0500, Tom Lane wrote:
>> Jeff Janes <jeff.janes@gmail.com> writes:
>>> Can we change the default setting of track_io_timing to on?
>>
>> That adds a very significant amount of overhead on some platforms
>> (gettimeofday is not cheap if it requires a kernel call).  And I
>> doubt the claim that the average Postgres user needs this, and
>> doubt even more that they need it on all the time.
>> So I'm -1 on the idea.
> 
> I set "track_io_timing" to "on" all the time, same as "log_lock_waits",
> so I'd want them both on by default.
> 

IMHO those options have very different overhead - log_lock_waits logs
only stuff that exceeds deadlock_timeout (1s by default), so the number
of gettimeofday() calls is minuscule compared to calling it for every
I/O request.

I wonder if we could simply do the thing we usually do when measuring
expensive stuff - measure just a small sample. That is, we wouldn't
measure timing for every I/O request, but just a small fraction. For
cases with a lot of I/O requests that should give a pretty good picture.

That's not a simple "change GUC default" patch, but it's not a very 
complicated patch either.
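
To make that a bit more concrete, here is a rough sketch of sampled I/O timing (my own illustration, not any existing patch; the names and the sampling interval are made up): time only every Nth request and extrapolate.

/* Sampled I/O timing sketch: clock only every Nth read, then scale up. */
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

#define IO_TIMING_SAMPLE_EVERY 16       /* hypothetical sampling interval */

typedef struct IoTimingStats
{
    uint64_t    total_reads;            /* all reads, timed or not */
    uint64_t    sampled_reads;          /* reads that were actually timed */
    double      sampled_usec;           /* time accumulated for sampled reads */
} IoTimingStats;

static IoTimingStats io_stats;

/* stand-in for the real read path */
static void
perform_read(void *buf)
{
    (void) buf;                         /* actual I/O would happen here */
}

void
timed_read(void *buf)
{
    bool            sample = (io_stats.total_reads % IO_TIMING_SAMPLE_EVERY) == 0;
    struct timeval  t0, t1;

    io_stats.total_reads++;

    if (sample)
        gettimeofday(&t0, NULL);

    perform_read(buf);

    if (sample)
    {
        gettimeofday(&t1, NULL);
        io_stats.sampled_reads++;
        io_stats.sampled_usec += (t1.tv_sec - t0.tv_sec) * 1e6
                               + (t1.tv_usec - t0.tv_usec);
    }
}

/* Estimated total I/O time: average sampled cost scaled to the total count. */
double
estimated_io_usec(void)
{
    if (io_stats.sampled_reads == 0)
        return 0.0;
    return io_stats.sampled_usec / io_stats.sampled_reads * io_stats.total_reads;
}

With a sampling interval of 16, the number of clock reads drops by a factor of 16, at the cost of some noise in the reported totals.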

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



RE: [EXTERNAL] Re: track_io_timing default setting

From: "Godfrin, Philippe E"
Date:
>-----Original Message-----
>Sent: Friday, December 10, 2021 9:20 AM
>To: Jeff Janes <jeff.janes@gmail.com>
>Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
>Subject: [EXTERNAL] Re: track_io_timing default setting

>Jeff Janes <jeff.janes@gmail.com> writes:
> Can we change the default setting of track_io_timing to on?

>That adds a very significant amount of overhead on some platforms (gettimeofday is not cheap if it requires a kernel call). And I doubt the claim that the average Postgres user needs this, and doubt even more that they need it on all the time.
>So I'm -1 on the idea.
>
>            regards, tom lane

In all honesty, the term "significant amount of overhead on some platforms" is ambiguous. Exactly how much overhead, and on what platforms??? I would prefer the documentation to say something on the order of:

    "Enables timing of database I/O calls. This parameter is historically off by default, because it will repeatedly
querythe operating system for the current time, which may increase overhead costs  of     elapsed time for each IO.
Platformsknown to incur a problematic overhead are, <etc, etc, etc>. To measure the overhead of timing on your system,
usethe pg_test_timing tool. This overhead may     become a performance issue when less than 90% of the tests execute
formore than 1 microsecond (us). Please refer to the pg_test_timing tool page for more details" 

I have the timing always turned on, but that doesn't necessarily mean the default should be changed. However, the documentation should be changed, as the current phrasing would probably discourage some folks from even trying. I ran the pg_test_timing tool and it came out to .000000023 seconds of overhead. Since we typically measure I/O in terms of milliseconds, this number is practically insignificant.
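
As a back-of-the-envelope check of that claim (my numbers: assuming two clock reads per I/O at the measured 23 ns each, against a 1 ms disk read):

\[
\frac{2 \times 23\ \text{ns}}{1\ \text{ms}}
  = \frac{46 \times 10^{-9}\ \text{s}}{10^{-3}\ \text{s}}
  \approx 4.6 \times 10^{-5} \approx 0.005\%
\]

Even for a 0.1 ms read served from the OS page cache, the same two clock reads amount to only about 0.05% of the elapsed time.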

As long as we're on the topic, the documentation for the pg_test_timing tool, as well as the output of the tool, leaves something to be desired. The tool output looks like this:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 23.02 ns
Histogram of timing durations:
  < us   % of total      count
     1     97.70191  127332403
     2      2.29729    2993997
     4      0.00007         90
     8      0.00069        904
    16      0.00004         57

Take note of the comment "Per loop time including overhead" - does that mean the overhead IS LESS than the reported 23.02 ns? Is that an issue with the actual test code or with the output prose? Furthermore, the tool's documentation goes on to say things like this:

    "The i7-860 system measured runs the count query in 9.8 ms while the EXPLAIN ANALYZE version takes 16.6 ms, each
processingjust over 100,000 rows. That 6.8 ms difference means the timing     overhead per row is 68 ns, about twice
whatpg_test_timing estimated it would be. Even that relatively small amount of overhead is making the fully timed count
statementtake almost 70% longer. On     more substantial queries, the timing overhead would be less problematic." 

IMHO this is misleading. That timing overhead is what EXPLAIN ANALYZE itself incurs, and it is most likely completely unrelated to the topic in question - that is, turning on I/O timing! What this paragraph implies, through the reader's chain of reasoning, is that IF you turn on track_io_timing you may end up with a 70% overhead!!! Umm - really???

Long story short, I'm perfectly fine with this 'overhead' - unless someone wants to refute this.
Regards,
phil