Обсуждение: High-CPU consumption on information_schema (only) query

Поиск
Список
Период
Сортировка

High-CPU consumption on information_schema (only) query

От
Robins Tharakan
Дата:
Hi,

An SQL (with only information_schema related JOINS) when triggered, runs with max CPU (and never ends - killed after 2 days).
- It runs similarly (very slow) on a replicated server that acts as a read-only slave.
- Top shows only postgres as hitting max CPU (nothing else). When query killed, CPU near 0%.
- When the DB is restored on a separate test server (with the exact postgresql.conf) the same query works fine.
- There is no concurrent usage on the replicated / test server (although the primary is a Production server and has concurrent users).

Questions:
- If this was a postgres bug or a configuration issue, query on the restored DB should have been slow too. Is there something very basic I am missing here?

If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant here. I amn't looking for a specific solution but what else should I be looking for here? 

p.s.: All postgres servers are running the v9.3.10

-
robins
--

-
robins

Re: High-CPU consumption on information_schema (only) query

От
Craig Ringer
Дата:
<p dir="ltr"><p dir="ltr">On 8 Sep. 2016 7:38 am, "Robins Tharakan" <<a
href="mailto:tharakan@gmail.com">tharakan@gmail.com</a>>wrote:<br /> ><br /> > Hi,<br /> ><br /> > An
SQL(with only information_schema related JOINS) when triggered, runs with max CPU (and never ends - killed after 2
days).<br/> > - It runs similarly (very slow) on a replicated server that acts as a read-only slave.<br /> > -
Topshows only postgres as hitting max CPU (nothing else). When query killed, CPU near 0%.<br /> > - When the DB is
restoredon a separate test server (with the exact postgresql.conf) the same query works fine.<br /> > - There is no
concurrentusage on the replicated / test server (although the primary is a Production server and has concurrent
users).<br/> ><br /> > Questions:<br /> > - If this was a postgres bug or a configuration issue, query on the
restoredDB should have been slow too. Is there something very basic I am missing here?<br /> ><br /> > If someone
asksfor I could provide SQL + EXPLAIN, but it feels irrelevant here. I amn't looking for a specific solution but what
elseshould I be looking for here? <p dir="ltr">Get a series of stack traces.<p dir="ltr">Perf with stack output would
begood too.<p dir="ltr">You need debug info for both. 

Re: High-CPU consumption on information_schema (only) query

От
Jeff Janes
Дата:
<p dir="ltr">On Wed, Sep 7, 2016 at 4:37 PM, Robins Tharakan <<a
href="mailto:tharakan@gmail.com">tharakan@gmail.com</a>>wrote:<br /><blockquote><p dir="ltr">><br
/></blockquote><pdir="ltr">> Hi,<br /> ><br /> > An SQL (with only information_schema related JOINS) when
triggered,runs with max CPU (and never ends - killed after 2 days).<br /> > - It runs similarly (very slow) on a
replicatedserver that acts as a read-only slave.<br /> > - Top shows only postgres as hitting max CPU (nothing
else).When query killed, CPU near 0%.<br /> > - When the DB is restored on a separate test server (with the exact
postgresql.conf)the same query works fine.<br /> > - There is no concurrent usage on the replicated /
test server (althoughthe primary is a Production server and has concurrent users).<br /> ><br /> > Questions:<br
/>> - If this was a postgres bug or a configuration issue, query on the restored DB should have been slow too. Is
theresomething very basic I am missing here?<br /> ><br /> > If someone asks for I could provide SQL + EXPLAIN,
butit feels irrelevant here. I amn't looking for a specific solution but what else should I be looking for here? <br
/><pdir="ltr">strace -ttt -T -y the process to see what system calls it is making.  If it is not doing many systme
calls,or they are uninformative, then attach the gdb debugger to it and periodically interrupt the process (ctrl c) and
takea back trace (bt), then restart it (c) and repeat.  If all the stack traces look similar, you will know where the
timeis going. <br />  <br /> Cheers, <p dir="ltr">Jeff<br /> 

Re: High-CPU consumption on information_schema (only) query

От
Andres Freund
Дата:
On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote:
> If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
> here.

Why is that? information_schema are normal sql queries, and some of them
far from trivial.

Andres



Re: High-CPU consumption on information_schema (only) query

От
Robins Tharakan
Дата:
On Fri, 9 Sep 2016 at 09:39 Andres Freund <andres@anarazel.de> wrote:
On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote:
> If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
> here.

Why is that? information_schema are normal sql queries, and some of them
far from trivial.

Andres
Hi Andres,

I completely agree. With 'irrelevant' I was only trying to imply that irrespective of the complexity of the query, a replicated box was seeing similar slowness whereas a Restored DB wasn't. It felt that the SQL itself isn't to blame here...

In effect, I was trying to ask if I am forgetting / missing something very obvious / important that could cause such an observation.

As others recommended, I am unable to have direct access to the production (master / slave) instances and so GDB / stack trace options are out of bounds at this time. I'll revert if I am able to do that.

-
thanks
robins


 
--

-
robins

Re: High-CPU consumption on information_schema (only) query

От
Tom Lane
Дата:
Robins Tharakan <tharakan@gmail.com> writes:
> I completely agree. With 'irrelevant' I was only trying to imply that
> irrespective of the complexity of the query, a replicated box was seeing
> similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
> isn't to blame here...

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.
        regards, tom lane



Re: High-CPU consumption on information_schema (only) query

От
Robins Tharakan
Дата:

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.

                        regards, tom lane

Thanks. EXPLAIN plans were different but (don't have them now and) didn't know system catalogs were so severely affected by outdated statistics as well (which is why I was looking for any other reasons I might be missing). I reckon an ANALYSE; should solve this? ... Would get back if I have something else to offer.

-
robins 
--

-
robins