Обсуждение: Enquiry about long-running queries

Поиск
Список
Период
Сортировка

Enquiry about long-running queries

От
Ashish Mukherjee
Дата:
Hello,

I have a query like this showing up on my production database - 

s05=> SELECT pid, user, usename, application_name, client_addr, client_hostname, client_port, datname, now() - query_start as "runtime",
state, wait_event_type, wait_event,
substr(query, 0, 100)
 FROM  pg_stat_activity
 WHERE now() - query_start > '5 minutes'::interval and state = 'active'
 ORDER BY runtime DESC;
   pid   | user | usename | application_name |  client_addr  | client_hostname | client_port | datname |        runtime
       | state  | wait_event_type |   wait_event   |                                               substr
-------+--------+-----------------+----------------+--------------------------------------------------------------------
  356274 | s05  | s05     | scandir          | 192.168.64.61 |                 |       44098 | s05     | 9 days 18:45:37
.65577 | active | IPC             | ParallelFinish | select scac_code from scac where supported_by_smc = true

The query when run from psql prompt finishes in a jiffy, so query performance/cost is not the problem. Also, when I try to kill the query through pg_terminate_backend or pg_cancel_backend, it does not get killed.

I am wondering what could be the root cause of this problem and how it could be addressed.  Any pointers would be appreciated.

Regards,
Ashish

Re: Enquiry about long-running queries

От
Ron Johnson
Дата:
On Fri, Nov 7, 2025 at 8:49 AM Ashish Mukherjee <ashish.mukherjee@gmail.com> wrote:
Hello,

I have a query like this showing up on my production database - 

s05=> SELECT pid, user, usename, application_name, client_addr, client_hostname, client_port, datname, now() - query_start as "runtime",
state, wait_event_type, wait_event,
substr(query, 0, 100)
 FROM  pg_stat_activity
 WHERE now() - query_start > '5 minutes'::interval and state = 'active'
 ORDER BY runtime DESC;
   pid   | user | usename | application_name |  client_addr  | client_hostname | client_port | datname |        runtime
       | state  | wait_event_type |   wait_event   |                                               substr
-------+--------+-----------------+----------------+--------------------------------------------------------------------
  356274 | s05  | s05     | scandir          | 192.168.64.61 |                 |       44098 | s05     | 9 days 18:45:37
.65577 | active | IPC             | ParallelFinish | select scac_code from scac where supported_by_smc = true

The query when run from psql prompt finishes in a jiffy, so query performance/cost is not the problem. Also, when I try to kill the query through pg_terminate_backend or pg_cancel_backend, it does not get killed.

If it runs in a jiffy, do you have time to kill it before it finishes?
 
I am wondering what could be the root cause of this problem and how it could be addressed.  Any pointers would be appreciated.

What problem?  The query existing, or not being able to kill a query that finishes before you have time to kill it?

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: Enquiry about long-running queries

От
Laurenz Albe
Дата:
On Fri, 2025-11-07 at 19:18 +0530, Ashish Mukherjee wrote:
> I have a query like this showing up on my production database - 
>
> s05=> SELECT pid, user, usename, application_name, client_addr, client_hostname, client_port, datname, now() -
query_startas "runtime", 
> state, wait_event_type, wait_event,
> substr(query, 0, 100)
>  FROM  pg_stat_activity
>  WHERE now() - query_start > '5 minutes'::interval and state = 'active'
>  ORDER BY runtime DESC;
>    pid   | user | usename | application_name |  client_addr  | client_hostname | client_port | datname |      
 runtime
>        | state  | wait_event_type |   wait_event   |                                               substr
>
-------+--------+-----------------+----------------+--------------------------------------------------------------------
>   356274 | s05  | s05     | scandir          | 192.168.64.61 |                 |       44098 | s05     | 9 days
18:45:37
> .65577 | active | IPC             | ParallelFinish | select scac_code from scac where supported_by_smc = true
>
> The query when run from psql prompt finishes in a jiffy, so query performance/cost is not the problem.
> Also, when I try to kill the query through pg_terminate_backend or pg_cancel_backend, it does not get killed.
>
> I am wondering what could be the root cause of this problem and how it could be addressed.  Any pointers would be
appreciated.

That is strange.  The wait event means that the backend is waiting for parallel
workers to finish.  But any existing parallel worker processes would also have
to show up in the query result.

On what operating system does PostgreSQL run?

What exact version is it?

You could try to "strace" the backend process (or use an equivalent tool, if
you are not on Linux) and see if the process issues any system calls.
To see what's going on, you'd have to attach to the backend process with a
debugger and take a stack trace.

Yours,
Laurenz Albe



Re: Enquiry about long-running queries

От
Adrian Klaver
Дата:
On 11/7/25 05:48, Ashish Mukherjee wrote:
> Hello,
> 
> I have a query like this showing up on my production database -

> The query when run from psql prompt finishes in a jiffy, so query

What client is the stuck query being run from?

> 
> Regards,
> Ashish


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Enquiry about long-running queries

От
Ashish Mukherjee
Дата:
pgsql 17.5 on Red Hat Enterprise Linux release 8.10 (Ootpa)

The query is run from one of our php applications.

On Fri, Nov 7, 2025 at 8:59 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2025-11-07 at 19:18 +0530, Ashish Mukherjee wrote:
> I have a query like this showing up on my production database - 
>
> s05=> SELECT pid, user, usename, application_name, client_addr, client_hostname, client_port, datname, now() - query_start as "runtime",
> state, wait_event_type, wait_event,
> substr(query, 0, 100)
>  FROM  pg_stat_activity
>  WHERE now() - query_start > '5 minutes'::interval and state = 'active'
>  ORDER BY runtime DESC;
>    pid   | user | usename | application_name |  client_addr  | client_hostname | client_port | datname |        runtime
>        | state  | wait_event_type |   wait_event   |                                               substr
> -------+--------+-----------------+----------------+--------------------------------------------------------------------
>   356274 | s05  | s05     | scandir          | 192.168.64.61 |                 |       44098 | s05     | 9 days 18:45:37
> .65577 | active | IPC             | ParallelFinish | select scac_code from scac where supported_by_smc = true
>
> The query when run from psql prompt finishes in a jiffy, so query performance/cost is not the problem.
> Also, when I try to kill the query through pg_terminate_backend or pg_cancel_backend, it does not get killed.
>
> I am wondering what could be the root cause of this problem and how it could be addressed.  Any pointers would be appreciated.

That is strange.  The wait event means that the backend is waiting for parallel
workers to finish.  But any existing parallel worker processes would also have
to show up in the query result.

On what operating system does PostgreSQL run?

What exact version is it?

You could try to "strace" the backend process (or use an equivalent tool, if
you are not on Linux) and see if the process issues any system calls.
To see what's going on, you'd have to attach to the backend process with a
debugger and take a stack trace.

Yours,
Laurenz Albe

Re: Enquiry about long-running queries

От
Greg Sabino Mullane
Дата:
Wow, 9 days is a long time. The fact that it is not responding to pg_terminate_backend means it's not even calling CHECK_FOR_INTERRUPTS inside the WaitForParallelWorkersToFinish loop, so something is seriously wrong. If the PID is still there, please do what Laurenz suggested:

strace -tp 356274

Is the problem reproducible, and has it happened more than once?

Cheers,
Greg