Discussion: Mat view sometimes taking 10x the time to refresh concurrently
Hey, we have a mat view that normally takes ~18m to re-materialize concurrently, which we do daily. Maybe once a week, it takes ~180m to refresh, and we're at a loss as to why.
We are running backups during the same general time window, in which we back up the schema where this mat view is located; could this be an issue? We have not noticed any lock errors in the backup or log, though.
Any tips on tracing this down would be appreciated.
Wells Oliver <wells.oliver@gmail.com> writes:
> Hey, we have a mat view that normally takes ~18m to re-materialize
> concurrently, which we do daily. Maybe once a week, it takes ~180m to
> refresh, and we're at a loss as to why.
> We are running backups during the same general time window, in which we
> back up the schema where this mat view is located; could this be an issue?
> We have not noticed any lock errors in the backup or log, though.
> Any tips on tracing this down would be appreciated.

My own mindset would be to wonder if a different/worse plan is being
chosen. You could investigate that perhaps by running an EXPLAIN
on the matview's query just before each refresh, to see if it changes.

Checking pg_locks for ungranted locks while the REFRESH is running
would be good to do too, although I'd think that a backup vs. a
concurrent refresh shouldn't have that sort of problem.

It seems possible also that you're just maxing out the machine's
I/O capacity between these two tasks.

			regards, tom lane
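A minimal way to act on the EXPLAIN suggestion, assuming psql ('my_matview' is a placeholder name): pull the view's defining query out of pg_matviews and EXPLAIN it just before the scheduled refresh, keeping the output with that day's run so fast and slow days can be compared:

-- psql: fetch the matview's defining query and EXPLAIN it in one step;
-- \gexec runs each result row as a statement
SELECT format('EXPLAIN %s', definition)
FROM pg_matviews
WHERE matviewname = 'my_matview' \gexec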
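And for the pg_locks check, a sketch to run from another session while the REFRESH is in flight, listing anything waiting on an ungranted lock and which relation it wants:

-- sessions waiting on ungranted locks, with the query that is waiting
SELECT l.pid, a.query, l.locktype, l.mode, l.relation::regclass AS rel
FROM pg_locks l
JOIN pg_stat_activity a USING (pid)
WHERE NOT l.granted;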
Possible scenarios (rough sketches for these checks follow below):
The tables in the view may have much more data now than they did earlier. If you track metrics for table size/db size, do you see growth there?
The tables and/or indexes may be bloated. You can VACUUM or pg_repack the tables and indexes used in the view if the bloat is too high.
Can you run the query directly and look at the EXPLAIN plan? It might give some hints if the estimates are off (see Tom's EXPLAIN sketch above).
Does the db server have enough resources? You can try bumping maintenance_work_mem and work_mem in a session and see if this helps speed things up.
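For the growth check, a sketch against pg_class (the relation names are placeholders for whatever base tables the matview reads):

-- total size (heap + indexes + toast) of the relations behind the view
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('base_table_1', 'base_table_2');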
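For the bloat check, one option is the pgstattuple extension, if it is available on the server (again, placeholder names):

-- dead_tuple_percent on the table; pgstatindex works on btree indexes only
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('base_table_1');
SELECT * FROM pgstatindex('base_table_1_pkey');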
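And for the resource point, per-session bumps that affect only the refresh (the values here are illustrative, not recommendations; size them to the machine's RAM):

-- work_mem helps sorts/hashes in the view's query;
-- maintenance_work_mem helps any index (re)builds
SET work_mem = '256MB';
SET maintenance_work_mem = '2GB';
REFRESH MATERIALIZED VIEW CONCURRENTLY my_matview;  -- placeholder name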
On Thu, 27 May 2021 at 21:45, Wells Oliver <wells.oliver@gmail.com> wrote:
> Hey, we have a mat view that normally takes ~18m to re-materialize
> concurrently, which we do daily. Maybe once a week, it takes ~180m to
> refresh, and we're at a loss as to why.
> We are running backups during the same general time window, in which we
> back up the schema where this mat view is located; could this be an issue?
> We have not noticed any lock errors in the backup or log, though.
> Any tips on tracing this down would be appreciated.
Thanks,
Vijay
Mumbai, India
Thanks, Tom. I will dig into those ideas. I do think the I/O capacity might ultimately be a big factor.
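One way to test the I/O theory, assuming track_io_timing = on (without it the *_time columns stay zero): snapshot pg_stat_database before and after the overlapping backup window and compare the deltas:

-- cumulative block I/O counts and time spent on it (ms) for this database
SELECT datname, blks_read, blks_hit,
       blk_read_time, blk_write_time
FROM pg_stat_database
WHERE datname = current_database();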
On Thu, May 27, 2021 at 9:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Wells Oliver <wells.oliver@gmail.com> writes:
> Hey, we have a mat view that normally takes ~18m to re-materialize
> concurrently, which we do daily. Maybe once a week, it takes ~180m to
> refresh, and we're at a loss as to why.
> We are running backups during the same general time window, in which we
> back up the schema where this mat view is located; could this be an issue?
> We have not noticed any lock errors in the backup or log, though.
> Any tips on tracing this down would be appreciated.
My own mindset would be to wonder if a different/worse plan is being
chosen. You could investigate that perhaps by running an EXPLAIN
on the matview's query just before each refresh, to see if it changes.
Checking pg_locks for ungranted locks while the REFRESH is running
would be good to do too, although I'd think that a backup vs. a
concurrent refresh shouldn't have that sort of problem.
It seems possible also that you're just maxing out the machine's
I/O capacity between these two tasks.
regards, tom lane
Wells Oliver
wells.oliver@gmail.com
" although I'd think that a backup vs. a
concurrent refresh shouldn't have that sort of problem"
concurrent refresh shouldn't have that sort of problem"
Yep. IIRC, pg_dump requests an ACCESS SHARE lock.
*****************************************
session 1:
demo=# create table t(id int, name text);
CREATE TABLE
demo=# insert into t select x, md5(x::text) from generate_series(1, 1000) x;
INSERT 0 1000
demo=# CREATE MATERIALIZED VIEW tv as select id,name from t,pg_sleep(5) where id > 10;
SELECT 990
demo=# create unique INDEX on tv(id);
CREATE INDEX
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW
demo=# refresh materialized view concurrently tv;
REFRESH MATERIALIZED VIEW
session2:
postgres@go:/tmp$ pg_dump demo > abc.sql # while refresh running. completes fine.
postgres@go:/tmp$ psql
postgres=# \c demo
You are now connected to database "demo" as user "postgres".
demo=# begin;
BEGIN
demo=*# lock table t IN ACCESS SHARE MODE; -- simulate pg_dump
LOCK TABLE
demo=*# select * from tv limit 1; -- also just query the existing view
id | name
----+----------------------------------
11 | 6512bd43d9caa6e02c990b0a82652dca
(1 row)
***********************
So pg_dump may not be the problem, if I simulated it correctly (of course, without taking any exclusive locks on the base table t).
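To double-check the simulation, a third session could watch what locks are held or wanted on the demo relations while the refresh runs; a sketch:

-- session 3: run while the REFRESH in session 1 is still executing
SELECT l.pid, l.relation::regclass AS rel, l.mode, l.granted
FROM pg_locks l
WHERE l.relation IN ('t'::regclass, 'tv'::regclass);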
On Thu, 27 May 2021 at 22:08, Wells Oliver <wells.oliver@gmail.com> wrote:
> Thanks, Tom. I will dig into those ideas. I do think the I/O capacity
> might ultimately be a big factor.
Thanks,
Vijay
Mumbai, India