Обсуждение: Dealing with SeqScans when Time-based Partitions Cut Over

Поиск
Список
Период
Сортировка

Dealing with SeqScans when Time-based Partitions Cut Over

От
Matthew Planchard
Дата:
In a table with high insert frequency (~1.5k rows/s) and high query
frequency (~1k queries/s), partitioned by record creation time, we have
observed the following behavior:

* When the current time crosses a partition boundary, all new records
  are written to the new partition, which was previously empty, as
  expected

* Because the planner's latest knowledge of the partition was based on
  its state prior to the cutover, it assumes the partition is empty and
  creates plans that use sequential scans

* The table accumulates tens to hundreds of thousands of rows, and the
  sequentail scans start to use nearly 100% of available database CPU

* Eventually the planner updates thee stats and all is well, but the
  cycle repeats the next time the partitions cut over.

We have tried setting up a cron job that runs ANALYZE on the most recent
partition of the table every 15 seconds at the start of the hour, and
while this does help in reducing the magnitude and duration of the
problem, it is insufficient to fully resolve it (our engineers are still
getting daily pages for high DB CPU utilization).

We have considered maintaining a separate connection pool with
connections that have `enable_seqscan` set to `off`, and updating the
application to use that pool for these queries, but I was hoping the
community might have some better suggestions.

- Matthew Planchard



Re: Dealing with SeqScans when Time-based Partitions Cut Over

От
"Peter J. Holzer"
Дата:
On 2025-12-18 13:52:22 -0500, Matthew Planchard wrote:
>
> In a table with high insert frequency (~1.5k rows/s) and high query
> frequency (~1k queries/s), partitioned by record creation time, we have
> observed the following behavior:
>
> * When the current time crosses a partition boundary, all new records
>   are written to the new partition, which was previously empty, as
>   expected
>
> * Because the planner's latest knowledge of the partition was based on
>   its state prior to the cutover, it assumes the partition is empty and
>   creates plans that use sequential scans
>
> * The table accumulates tens to hundreds of thousands of rows, and the
>   sequentail scans start to use nearly 100% of available database CPU
>
> * Eventually the planner updates thee stats and all is well, but the
>   cycle repeats the next time the partitions cut over.

If I understand you correctly, the stats are updated after "tens to
hundreds of thousands of rows" which at that insert rate would translate
to several minutes.


> We have tried setting up a cron job that runs ANALYZE on the most recent
> partition of the table every 15 seconds at the start of the hour, and
> while this does help in reducing the magnitude and duration of the
> problem, it is insufficient to fully resolve it (our engineers are still
> getting daily pages for high DB CPU utilization).

Since nobody seems to have an idea how to prevent the seqscans I'd go
back one step:

What is the real problem here?

1) The database is sluggish during that time and users are negatively
   affected by the sluggishness.
2) Your engineers get spurious pages for something which isn't really a
   problem and which they can do nothing about.
3) Something else.

Note that I don't consider "high CPU utilization" a real problem. That's
what the CPU is there for. It might cause problems or be a symptom of a
problem, but it isn't a problem in itself.

If the problem is 1) I agree that a database-related solution should be
found. I think it is possible to explicitely set the statistics on a
table. So maybe that would be good enough until auto-analyze kicks in.

But if 2) is the real problem I think you should change the ruleset in
your monitoring. Pages should only be sent out for conditions which
actually impact the service and which can (probably) be fixed or at
least investigated by the engineer on call. Something which happens
every day at the same time and resolves itself within a few minutes
doesn't satisfy these criteria. Maybe you can just add a "downtime" to
that rule. Something like "between 00:00 and 00:07, don't send out
notifications for this condition".

        hjp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения

Re: Dealing with SeqScans when Time-based Partitions Cut Over

От
"Peter J. Holzer"
Дата:
On 2025-12-21 10:49:08 +0100, Peter J. Holzer wrote:
> Since nobody seems to have an idea how to prevent the seqscans

Ah, sorry. I didn't see that there were two (almost?) identical messages.
There have been replies to the other one.

        hjp


--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения