Hi,
On 2024-05-16 13:29:49 -0700, Andres Freund wrote:
> On 2024-05-16 16:13:35 -0400, Peter Geoghegan wrote:
> > > Now I wonder if there is some codepath triggering catalog lookups during bulk
> > > delete.
> >
> > I don't think that there's any rule that says that VACUUM cannot do
> > catalog lookups during bulk deletions. B-Tree page deletion needs to
> > generate an insertion scan key, so that it can "refind" a page
> > undergoing deletion. That might require catalog lookups.
>
> I'm not saying there's a hard rule against it. Just that there wasn't an
> immediately apparent, nor immediately observable, path for it. As I didn't see
> the path to the horizon recomputation, I didn't know how a btbulkdelete in the
> middle of the scan would potentially trigger the problem.

Hm. Actually. I think it might not be correct to do catalog lookups at that
point. But it's a bigger issue than just catalog lookups during bulkdelete:

Once we've done
    MyProc->statusFlags |= PROC_IN_VACUUM;
the current backend's snapshots don't prevent rows from being removed
anymore.
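
To illustrate the effect, here is a toy model of a horizon computation that
skips flagged backends. This is only a sketch: ToyProc, toy_oldest_xmin and
TOY_PROC_IN_VACUUM are made-up names, XID wraparound is ignored, and it is
not the actual procarray.c logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: not PostgreSQL's real PGPROC or horizon code. */
#define TOY_PROC_IN_VACUUM 0x02	/* hypothetical stand-in for PROC_IN_VACUUM */
#define TOY_INVALID_XID    0	/* "this backend holds no snapshot" */

typedef struct ToyProc
{
	uint32_t	xmin;			/* oldest XID the backend's snapshots need */
	uint8_t		statusFlags;	/* bitmask, as in MyProc->statusFlags */
} ToyProc;

/*
 * Return the oldest xmin among backends that still hold back row removal.
 * Backends flagged TOY_PROC_IN_VACUUM are skipped, so their snapshots no
 * longer protect anything.  Plain integer comparison is used, which
 * ignores XID wraparound (an assumption of this sketch).
 */
static uint32_t
toy_oldest_xmin(const ToyProc *procs, size_t nprocs, uint32_t next_xid)
{
	uint32_t	result = next_xid;

	assert(procs != NULL || nprocs == 0);

	for (size_t i = 0; i < nprocs; i++)
	{
		if (procs[i].statusFlags & TOY_PROC_IN_VACUUM)
			continue;			/* ignored, regardless of its xmin */
		if (procs[i].xmin == TOY_INVALID_XID)
			continue;			/* no snapshot to protect */
		if (procs[i].xmin < result)
			result = procs[i].xmin;
	}

	return result;
}
```

In this model, a backend with xmin 100 holds the horizon at 100 until it sets
the flag. After that the horizon jumps to the next oldest backend's xmin, even
though the flagged backend's snapshot is unchanged.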
I first wrote:
> That's not a huge issue for the pg_class entry itself, as the locks should
> prevent it from being updated. But there are a lot of catalog lookups that
> aren't protected by locks, just normal snapshot semantics.
but as it turns out we haven't even locked the relation at the point we set
PROC_IN_VACUUM.
That seems quite broken.
WRT bulkdelete, there's this comment where we set PROC_IN_VACUUM:

	 * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets
	 * other concurrent VACUUMs know that they can ignore this one while
	 * determining their OldestXmin.  (The reason we don't set it during a
	 * full VACUUM is exactly that we may have to run user-defined
	 * functions for functional indexes, and we want to make sure that if
	 * they use the snapshot set above, any tuples it requires can't get
	 * removed from other tables.  An index function that depends on the
	 * contents of other tables is arguably broken, but we won't break it
	 * here by violating transaction semantics.)

The parenthetical explains that, and why, we can't evaluate user-defined
functions while the flag is set. Which seems to be violated by doing key
comparisons, no?

Greetings,
Andres Freund