Re: cache lookup failed dropping public schema with trgm index

From Andres Freund
Subject Re: cache lookup failed dropping public schema with trgm index
Date
Msg-id 20230821223610.glmwmdsv2xofihiu@awork3.anarazel.de
In reply to cache lookup failed dropping public schema with trgm index  (Wyatt Alt <wyatt.alt@gmail.com>)
Responses Re: cache lookup failed dropping public schema with trgm index  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
Hi,

On 2023-08-21 11:40:15 -0700, Wyatt Alt wrote:
> This reproduces on 15.4 and 13.12:

Also reproduces on HEAD.


> create table foo(t text);
> create extension pg_trgm;
> create index on foo using gist(t gist_trgm_ops);
> drop schema public cascade;
>
> NOTICE:  drop cascades to 2 other objects
> DETAIL:  drop cascades to table foo
> drop cascades to extension pg_trgm
> ERROR:  cache lookup failed for function 1195999
> Time: 20.968 ms

It also seems to work without even involving a drop schema. Just dropping
pg_trgm with cascade is sufficient.

<several wrong theories>

I think we might primarily be dealing with missing invalidations. Object
deletion is scheduled in an order where we first
  schedule deletion of function 10 (text, text) of operator family t.gist_trgm_ops for access method gist: t.gtrgm_options(internal) at 0
  schedule deletion of function t.gtrgm_options(internal) at 1
and then later
  schedule deletion of index t.foo_t at 32


During the index deletion we try to initialize the access method. But we
haven't performed sufficient invalidation and still think "function 10"
exists. Calling it then causes the error.

One can "verify" this theory by adding an InvalidateSystemCaches() at the end
of deleteOneObject(). That "fixes" the issue.  This also explains why dropping
pg_trgm in a new session works - we don't have old cache entries that could be
out of date.
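
To make the theory concrete, here is a toy model in Python of the three cases
above: a warm cache without invalidation, a warm cache with invalidation, and
a new session with a cold cache. This is only a sketch. The Session class and
its methods are made up for illustration and are not PostgreSQL APIs; the
only thing taken from the report is the function OID in the error message.

```python
class Session:
    """Toy model of one backend; none of these names are PostgreSQL APIs."""

    def __init__(self, warm_cache):
        # "Catalog" state: support function number -> function OID, plus
        # the set of function OIDs that actually exist.
        self.support_procs = {10: 1195999}
        self.functions = {1195999}
        # Session-local cache of the support function list (None = not loaded).
        self.cache = None
        if warm_cache:
            self.load_support_procs()

    def load_support_procs(self):
        # Load from the "catalog" only if nothing is cached yet.
        if self.cache is None:
            self.cache = dict(self.support_procs)
        return self.cache

    def drop_options_function(self, invalidate):
        # Remove both the support-function entry and the function itself.
        del self.support_procs[10]
        self.functions.discard(1195999)
        if invalidate:
            self.cache = None  # stand-in for a cache invalidation

    def drop_index(self):
        # Dropping the index "initializes the access method", which looks up
        # every support function the (possibly stale) cache still lists.
        for oid in self.load_support_procs().values():
            if oid not in self.functions:
                raise LookupError(f"cache lookup failed for function {oid}")
        return "index dropped"


def run(warm_cache, invalidate):
    """Drop the options function, then the index; report what happened."""
    session = Session(warm_cache)
    session.drop_options_function(invalidate)
    try:
        return session.drop_index()
    except LookupError as exc:
        return str(exc)
```

Only run(True, False) hits the error. Invalidating after the function drop,
or starting with an empty cache, lets the index drop go through.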

Not quite sure where we are dropping the ball with invalidations yet.


However, I suspect there's more wrong than this, albeit perhaps not
problematic in a huge way. While debugging I added a getObjectDescription()
call to deleteObjectsInList() *before* calling deleteOneObject(). That fails
when dropping pg_trgm, because we end up dropping the type gtrgm before
gtrgm_out(). This happens because we reach gtrgm_out() via the extension
dependency, rather than via the type. When recursing to gtrgm_out(), we
recurse to type gtrgm, recurse to gtrgm_in(), schedule it for deletion, then
recurse to gtrgm_out(), but find it's in the stack and do *not* schedule it
for deletion, before scheduling gtrgm for deletion. Only then do we delete
gtrgm_out().
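
The walk just described can be replayed with a small depth-first sketch in
Python. It is a toy, not the dependency.c code. The DEPENDENTS edges are
assumptions picked so that the walk matches the description above, they are
not read from pg_depend, and whether the catalog really holds them is not
something this sketch settles. schedule() and drop_order() are hypothetical
names.

```python
# Hypothetical edges: object -> objects visited when that object is dropped.
DEPENDENTS = {
    "extension pg_trgm": ["function gtrgm_out"],
    "function gtrgm_out": ["type gtrgm"],
    "type gtrgm": ["function gtrgm_in", "function gtrgm_out"],
    "function gtrgm_in": ["type gtrgm"],
}


def schedule(obj, stack, order):
    """Depth-first walk: schedule dependents first, then the object itself."""
    if obj in stack or obj in order:
        return  # currently being visited, or already scheduled: skip it
    stack.append(obj)
    for dependent in DEPENDENTS.get(obj, []):
        schedule(dependent, stack, order)
    stack.pop()
    order.append(obj)


def drop_order(root):
    """Return the order in which objects get scheduled for deletion."""
    order = []
    schedule(root, [], order)
    return order


if __name__ == "__main__":
    for position, name in enumerate(drop_order("extension pg_trgm")):
        print(position, name)
```

Entering through the extension puts gtrgm_out() on the stack first, so it is
skipped when reached again from the type, and type gtrgm ends up scheduled
ahead of it.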

Now, this isn't a real issue in practice (without such a debugging statement,
which likely can't work in some cases), but I strongly suspect that it
indicates a scheduling order issue that's more widespread. Despite what I
think are correct dependencies, we end up with a topologically inconsistent drop
order. There aren't any cycles in the directed dependency graph from what I
can see.

Greetings,

Andres Freund


