Re: crash with assertions and WAL_DEBUG
От | Heikki Linnakangas |
---|---|
Тема | Re: crash with assertions and WAL_DEBUG |
Дата | |
Msg-id | 53AC1E69.3030802@vmware.com обсуждение исходный текст |
Ответ на | Re: crash with assertions and WAL_DEBUG (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Список | pgsql-hackers |
On 06/24/2014 05:47 PM, Alvaro Herrera wrote: >>> diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c >>> index 3c1c81a..4264373 100644 >>> --- a/src/backend/storage/smgr/md.c >>> +++ b/src/backend/storage/smgr/md.c >>> @@ -219,6 +219,16 @@ mdinit(void) >>> &hash_ctl, >>> HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT); >>> pendingUnlinks = NIL; >>> + >>> + /* >>> + * XXX: The checkpointer needs to add entries to the pending ops >>> + * table when absorbing fsync requests. That is done within a critical >>> + * section. It means that there's a theoretical possibility that you >>> + * run out of memory while absorbing fsync requests, which leads to >>> + * a PANIC. Fortunately the hash table is small so that's unlikely to >>> + * happen in practice. >>> + */ >>> + MemoryContextAllowInCriticalSection(MdCxt, true); >>> } >>> } >> >> Isn't that allowing a bit too much? We e.g. shouldn't allow >> _fdvec_alloc() within a crritical section. Might make sense to create a >> child context for it. > > I agree. Ok. That leaves nothing but _fdvec_alloc() in MdCxt. > Rahila Syed wrote: > >> The patch on compilation gives following error, >> >> mcxt.c: In function ‘MemoryContextAllowInCriticalSection’: >> mcxt.c:322: error: ‘struct MemoryContextData’ has no member named >> ‘allowInCriticalSection’ >> >> The member in MemoryContextData is defined as 'allowInCritSection' while >> the MemoryContextAllowInCriticalSection accesses the field as >> 'context->allowInCriticalSection'. > > It appears Heikki did a search'n replace for "->allowInCritSection" > before submitting, which failed to match the struct declaration. Uh, looks like I failed to test this at all with assertions enabled. Once you fix that error, you get a lot of assertion failures :-(. Attached is a new attempt: * walDebugCxt is now created at backend startup (in XLOGShmemInit()). This fixes the issue that Andres pointed out, that the context creation would PANIC if the first XLogInsert in a backend happens within a critical section. LWLOCK_STATS had the same issue, it would fail if the first LWLock acquisition in a process happens within a critical section. I fixed that by adding an explicit InitLWLockAccess() function and moved the initialization there. * hash_create also creates a new memory context within the given context. That new context needs to inherit the allowInCritSection flag, otherwise allocating entries in the hash will still fail. Both LWLOCK_STATS and the pending ops hash table in the checkpointer had this issue. * Checkpointer does one palloc to allocate private space to copy the requests to, in addition to using the hash table. I moved that palloc to before entering the critical section. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: