Re: 7.0.2 dies when connection dropped mid-transaction
| From | Alfred Perlstein |
|---|---|
| Subject | Re: 7.0.2 dies when connection dropped mid-transaction |
| Date | |
| Msg-id | 20001109184324.L11449@fw.wintelcom.net |
| In response to | Re: 7.0.2 dies when connection dropped mid-transaction (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-hackers |
* Tom Lane <tgl@sss.pgh.pa.us> [001109 18:30] wrote:
> I said:
> > OK, after digging some more, it seems that the critical requirement
> > is that the cursor's query contain a hash join.
>
> Here's the deal:
>
> test7=# set enable_mergejoin to off;
> SET VARIABLE
> test7=# begin;
> BEGIN
> -- I've previously checked that this produces a hash join plan:
> test7=# declare c cursor for select * from foo t1, foo t2 where t1.f1=t2.f1;
> SELECT
> test7=# fetch 1 from c;
>  f1 | f1
> ----+----
>   1 |  1
> (1 row)
>
> test7=# abort;
> NOTICE: trying to delete portal name that does not exist.
> pqReadData() -- backend closed the channel unexpectedly.
>         This probably means the backend terminated abnormally
>         before or while processing the request.
>
> This happens with either 7.0.2 or 7.0.3 (probably with anything back to
> 6.5, if not before). It does *not* happen with current development tip.
>
> The problem is that two "portal" structures are used. One holds the
> overall query plan and execution state for the cursor, and the other
> holds the hash table for the hash join. During abort, the portal
> manager tries to delete both of them. BUT: deleting the query plan
> causes query cleanup to be executed, which among other things deletes
> the hash join's table. Then the portal manager tries to delete the
> already-deleted second portal, which leads first to the above notice
> and then to Assert failure (and probably would lead to coredump if
> you didn't have Asserts on). Alternatively, it might try to delete
> the hash join portal first, which would leave the query cleanup code
> deleting an already-deleted portal, and doubtless still crashing.
>
> Current sources don't show the problem because hashtables aren't kept
> in portals anymore.
>
> I've thought for some time that CollectNamedPortals is a horrid kluge,
> and really ought to be rewritten. Hadn't seen it actually do the wrong
> thing before, but now...
>
> I guess the immediate question is do we want to hold up 7.0.3 release
> for a fix? This bug is clearly ancient, so I'm not sure it's
> appropriate to go through a fire drill to fix it for 7.0.3.
> Comments?

I dunno, having the database crash because an errant client
disconnected without shutting down, or needed to abort a transaction,
looks like a show stopper.

We do track CVS and wouldn't have a problem shifting to 7_0_3_PATCHES,
but I'm not sure if the rest of the userbase is going to have much fun.

It seems to be a serious problem; I think people wouldn't mind waiting
for you to squash this one.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
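
Editor's sketch: the failure Tom describes (a cleanup pass that collects a list of portals up front, then deletes each one even though deleting the first already removed the second) can be illustrated with a toy model. This is not PostgreSQL source. `Registry`, `Portal`, `portal_drop`, `abort_snapshot`, and `abort_rescan` are hypothetical names invented for this illustration, and the toy only counts a stale delete where the real backend would hit an Assert.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PORTALS 8
#define NAME_LEN 32

/* Hypothetical stand-in for a named portal; not PostgreSQL's real struct. */
typedef struct {
    char name[NAME_LEN];
    bool live;
    int owned;            /* slot of a dependent portal (e.g. hash table), or -1 */
} Portal;

typedef struct {
    Portal slots[MAX_PORTALS];
    int stale_deletes;    /* times a delete targeted an already-deleted portal */
} Registry;

void registry_init(Registry *r) {
    memset(r, 0, sizeof *r);
    for (int i = 0; i < MAX_PORTALS; i++) r->slots[i].owned = -1;
}

/* Return the slot of a live portal with this name, or -1 if none. */
int portal_find(const Registry *r, const char *name) {
    for (int i = 0; i < MAX_PORTALS; i++)
        if (r->slots[i].live && strcmp(r->slots[i].name, name) == 0) return i;
    return -1;
}

/* Create a portal in the first free slot; returns its slot or -1 if full. */
int portal_create(Registry *r, const char *name) {
    for (int i = 0; i < MAX_PORTALS; i++) {
        if (!r->slots[i].live) {
            strncpy(r->slots[i].name, name, NAME_LEN - 1);
            r->slots[i].name[NAME_LEN - 1] = '\0';
            r->slots[i].live = true;
            r->slots[i].owned = -1;
            return i;
        }
    }
    return -1;
}

/* Delete a portal by name; its cleanup also deletes the portal it owns. */
bool portal_drop(Registry *r, const char *name) {
    int i = portal_find(r, name);
    if (i < 0) {          /* analogous to "portal name that does not exist" */
        r->stale_deletes++;
        return false;
    }
    int dep = r->slots[i].owned;
    r->slots[i].live = false;
    r->slots[i].owned = -1;
    if (dep >= 0) r->slots[dep].live = false;   /* cascading query cleanup */
    return true;
}

/* Fragile pattern: snapshot all names first, then delete each one. */
int abort_snapshot(Registry *r) {
    char names[MAX_PORTALS][NAME_LEN];
    int n = 0, before = r->stale_deletes;
    for (int i = 0; i < MAX_PORTALS; i++)
        if (r->slots[i].live) strcpy(names[n++], r->slots[i].name);
    for (int k = 0; k < n; k++) portal_drop(r, names[k]);
    return r->stale_deletes - before;
}

/* Safer pattern: re-scan for a live portal after every delete. */
int abort_rescan(Registry *r) {
    int before = r->stale_deletes;
    for (;;) {
        int i = -1;
        for (int j = 0; j < MAX_PORTALS; j++)
            if (r->slots[j].live) { i = j; break; }
        if (i < 0) break;
        portal_drop(r, r->slots[i].name);
    }
    return r->stale_deletes - before;
}
```

With a cursor portal that owns a hash-table portal, `abort_snapshot` reports one stale delete, while `abort_rescan` reports none because it never acts on a list that an earlier delete has invalidated. Per Tom's message, the development sources avoid the problem differently, by no longer keeping hash tables in portals at all.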