Re: [HACKERS] memory problems in copying large table to STDOUT
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] memory problems in copying large table to STDOUT |
Дата | |
Msg-id | 3580.939605580@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [HACKERS] memory problems in copying large table to STDOUT (Martin Weinberg <weinberg@osprey.astro.umass.edu>) |
Ответы |
Re: [HACKERS] memory problems in copying large table to STDOUT
|
Список | pgsql-hackers |
Martin Weinberg <weinberg@osprey.astro.umass.edu> writes: > I got the backtrace with ""Memory exhausted in > AllocSetAlloc()" this time. > #4 0x8119732 in MemoryContextAlloc (context=0x81efbd8, size=875628846) > at mcxt.c:230 > #5 0x810ebf1 in textout (vlena=0x4106182c) at varlena.c:190 > #6 0x808508c in CopyTo (rel=0x822af08, binary=0 '\000', oids=0 '\000', > fp=0x0, delim=0x813c829 "\t") at copy.c:421 OK, that shoots down the "memory leak" theory. It sure looks like what you've got is corrupt data: textout is reading a length word of 875628846 (plus or minus a couple bytes) from what is supposed to be a text datum. Obviously that's not right. Next question is how it got that way. I think it's pretty likely that the original cause is the kernel disk driver or disk hardware flakiness that we already have evidence for. However, I hate passing the buck like that, so I'm willing to continue digging if you are. > Again, I've saved the process in case checking any symbols > would be helpful. You should look at the source tuple location info in CopyTo --- something like(gdb) f 6 -- frame 6, ie, CopyTo(gdb) p i -- get column number(gdb) p *tuple -- printcontents of HeapTupleData(gdb) p *tuple->t_data -- print contents of HeapTupleHeaderData The last is mainly to find out the tuple's OID for possible future reference. What we want right now is the tuple location info, tuple->t_self, which will give us a block number (bi_hi and bi_lo in that struct are the high and low 16 bits of the block number). Then, if you can use dd and od to get a hex dump of that block from the relation's data files, we can see what's really on disk there. (Remember that the "blocks" are 8K each; also, if you get an offset beyond 1 gig, then it's going to be in one of the continuation files "psc.1", "psc.2", etc --- one gig apiece.) It would also be useful to look at the contents of the disk block as sitting in memory in the backend, to see if they are the same as what you read using dd; I would not be too surprised to find they are not. The t_data pointer should be pointing into a disk buffer in Postgres' shared memory block, but offhand I'm not sure what's the easiest way to discover the starting address of that buffer using gdb. (Can any other hackers lend a hand here?) regards, tom lane
В списке pgsql-hackers по дате отправления: