Re: WIP parallel restore patch
От | Andrew Dunstan |
---|---|
Тема | Re: WIP parallel restore patch |
Дата | |
Msg-id | 49259697.20905@dunslane.net обсуждение исходный текст |
Ответ на | Re: WIP parallel restore patch (Magnus Hagander <magnus@hagander.net>) |
Список | pgsql-hackers |
Magnus Hagander wrote: > Andrew Dunstan wrote: > >> Attached is my latest parallel restore patch. I think it's functionally >> complete for Unix. >> >> Many bugs have been fixed since the last patch, and the hardcoded >> limitation to two table dependencies is removed. It seems fairly robust >> in my recent testing. >> >> Remaining to be done: >> >> . code cleanup >> . better error checking in a few places >> . final decision re command line option names/defaults >> . documentation >> . Windows support. >> > > I've looked around this a bit, and it's fairly clear where the issue > comes in with Windows - we get heap corruption. Most likely because we > have multiple threads working on the same data somewhere. > > I notice for example that we're doing a shallow copy of the > ArchiveHandle with a simple memcpy() for each thread. But that struct > contains a number of things like file descriptors and pointers. Have you > verified for each and every one of those that it actually doesn't get > modified anywhere? If not, a deep copy may be needed to make sure of that. > > Other than that, are there any global variables that may be addressed > from more than one worker? If so they need to be marked as TLS. > > And yes, I got hit by the lack of error checking a couple of times > during my testing - it would probably be a good idea to add that as soon > as possible, it helps a lot with the debugging. > > If I run it with just a single thread, it also crashes in PQfinish() > called from die_horribly(), when trying to free conn->pgpass, which has > a value (non-NULL) but is not a valid pointer. This crash happens in the > worker thread, after it has logged that "fseek is required" - that's an > indicator something being passed down to the thread is either wrong or > being scribbled upon after the fact. > > I didn't dig into these questions specifically - since you have already > been reading up on this code to do the patch you can probably reach the > answer to them much quicker :-) So I'll stick to the questions. > > > OK, Thanks, this will help. I thought I had caught the ArchiveHandle things, but there might be one or two I missed. I'll try to have a new version in a few days. cheers andrew
В списке pgsql-hackers по дате отправления: