Discussion: parallel restore item dependencies
OK, I've worked out why I am seeing deadlocks etc. from parallel restore
on FK items.
In my original patch, I looked at all the dependencies of a candidate
item and compared them with the dependencies of the running items to
see if there was a potential locking clash. However, Tom, in his
admirable reworking of my patch, restricted the list of potential
clashing items (lockDeps) to "TABLE" items, if any. This would probably
have been ok if we hadn't just beforehand transferred all TABLE
dependencies in POST_DATA items to the corresponding TABLE DATA item.
The result is that we get empty lockDeps lists on all items - I'm
surprised we haven't had more complaints about deadlock or failing locks.
A simple fix that would probably work would be to adjust the filter to
include TABLE DATA items, so the relevant statement would read:
    if (tocsByDumpId[depid - 1] &&
        (strcmp(tocsByDumpId[depid - 1]->desc, "TABLE") == 0 ||
         strcmp(tocsByDumpId[depid - 1]->desc, "TABLE DATA") == 0))
        lockids[nlockids++] = depid;
Perhaps a better fix would move the code that sets up the lockDeps so
that it runs before we adjust the dependencies.
I'm moderately confident that either of these fixes will work, but I
think this demonstrates the need for lots of testing, especially with
complex data sets that have lots of dependencies and potentially
deadlocking items.
thoughts?
cheers
andrew
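
[Illustration] The interaction described in the message above, between repointing TABLE dependencies and filtering lockDeps by entry description, can be sketched with a toy model. This is a minimal sketch, not the pg_restore source: ToyEntry, toy_repoint, toy_set_lock_deps and toy_fk_lock_dep_count are hypothetical names, and the five-entry TOC (two tables, their data, one FK constraint) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_DEPS 8

/* Toy stand-in for an archive TOC entry; NOT the real pg_restore struct. */
typedef struct ToyEntry
{
    const char *desc;                   /* e.g. "TABLE", "TABLE DATA" */
    int         deps[TOY_MAX_DEPS];     /* dump ids this entry depends on */
    int         nDeps;
    int         lockDeps[TOY_MAX_DEPS]; /* subset of deps used for lock checks */
    int         nLockDeps;
} ToyEntry;

/* Replace every dependency on fromId with toId (the "repointing" step). */
void
toy_repoint(ToyEntry *te, int fromId, int toId)
{
    for (int i = 0; i < te->nDeps; i++)
        if (te->deps[i] == fromId)
            te->deps[i] = toId;
}

/*
 * Fill te->lockDeps with the dependencies whose desc equals want, or alt
 * when alt is not NULL.  byId[id - 1] is the entry with dump id "id".
 */
void
toy_set_lock_deps(ToyEntry *te, ToyEntry *const *byId, int maxId,
                  const char *want, const char *alt)
{
    assert(te->nDeps <= TOY_MAX_DEPS);
    te->nLockDeps = 0;
    for (int i = 0; i < te->nDeps; i++)
    {
        int         depid = te->deps[i];

        if (depid < 1 || depid > maxId || byId[depid - 1] == NULL)
            continue;
        const char *d = byId[depid - 1]->desc;
        if (strcmp(d, want) == 0 || (alt != NULL && strcmp(d, alt) == 0))
            te->lockDeps[te->nLockDeps++] = depid;
    }
}

/*
 * Build the toy TOC, optionally repoint the FK item's TABLE dependencies
 * to the matching TABLE DATA items first, then apply the filter.
 * Returns the number of lockDeps found for the FK item.
 */
int
toy_fk_lock_dep_count(int repointFirst, const char *want, const char *alt)
{
    ToyEntry    t1 = {"TABLE", {0}, 0, {0}, 0};            /* dump id 1 */
    ToyEntry    d1 = {"TABLE DATA", {1}, 1, {0}, 0};       /* dump id 2 */
    ToyEntry    t2 = {"TABLE", {0}, 0, {0}, 0};            /* dump id 3 */
    ToyEntry    d2 = {"TABLE DATA", {3}, 1, {0}, 0};       /* dump id 4 */
    ToyEntry    fk = {"FK CONSTRAINT", {1, 3}, 2, {0}, 0}; /* dump id 5 */
    ToyEntry   *byId[] = {&t1, &d1, &t2, &d2, &fk};

    if (repointFirst)
    {
        toy_repoint(&fk, 1, 2);
        toy_repoint(&fk, 3, 4);
    }
    toy_set_lock_deps(&fk, byId, 5, want, alt);
    return fk.nLockDeps;
}
```

With this toy data, toy_fk_lock_dep_count(1, "TABLE", NULL) returns 0, the empty-lockDeps case described above. Widening the filter to also accept "TABLE DATA", or running the filter before repointing (repointFirst = 0), each yields 2.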
Andrew Dunstan <andrew@dunslane.net> writes:
> OK, I've worked out why I am seeing deadlocks etc. from parallel restore
> on FK items.
> In my original patch, I looked at all the dependencies of a candidate
> item and compared them with the dependencies of the running items to
> see if there was a potential locking clash. However, Tom, in his
> admirable reworking of my patch, restricted the list of potential
> clashing items (lockDeps) to "TABLE" items, if any. This would probably
> have been ok if we hadn't just beforehand transferred all TABLE
> dependencies in POST_DATA items to the corresponding TABLE DATA item.
> The result is that we get empty lockDeps lists on all items - I'm
> surprised we haven't had more complaints about deadlock or failing locks.
[ scratches head... ] I coulda sworn I tested that when I was hacking
it. I'm running low on steam tonight but will think more about this
tomorrow.
regards, tom lane
I wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> In my original patch, I looked at all the dependencies of a candidate
>> item and compared them with the dependencies of the running items to
>> see if there was a potential locking clash. However, Tom, in his
>> admirable reworking of my patch, restricted the list of potential
>> clashing items (lockDeps) to "TABLE" items, if any. This would probably
>> have been ok if we hadn't just beforehand transferred all TABLE
>> dependencies in POST_DATA items to the corresponding TABLE DATA item.
>> The result is that we get empty lockDeps lists on all items - I'm
>> surprised we haven't had more complaints about deadlock or failing locks.
> [ scratches head... ] I coulda sworn I tested that when I was hacking
> it. I'm running low on steam tonight but will think more about this
> tomorrow.
I think I have reconstructed what happened: I tested this code before
I decided that repointing the dependencies was a good idea, or else
reordered the sequence of operations in fix_dependencies after that.
It looks to me like the correct fix is just to look for TABLE DATA
not TABLE while setting up lockDeps[], since all the entry types we
care about are POST_DATA items. Anyway, I've committed that, please
try it.
regards, tom lane
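
[Illustration] Why an empty lockDeps list matters can be shown with a toy version of the clash check the thread describes, in which a candidate item's lock dependencies are compared with the dependencies of the items already running. This is only a sketch under assumptions: ToyItem, toy_has_lock_conflict and toy_can_start are hypothetical names, the dump ids are invented, and the actual scheduling code in pg_restore may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy view of a restore item: all dependencies, plus lock dependencies. */
typedef struct ToyItem
{
    const int  *deps;       /* dump ids the item depends on */
    int         nDeps;
    const int  *lockDeps;   /* dump ids of tables it must lock; may be NULL */
    int         nLockDeps;
} ToyItem;

/* True if any lock dependency of a is also a dependency of b. */
bool
toy_has_lock_conflict(const ToyItem *a, const ToyItem *b)
{
    for (int j = 0; j < a->nLockDeps; j++)
        for (int k = 0; k < b->nDeps; k++)
            if (a->lockDeps[j] == b->deps[k])
                return true;
    return false;
}

/* A candidate may start only if it clashes with no running item. */
bool
toy_can_start(const ToyItem *cand, const ToyItem *const *running, int nRunning)
{
    assert(cand != NULL && (running != NULL || nRunning == 0));
    for (int i = 0; i < nRunning; i++)
        if (toy_has_lock_conflict(cand, running[i]) ||
            toy_has_lock_conflict(running[i], cand))
            return false;
    return true;
}
```

In this model, two FK items that share a TABLE DATA dependency are kept apart only while their lockDeps are populated. With nLockDeps equal to 0 on both, toy_can_start returns true and the clash goes undetected, which is consistent with the deadlocks reported at the start of the thread.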
Tom Lane wrote:
> I wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> In my original patch, I looked at all the dependencies of a candidate
>>> item and compared them with the dependencies of the running items to
>>> see if there was a potential locking clash. However, Tom, in his
>>> admirable reworking of my patch, restricted the list of potential
>>> clashing items (lockDeps) to "TABLE" items, if any. This would probably
>>> have been ok if we hadn't just beforehand transferred all TABLE
>>> dependencies in POST_DATA items to the corresponding TABLE DATA item.
>>> The result is that we get empty lockDeps lists on all items - I'm
>>> surprised we haven't had more complaints about deadlock or failing locks.
>> [ scratches head... ] I coulda sworn I tested that when I was hacking
>> it. I'm running low on steam tonight but will think more about this
>> tomorrow.
> I think I have reconstructed what happened: I tested this code before
> I decided that repointing the dependencies was a good idea, or else
> reordered the sequence of operations in fix_dependencies after that.
> It looks to me like the correct fix is just to look for TABLE DATA
> not TABLE while setting up lockDeps[], since all the entry types we
> care about are POST_DATA items. Anyway, I've committed that, please
> try it.
Passes test. Thanks.
andrew