Re: stress test for parallel workers
From | Tom Lane |
---|---|
Subject | Re: stress test for parallel workers |
Date | |
Msg-id | 32179.1563919918@sss.pgh.pa.us |
In reply to | Re: stress test for parallel workers (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: stress test for parallel workers |
List | pgsql-hackers |

Thomas Munro <thomas.munro@gmail.com> writes:
> *I suspect that the only thing implicating parallelism in this failure
> is that parallel leaders happen to print out that message if the
> postmaster dies while they are waiting for workers; most other places
> (probably every other backend in your cluster) just quietly exit.
> That tells us something about what's happening, but on its own doesn't
> tell us that parallelism plays an important role in the failure mode.

I agree that there's little evidence implicating parallelism directly.
The reason I'm suspicious about a possible OOM kill is that parallel
queries would appear to the OOM killer to be eating more resources
than the same workload non-parallel, so that we might be at more
hazard of getting OOM'd just because of that.

A different theory is that there's some hard-to-hit bug in the
postmaster's processing of parallel workers that doesn't apply to
regular backends. I've looked for one in a desultory way but not
really focused on it.

In any case, the evidence from the buildfarm is pretty clear that
there is *some* connection. We've seen a lot of recent failures
involving "postmaster exited during a parallel transaction", while
the number of postmaster failures not involving that is epsilon.

regards, tom lane