Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
From | Sergey Koposov |
---|---|
Subject | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
Date | |
Msg-id | alpine.LRH.2.02.1205310148440.6351@calx046.ast.cam.ac.uk |
In reply to | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
List | pgsql-hackers |
On Wed, 30 May 2012, Jeff Janes wrote:

>> But the question now is whether there is a *PG* problem here or not, or is
>> it Intel's or Linux's problem ? Because still the slowdown was caused by
>> locking. If there wouldn't be locking there wouldn't be any problems (as
>> demonstrated a while ago by just cat'ting the files in multiple threads).
>
> You cannot have a traditional RDBMS without locking. From your

I understand the need for significant locking when there are concurrent
writes, but not when there are only reads. But I'm not an RDBMS expert, so
maybe that's a misunderstanding on my side.

> description of the problem, I probably wouldn't be using a traditional
> database system at all for this, but rather flat files and Perl.

Flat files and Perl for 25-50 TB of data over a few years is a bit extreme ;)

> Or
> at least, I would partition the data before loading it to the DB,
> rather than trying to do it after.

I intentionally did otherwise, because I thought that PG would be much
smarter than me at juggling the data I'm ingesting (~ tens of gigabytes each
day), joining the appropriate bits of data and then splitting them by
partition. Unfortunately I see that there are some scalability issues on the
way, which I didn't expect. Those aren't fatal, but slightly disappointing.

> But anyway, is idt_match a fairly static table? If so, I'd partition
> that into 16 tables, and then have each one of your tasks join against
> a different one of those tables. That should relieve the contention
> on the index root block, and might have some other benefits as well.

No, idt_match is getting filled by a multi-threaded copy() and then joined
with 4 other big tables like idt_phot. The result is then split into
partitions. And I was trying different approaches to fully utilize the CPUs
and/or I/O and somehow parallelize the queries. That's the reasoning behind
the somewhat contrived queries in my test.

Cheers,
    S

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
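
For readers following the thread, here is a minimal sketch of the partitioning idea from the quoted reply (split the data into 16 partitions and let each task join against its own partition). It is not code from the thread and not how PostgreSQL does it: the tables are modelled as in-memory lists of dicts, the column names `id`, `m` and `p` are hypothetical, and the function names are made up for illustration. Python threads are used only to show the structure of the work split; they do not by themselves give CPU parallelism.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

N_PARTITIONS = 16  # the partition count suggested in the quoted reply


def partition_rows(rows, key, n=N_PARTITIONS):
    """Split rows into n buckets by (integer key value) modulo n."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[row[key] % n].append(row)
    return buckets


def join_bucket(match_bucket, phot_bucket, key):
    """Hash-join one pair of buckets; no other task touches this index."""
    index = defaultdict(list)
    for row in match_bucket:
        index[row[key]].append(row)
    return [{**m, **p} for p in phot_bucket for m in index.get(p[key], [])]


def parallel_join(match_rows, phot_rows, key, n=N_PARTITIONS):
    """Partition both inputs the same way, then join bucket i with bucket i."""
    match_parts = partition_rows(match_rows, key, n)
    phot_parts = partition_rows(phot_rows, key, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(join_bucket, match_parts, phot_parts, [key] * n)
        return [row for part in results for row in part]


if __name__ == "__main__":
    # Hypothetical stand-ins for idt_match and idt_phot.
    idt_match = [{"id": i, "m": i * 2} for i in range(100)]
    idt_phot = [{"id": i, "p": i + 1} for i in range(0, 100, 3)]
    joined = parallel_join(idt_match, idt_phot, "id")
    print(len(joined), "joined rows")
```

Because both inputs are split by the same key, each task builds and probes only its own lookup structure, which is the sense in which the quoted suggestion would avoid every task contending on a single shared index.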