Re: Further pg_upgrade analysis for many tables
От | Bruce Momjian |
---|---|
Тема | Re: Further pg_upgrade analysis for many tables |
Дата | |
Msg-id | 20121124171220.GE9382@momjian.us обсуждение исходный текст |
Ответ на | Re: Further pg_upgrade analysis for many tables (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
Added to TODO: Improve cache lookup speed for sessions accessing many relations http://archives.postgresql.org/pgsql-hackers/2012-11/msg00356.php --------------------------------------------------------------------------- On Fri, Nov 9, 2012 at 12:50:34AM -0500, Tom Lane wrote: > Jeff Janes <jeff.janes@gmail.com> writes: > > Are sure the server you are dumping out of is head? > > I experimented a bit with dumping/restoring 16000 tables matching > Bruce's test case (ie, one serial column apiece). The pg_dump profile > seems fairly flat, without any easy optimization targets. But > restoring the dump script shows a rather interesting backend profile: > > samples % image name symbol name > 30861 39.6289 postgres AtEOXact_RelationCache > 9911 12.7268 postgres hash_seq_search > 2682 3.4440 postgres init_sequence > 2218 2.8482 postgres _bt_compare > 2120 2.7223 postgres hash_search_with_hash_value > 1976 2.5374 postgres XLogInsert > 1429 1.8350 postgres CatalogCacheIdInvalidate > 1282 1.6462 postgres LWLockAcquire > 973 1.2494 postgres LWLockRelease > 702 0.9014 postgres hash_any > > The hash_seq_search time is probably mostly associated with > AtEOXact_RelationCache, which is run during transaction commit and scans > the relcache hashtable looking for tables created in the current > transaction. So that's about 50% of the runtime going into that one > activity. > > There are at least three ways we could whack that mole: > > * Run the psql script in --single-transaction mode, as I was mumbling > about the other day. If we were doing AtEOXact_RelationCache only once, > rather than once per CREATE TABLE statement, it wouldn't be a problem. > Easy but has only a narrow scope of applicability. > > * Keep a separate list (or data structure of your choice) so that > relcache entries created in the current xact could be found directly > rather than having to scan the whole relcache. That'd add complexity > though, and could perhaps be a net loss for cases where the relcache > isn't so bloated. > > * Limit the size of the relcache (eg by aging out > not-recently-referenced entries) so that we aren't incurring O(N^2) > costs for scripts touching N tables. Again, this adds complexity and > could be counterproductive in some scenarios. > > regards, tom lane -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
В списке pgsql-hackers по дате отправления: