Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
From | Andres Freund |
---|---|
Subject | Re: [HACKERS] Cutting initdb's runtime (Perl question embedded) |
Date | |
Msg-id | 20170412173437.qfqfnl6k3icpfczx@alap3.anarazel.de |
In response to | [HACKERS] Cutting initdb's runtime (Perl question embedded) (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On 2017-04-12 10:12:47 -0400, Tom Lane wrote:
> Andres mentioned, and I've confirmed locally, that a large chunk of
> initdb's runtime goes into regprocin's brute-force lookups of function
> OIDs from function names.  The recent discussion about cutting TAP test
> time prompted me to look into that question again.  We had had some
> grand plans for getting genbki.pl to perform the name-to-OID conversion
> as part of a big rewrite, but since that project is showing few signs
> of life, I'm thinking that a more localized performance fix would be
> a good thing to look into.  There seem to be a couple of plausible
> routes to a fix:
>
> 1. The best thing would still be to make genbki.pl do the conversion,
> and write numeric OIDs into postgres.bki.  The core stumbling block
> here seems to be that for most catalogs, Catalog.pm and genbki.pl
> never really break down a DATA line into fields --- and we certainly
> have got to do that, if we're going to replace the values of regproc
> fields.  The places that do need to do that approximate it like this:
>
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx". Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
>
> We would need a bullet-proof, non-hack, preferably not too slow way to
> split DATA lines into fields properly.  I'm one of the world's worst
> Perl programmers, but surely there's a way?

I've done something like 1) before:
http://archives.postgresql.org/message-id/20150221230839.GE2037%40awork2.anarazel.de

I don't think the speed matters all that much, because we'll only do it
when generating the .bki file - a couple ms more or less won't matter
much.

IIRC I also spent some more time on loading the data files from a
different format:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/sane-catalog
although that's presumably heavily outdated now.

- Andres
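
For illustration only, here is a minimal sketch of one way the field splitting asked about in the quoted message could be done in Perl, by walking the string with a `\G`-anchored regex instead of folding quoted fields to "xxx". It is not taken from the linked patch or branch. The name `split_bki_values` is hypothetical, and the sketch assumes that a field is either a run of non-whitespace characters or a double-quoted string with no embedded double quotes (the same assumption the existing `s/"[^"]*"/"xxx"/g` hack makes).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Catalog.pm or genbki.pl.
# Splits the values part of a DATA line into fields.  Assumes a field is
# either a bare token or a "double-quoted" string without embedded quotes.
sub split_bki_values
{
    my ($values) = @_;
    my @fields;

    pos($values) = 0;

    # \G continues where the previous match ended; /c keeps that position
    # when a match fails, so it can be reported below.
    while ($values =~ /\G\s*(?:"([^"]*)"|([^\s"]+))(?=\s|\z)/gc)
    {
        # $1 is defined (possibly empty) only for a quoted field.
        push @fields, defined $1 ? $1 : $2;
    }

    # Anything left other than trailing whitespace is malformed input,
    # e.g. an unterminated quote.
    $values =~ /\G\s*\z/gc
      or die "could not parse field at offset " . pos($values) . ": $values\n";

    return @fields;
}

my @f = split_bki_values('boolin PGNSP PGUID 12 "2275 23" _null_ boolin');
print join('|', @f), "\n";    # prints boolin|PGNSP|PGUID|12|2275 23|_null_|boolin
```

Note that this sketch drops the surrounding quotes, so a caller that writes the fields back out to postgres.bki would have to re-quote any value that is empty or contains whitespace.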