Re: CopyReadLineText optimization
From | Andrew Dunstan |
---|---|
Subject | Re: CopyReadLineText optimization |
Date | |
Msg-id | 47D057BF.9030302@dunslane.net |
In reply to | Re: CopyReadLineText optimization (Greg Smith <gsmith@gregsmith.com>) |
List | pgsql-patches |

Greg Smith wrote:
> On Thu, 6 Mar 2008, Heikki Linnakangas wrote:
>
>> At the most conservative end, we could fall back to the current
>> method on the first escape, quote or backslash character.
>
> I would just count the number of escaped/quote characters on each
> line, and then at the end of the line switch modes between the current
> code and the new version based on what the previous line looked like.
> That way the only additional overhead is a small bit only when escapes
> show up often, plus a touch more just once per line. Barely noticeable
> in the case where nothing is escaped, very small regression for
> escape-heavy stuff but certainly better than the drop you reported in
> the last rev of this patch.
>
> Rev two of that design would keep a weighted moving average of the
> total number of escaped characters per line (say
> wma=(7*wma+current)/8) and switch modes based on that instead of the
> previous one. There's enough play in the transition between where the
> two approaches work better at that this should be easy enough to get a
> decent transition between. Based on your data I would put the
> transition at wma>4, which should keep the old code in play even if
> only half the lines have the bad regression that shows up with >8
> escapes per line.
>

I'd be inclined just to look at the first buffer of data we read in, and
make a one-off decision there, if we can get away with it. Then the cost
of testing is fixed rather than per line.

cheers

andrew
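
For illustration only, the two heuristics discussed above (the per-line weighted moving average and the one-off decision on the first buffer) might be sketched roughly as follows. This is not code from the patch: the function names, the "old"/"new" mode labels, and the set of characters counted as special are all hypothetical, and reusing the wma>4 threshold for the first-buffer check is an assumption made here, not something proposed in the thread.

```python
# Hypothetical sketch; none of these names come from the PostgreSQL source.
# Assumption: "special" characters are the backslash and the double quote.
SPECIAL_CHARS = frozenset('\\"')

# From the quoted suggestion: fall back to the old code when wma > 4.
WMA_THRESHOLD = 4.0


def count_special(line: str) -> int:
    """Count characters that would force the slower per-character path."""
    return sum(1 for ch in line if ch in SPECIAL_CHARS)


def choose_modes_wma(lines, threshold=WMA_THRESHOLD):
    """Pick a mode per line using wma = (7*wma + current) / 8.

    Each line's mode is decided from the average over the lines before
    it, so the first line always takes the "new" (fast) path.
    """
    wma = 0.0
    modes = []
    for line in lines:
        modes.append("old" if wma > threshold else "new")
        wma = (7 * wma + count_special(line)) / 8
    return modes


def choose_mode_first_buffer(first_buffer: str, threshold=WMA_THRESHOLD):
    """One-off decision from the first buffer only.

    Assumption: compare the mean number of special characters per line
    in the buffer against the same threshold as the moving average.
    """
    lines = first_buffer.splitlines() or [""]
    average = sum(count_special(line) for line in lines) / len(lines)
    return "old" if average > threshold else "new"
```

The trade-off is the one named in the reply: the moving average adapts if the data changes character partway through the input but pays a small cost on every line, while the first-buffer check has a fixed cost and can pick the wrong mode when the first buffer is not representative.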