Re: Scanner performance (was Re: 7.3 schedule)
From | Peter Eisentraut |
---|---|
Subject | Re: Scanner performance (was Re: 7.3 schedule) |
Date | |
Msg-id | Pine.LNX.4.30.0204161125280.689-100000@peter.localdomain |
In reply to | Re: Scanner performance (was Re: 7.3 schedule) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Scanner performance (was Re: 7.3 schedule)
Re: Scanner performance (was Re: 7.3 schedule) |
List | pgsql-hackers |
Tom Lane writes:

> The regression tests contain no very-long literals. The results I was
> referring to concerned cases with string (BLOB) literals in the
> hundreds-of-K range; it seems that the per-character loop in the flex
> lexer starts to look like a bottleneck when you have tokens that much
> larger than the rest of the query.
>
> Solutions seem to be either (a) make that loop quicker, or (b) find a
> way to avoid passing BLOBs through the lexer. I was merely suggesting
> that (a) should be investigated before we invest the work implied
> by (b).

I've done the following test: Ten statements of the form

    SELECT 1 FROM tab1 WHERE val = '...';

where ... are literals of length 5 - 10 MB (some random base-64 encoded
MP3 files). "tab1" was empty. The test ran 3:40 min wall-clock time.

Top ten calls:

% time | cumulative seconds | self seconds | calls | self ms/call | total ms/call | name |
---|---|---|---|---|---|---|
36.95 | 9.87 | 9.87 | 74882482 | 0.00 | 0.00 | pq_getbyte |
22.80 | 15.96 | 6.09 | 11 | 553.64 | 1450.93 | pq_getstring |
13.55 | 19.58 | 3.62 | 11 | 329.09 | 329.10 | scanstr |
12.09 | 22.81 | 3.23 | 110 | 29.36 | 86.00 | base_yylex |
4.27 | 23.95 | 1.14 | 34 | 33.53 | 33.53 | yy_get_previous_state |
3.86 | 24.98 | 1.03 | 22 | 46.82 | 46.83 | textin |
3.67 | 25.96 | 0.98 | 34 | 28.82 | 28.82 | myinput |
1.83 | 26.45 | 0.49 | 45 | 10.89 | 32.67 | yy_get_next_buffer |
0.11 | 26.48 | 0.03 | 3027 | 0.01 | 0.01 | AllocSetAlloc |
0.11 | 26.51 | 0.03 | 129 | 0.23 | 0.23 | fmgr_isbuiltin |

The string literals didn't contain any backslashes, so scanstr is
operating in the best-case scenario here. But for arbitrary binary data we
need some escape mechanism, so I don't see much room for improvement
there.

It seems the real bottleneck is the excessive abstraction in the
communications layer. I haven't looked closely at all, but it would seem
better if pq_getstring would not use pq_getbyte and instead read the
buffer directly.

--
Peter Eisentraut peter_e@gmx.net
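To illustrate the suggestion that pq_getstring read the buffer directly, here is a minimal, self-contained sketch. It is not PostgreSQL's actual code: the `RecvBuf` struct and the functions `recv_refill` and `get_string_bulk` are hypothetical names, and the "socket" is simulated by an in-memory array. The point is only that scanning each buffered chunk for the terminating NUL with `memchr` and copying it with `memcpy` replaces one function call per input byte with one pass per buffer refill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical receive buffer; a stand-in, not PostgreSQL's real structures. */
typedef struct {
    const char *src;        /* simulated incoming data (instead of a socket) */
    size_t      src_len;    /* total bytes available from the source */
    size_t      src_pos;    /* bytes already moved into buf */
    char        buf[8192];  /* receive buffer */
    size_t      ptr;        /* index of next unread byte in buf */
    size_t      len;        /* number of valid bytes in buf */
} RecvBuf;

/* Refill buf from the simulated source. Returns 0 on success, -1 on EOF. */
static int recv_refill(RecvBuf *rb)
{
    size_t n = rb->src_len - rb->src_pos;

    if (n == 0)
        return -1;
    if (n > sizeof rb->buf)
        n = sizeof rb->buf;
    memcpy(rb->buf, rb->src + rb->src_pos, n);
    rb->src_pos += n;
    rb->ptr = 0;
    rb->len = n;
    return 0;
}

/*
 * Read one NUL-terminated string, working on whole buffered chunks
 * rather than fetching a byte at a time. Returns a malloc'd string
 * (caller frees) or NULL on EOF or out-of-memory.
 */
static char *get_string_bulk(RecvBuf *rb, size_t *out_len)
{
    char   *out = NULL;
    size_t  used = 0;
    size_t  cap = 0;

    for (;;) {
        if (rb->ptr >= rb->len && recv_refill(rb) != 0) {
            free(out);
            return NULL;            /* EOF before the terminator */
        }
        assert(rb->ptr <= rb->len);

        char   *start = rb->buf + rb->ptr;
        size_t  avail = rb->len - rb->ptr;
        char   *nul = memchr(start, '\0', avail);
        size_t  chunk = nul ? (size_t) (nul - start) : avail;

        /* Grow the output geometrically so long strings stay linear-time. */
        if (used + chunk + 1 > cap) {
            size_t ncap = cap ? cap : 1024;
            while (ncap < used + chunk + 1)
                ncap *= 2;
            char *tmp = realloc(out, ncap);
            if (tmp == NULL) {
                free(out);
                return NULL;
            }
            out = tmp;
            cap = ncap;
        }

        memcpy(out + used, start, chunk);
        used += chunk;
        rb->ptr += chunk;

        if (nul != NULL) {
            rb->ptr++;              /* consume the terminating NUL */
            out[used] = '\0';
            if (out_len != NULL)
                *out_len = used;
            return out;
        }
    }
}
```

With the 8192-byte buffer assumed here, a 10 MB literal would need on the order of 1,300 refill-and-scan passes instead of roughly 10 million per-byte calls. Whether that recovers most of the time attributed to pq_getbyte in the profile above would have to be measured.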