Re: [PATCH] Optimize json_lex_string by batching character copying
От | John Naylor |
---|---|
Тема | Re: [PATCH] Optimize json_lex_string by batching character copying |
Дата | |
Msg-id | CAFBsxsGhaR2KQ5eisaK=6Vm60t=axhD8Ckj1qFoCH1pktZi+2w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Optimize json_lex_string by batching character copying (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: [PATCH] Optimize json_lex_string by batching character copying
|
Список | pgsql-hackers |
On Sat, Jun 25, 2022 at 8:05 AM Andres Freund <andres@anarazel.de> wrote: > > I tried your patch with: > > > > DROP TABLE IF EXISTS json_as_text; > > CREATE TABLE json_as_text AS SELECT (SELECT json_agg(row_to_json(pd)) as t FROM pg_description pd) FROM generate_series(1,100); > > VACUUM FREEZE json_as_text; > > > > SELECT 1 FROM json_as_text WHERE jsonb_typeof(t::jsonb) = 'not me'; > > > > Which the patch improves from 846ms to 754ms (best of three). A bit smaller > > than your improvement, but still nice. > > > > > > I think your patch doesn't quite go far enough - we still end up looping for > > each character, have the added complication of needing to flush the > > "buffer". I'd be surprised if a "dedicated" loop to see until where the string > > last isn't faster. That then obviously could be SIMDified. > > A naive implementation (attached) of that gets me down to 706ms. Taking this a step further, I modified json_lex and json_lex_string to use a const end pointer instead of maintaining the length (0001). The observed speedup is small enough that it might not be real, but the code is simpler this way, and it makes 0002 and 0003 easier to reason about. Then I modified your patch to do the same (0002). Hackish SSE2 support is in 0003. To exercise the SIMD code a bit, I added a second test: DROP TABLE IF EXISTS long_json_as_text; CREATE TABLE long_json_as_text AS with long as ( select repeat(description, 10) from pg_description pd ) SELECT (select json_agg(row_to_json(long)) as t from long) from generate_series(1, 100); VACUUM FREEZE long_json_as_text; SELECT 1 FROM long_json_as_text WHERE jsonb_typeof(t::jsonb) = 'not me'; With this, I get (turbo disabled, min of 3): short test: master: 769ms 0001: 751ms 0002: 703ms 0003: 701ms long test; master: 939ms 0001: 883ms 0002: 537ms 0003: 439ms I think 0001/2 are mostly in committable shape. With 0003, I'd want to make the portability check a bit nicer and more centralized. I'm thinking of modifying the CRC check to report that the host cpu/compiler understands SSE4.2 x86 intrinsics, and then the compile time SSE2 check can piggyback on top of that without a runtime check. This is conceptually easy but a bit of work to not look like a hack (which probably means the ARM CRC check should look more generic somehow...). The regression tests will likely need some work as well. > Separately, it seems pretty awful efficiency / code density wise to have the > NULL checks for ->strval all over. Might be worth forcing json_lex() and > json_lex_string() to be inlined, with a constant parameter deciding whether > ->strval is expected. That'd likely be enough to get the compiler specialize > the code for us. I had a look at this but it's a bit more invasive than I want to devote time to at this point. -- John Naylor EDB: http://www.enterprisedb.com
Вложения
В списке pgsql-hackers по дате отправления: