Re: remaining sql/json patches
От | Amit Langote |
---|---|
Тема | Re: remaining sql/json patches |
Дата | |
Msg-id | CA+HiwqHGOL8a5MjfOd_jbWMontKizqjoqCf5HtqCXhNjnpUktw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: remaining sql/json patches (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: remaining sql/json patches
(Alvaro Herrera <alvherre@alvh.no-ip.org>)
|
Список | pgsql-hackers |
On Thu, Nov 23, 2023 at 4:38 AM Andres Freund <andres@anarazel.de> wrote: > On 2023-11-21 12:52:35 +0900, Amit Langote wrote: > > version gram.o text bytes %change gram.c bytes %change > > > > 9.6 534010 - 2108984 - > > 10 582554 9.09 2258313 7.08 > > 11 584596 0.35 2313475 2.44 > > 12 590957 1.08 2341564 1.21 > > 13 590381 -0.09 2357327 0.67 > > 14 600707 1.74 2428841 3.03 > > 15 633180 5.40 2495364 2.73 > > 16 653464 3.20 2575269 3.20 > > 17-sqljson 672800 2.95 2709422 3.97 > > > > So if we put SQL/JSON (including JSON_TABLE()) into 17, we end up with a gram.o 2.95% larger than v16, which grantedis a somewhat larger bump, though also smaller than with some of recent releases. > > I think it's ok to increase the size if it's necessary increases - but I also > think we've been a bit careless at times, and that that has made the parser > slower. There's probably also some "infrastructure" work we could do combat > some of the growth too. > > I know I triggered the use of the .c bytes and text size, but it'd probably > more sensible to look at the size of the important tables generated by bison. > I think the most relevant defines are: > > #define YYLAST 117115 > #define YYNTOKENS 521 > #define YYNNTS 707 > #define YYNRULES 3300 > #define YYNSTATES 6255 > #define YYMAXUTOK 758 > > > I think a lot of the reason we end up with such a big "state transition" space > is that a single addition to e.g. col_name_keyword or unreserved_keyword > increases the state space substantially, because it adds new transitions to so > many places. We're in quadratic territory, I think. We might be able to do > some lexer hackery to avoid that, but not sure. One thing I noticed when looking at the raw parsing times across versions is that they improved a bit around v12 and then some in v13: 9.0 0.000060 s 9.6 0.000061 s 10 0.000061 s 11 0.000063 s 12 0.000055 s 13 0.000054 s 15 0.000057 s 16 0.000059 s I think they might be due to the following commits in v12 and v13 resp.: commit c64d0cd5ce24a344798534f1bc5827a9199b7a6e Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Wed Jan 9 19:47:38 2019 -0500 Use perfect hashing, instead of binary search, for keyword lookup. ... Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de commit 7f380c59f800f7e0fb49f45a6ff7787256851a59 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon Jan 13 15:04:31 2020 -0500 Reduce size of backend scanner's tables. ... Discussion: https://postgr.es/m/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw@mail.gmail.com I haven't read the whole discussions there to see if the target(s) included the metrics you've mentioned though, either directly or indirectly. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: