"memory exhausted" in query parser/simplifier for many nested parentheses
От | Niklas Hambüchen |
---|---|
Тема | "memory exhausted" in query parser/simplifier for many nested parentheses |
Дата | |
Msg-id | 161892c0-65fc-42a6-8af9-b098879decd4@nh2.me обсуждение исходный текст |
Ответы |
Re: "memory exhausted" in query parser/simplifier for many nested parentheses
Re: "memory exhausted" in query parser/simplifier for many nested parentheses |
Список | pgsql-bugs |
Hello, we noticed that when writing a large query of form (((A OR B) OR C) OR ...) with many terms (e.g. 13000) this causes one of two errors: 1. memory exhausted at or near "(" 2. stack depth limit exceeded This seems bugged/unfortunate, as writing A OR B OR C OR ... does not cause the errors. This means the query fails depending whether or not the user provided parentheses in the query. Note the query doesn't have to involve `OR`, it can be anyting involving many parentheses. I tracked down the cause of the two errors: * In the Bison parser: #define YYMAXDEPTH 10000 http://git.savannah.gnu.org/cgit/bison.git/tree/src/parse-gram.c?h=v3.8.2#n1376 This causes error: ERROR: memory exhausted at or near \"(\" * In the expression simplifier: simplify_function() at clauses.c:3921 calls expression_tree_mutator() at nodeFuncs.c:3117 calls eval_const_expressions_mutator() at clauses.c:2449 calls simplify_function() This causes error: stack depth limit exceeded from https://github.com/postgres/postgres/blob/REL_14_12/src/backend/tcop/postgres.c#L3481-L3498 A minimal repro for this can be found in: https://github.com/questdb/questdb/issues/3841#issuecomment-2536599109 (I'm not affiilated with QuestDB, it seems to be a database also using Postgres's parser.) Repeating the repro here: #!/usr/bin/env python3 import sys N = int(sys.argv[1]) exp = "" for i in range(N): exp += "lower(" exp += "'ABCDEFGHI'" for i in range(N): exp += ")" assert exp.count("(") == exp.count(")") print(exp) Example invocation: psql -h localhost -p 5432 postgres -c "select $(python3 expression.py 10000); select 1" For playing with the limits in Postgres, the `YYMAXDEPTH 10000` constant can be changed by manually changing it `gram.c`(generated from `gram.y`). Both above problems stem from using a recursive descent parser in a language with limited stack size (C, or Bison's manualstack implemented with `goto`). The Bison parser runs before the expression simplifier. So if the expression is too large, we get `memory exhausted`. If the expression is small enough to make it to the simplifer, we get `stack depth limit exceeded` if it's too large forthe simplier. The simplifier stack limit can be adjusted with max_stack_depth = 2MB (the default) max_stack_depth = 7680kB (possible maximum) and this maximum is of course still too low if we have tens of terms. We consider this a bug because it means one cannot easily generate large queries that query a couple thousand entries ina `SELECT ... WHERE` condition. It would be great if postgres can alleviate these limits, or at least succeed to parse everything that's written with parenthesesequally well when written with parentheses. Thank you! Niklas & Ruben
В списке pgsql-bugs по дате отправления: