Discussion: ERRORDATA_STACK_SIZE exceeded (server crash)
Hi,

I have encountered a server crash while working with different locale
settings. After searching on the internet I found a similar issue
reported against the 8.3.1 release, which Tom has fixed. That bug
occurred only on Windows, but I am getting the same server crash on
Linux, although I am using a later release with the patch applied.

Here is the link to the previous bug:

http://archives.postgresql.org/pgsql-committers/2008-05/msg00349.php

OS = ubuntu
PG Version = psql (8.4devel)

postgres=# show server_encoding;
 server_encoding
-----------------
 UTF8
(1건 있음)

postgres=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1건 있음)

postgres=# set client_encoding ='euc-jp';
SET
postgres=# x;
서버가 갑자기 연결을 닫았음
        이런 처리는 클라이언트의 요구를 처리하는 동안이나
        처리하기 전에 서버가 갑자기 종료되었음을 의미함
서버로부터 연결이 끊어졌습니다. 다시 연결을 시도합니다: 실패.
!>

--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
Ibrar Ahmed wrote:
> I have encountered a server crash while working with different locale
> settings. After searching on the internet I found a similar issue
> reported against the 8.3.1 release, which Tom has fixed. That bug
> occurred only on Windows, but I am getting the same server crash on
> Linux, although I am using a later release with the patch applied.
>
> Here is the link to the previous bug:
>
> http://archives.postgresql.org/pgsql-committers/2008-05/msg00349.php
>
> OS = ubuntu
> PG Version = psql (8.4devel)
>
> postgres=# show server_encoding;
>  server_encoding
> -----------------
>  UTF8
> (1건 있음)
>
> postgres=# show client_encoding;
>  client_encoding
> -----------------
>  UTF8
> (1건 있음)
>
> postgres=# set client_encoding ='euc-jp';
> SET
> postgres=# x;
> 서버가 갑자기 연결을 닫았음
>         이런 처리는 클라이언트의 요구를 처리하는 동안이나
>         처리하기 전에 서버가 갑자기 종료되었음을 의미함
> 서버로부터 연결이 끊어졌습니다. 다시 연결을 시도합니다: 실패.
> !>

Do you have a core dump, stack trace, or any other details?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
"Ibrar Ahmed" <ibrar.ahmad@gmail.com> writes:
> I have encountered a server crash while working with different locale
> settings.
Are you going to give us a hint what settings those would be?
regards, tom lane
Sure!

CODE
------
./configure --enable-nls --enable-depend --enable-debug
make
make install

SERVER SIDE
-----------------
1 - export LANG=ko_KR.UTF-8
2 - ./initdb -E UTF8 -D ../data
3 - ./postmaster -D ../data

CLIENT SIDE
---------------
1 - export LANG=ko_KR.UTF-8
2 - psql postgres

postgres=# show server_encoding;
 server_encoding
-----------------
 UTF8
(1건 있음)

postgres=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1건 있음)

postgres=# set client_encoding ='euc-jp'; --[<<<--Negative test scenario]
SET
postgres=# x;

On Mon, Oct 27, 2008 at 6:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Ibrar Ahmed" <ibrar.ahmad@gmail.com> writes:
>> I have encountered a server crash while working with different locale
>> settings.
>
> Are you going to give us a hint what settings those would be?
>
> regards, tom lane
>

--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com
"Ibrar Ahmed" <ibrar.ahmad@gmail.com> writes:
> 1 - export LANG=ko_KR.UTF-8
Hmph ... I can reproduce that on Fedora 9. It seems the problem is
that that translation is full of characters that don't exist in EUC-JP;
in particular the translations of both "ERROR" and "PANIC" contain
untranslatable characters. This means that every time we go to send
a message to the client, we get a recursive error trap.
The "ERRORDATA_STACK_SIZE exceeded" message is intentionally not exposed
to gettext translation, in hopes of stopping this problem, but that
doesn't help much when the PANIC message is exposed :-(.
So one thing we might try to do about it is to intentionally not allow
translation of PANIC (at line 2446 of elog.c). However, that only gets
us down from a stack-overflow crash to a PANIC, which is just about as
bad from a reliability standpoint.
I think the only permanent solution to this class of problem is going
to be something like this:
* When we hit the stack depth overflow PANIC situation in elog.c,
disable gettext so that the error will always be reported in ASCII.
* Reduce the PANIC to a FATAL so that misconfiguration of this sort
just kills the one session and doesn't cause a database crash.
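As a rough illustration of the recursive error trap and of the two-step fallback proposed above, here is a toy C model. It is not elog.c code: every name prefixed with toy_ is invented for this sketch, the limit of 5 is only an illustrative stand-in for ERRORDATA_STACK_SIZE, and "convertible to the client encoding" is simplified to "contains only 7-bit ASCII".

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative limit; an assumed stand-in for ERRORDATA_STACK_SIZE. */
#define TOY_STACK_SIZE 5

enum toy_outcome
{
	TOY_DELIVERED,			/* message reached the client */
	TOY_SERVER_CRASH,		/* overflow report could not be sent either */
	TOY_SESSION_KILLED		/* overflow reported in ASCII, session ends */
};

/* Toy conversion check: pretend the client accepts only 7-bit ASCII. */
static bool
toy_convertible(const char *s)
{
	for (; *s != '\0'; s++)
		if ((unsigned char) *s >= 0x80)
			return false;
	return true;
}

/*
 * Try to send a message whose severity tag is translated_tag. A failed
 * conversion is itself an error that must be sent, so we recurse.
 */
static enum toy_outcome
toy_report(const char *translated_tag, bool proposed_fallback, int depth)
{
	assert(depth >= 0);

	if (depth >= TOY_STACK_SIZE)
	{
		/*
		 * Overflow. With the proposal, the report is plain ASCII and only
		 * this session ends (FATAL). Without it, the overflow report is
		 * translated as well, cannot be sent, and the server goes down.
		 */
		return proposed_fallback ? TOY_SESSION_KILLED : TOY_SERVER_CRASH;
	}

	if (toy_convertible(translated_tag))
		return TOY_DELIVERED;

	/* Conversion failed: report that failure, one level deeper. */
	return toy_report(translated_tag, proposed_fallback, depth + 1);
}
```

Calling toy_report() with a tag that contains non-ASCII bytes always runs into the limit, because every attempt to report the conversion failure needs the same tag again. The proposal does not stop the recursion early; it only changes what happens once the limit is hit.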
This is still not very nice because what the user would get is
a complaint about ERRORDATA_STACK_SIZE exceeded with no hint that
he's got an encoding problem. It'd be better if we could get the
disable-gettext-and-FATAL-out behavior to apply to the "character
has no equivalent" error message, but I'm not sure how we do that
without bollixing up less-critical occurrences of that message.
Thoughts?
regards, tom lane
I wrote:
> This is still not very nice because what the user would get is
> a complaint about ERRORDATA_STACK_SIZE exceeded with no hint that
> he's got an encoding problem. It'd be better if we could get the
> disable-gettext-and-FATAL-out behavior to apply to the "character
> has no equivalent" error message, but I'm not sure how we do that
> without bollixing up less-critical occurrences of that message.
After poking around a bit I decided that this could be done in a not
horrendously ugly way if we are willing to make a couple more places
know about escaping from error recursion situations. Attached is a
proposed patch that prevents the crash shown previously. BTW, a better
stress test for this is to set LANG = tr_TR.utf8, client_encoding =
latin1, and then try "select E'\305\237';". That's because the
"character has no equivalent" message isn't itself translated in the
present ko translation, but it is in the tr translation. My first-cut
patch worked for the ko case and not the tr case :-(
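To see why E'\305\237' triggers the "character has no equivalent" error with client_encoding = latin1, the two octal bytes can be decoded by hand. The sketch below is a standalone illustration, not PostgreSQL code; the toy_ function names are invented here, and the decoder handles only 1- and 2-byte sequences and does not reject overlong forms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decode one 1- or 2-byte UTF-8 sequence (enough for this example).
 * Returns the code point, or -1 if the input is not such a sequence.
 */
static long
toy_decode_utf8_short(const unsigned char *s, size_t len)
{
	assert(s != NULL);

	if (len == 1 && s[0] < 0x80)
		return s[0];
	if (len == 2 && (s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80)
		return ((long) (s[0] & 0x1F) << 6) | (s[1] & 0x3F);
	return -1;
}

/* LATIN1 (ISO 8859-1) covers exactly the code points U+0000..U+00FF. */
static bool
toy_fits_latin1(long codepoint)
{
	return codepoint >= 0 && codepoint <= 0xFF;
}
```

The bytes \305 \237 are 0xC5 0x9F, which decode to U+015F (s with cedilla). That is above U+00FF, so the character cannot be sent to a latin1 client, and the resulting error message then has to be converted in turn.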
One thing that is still a bit ugly about this patch is the hack in
wchar.c to ensure that the "character has no equivalent" message
doesn't get translated:
* ... Note that we have to
* spell the message slightly differently, which we do by sticking a
* space on the end --- using errmsg_internal() doesn't actually keep
* elog.c from calling gettext, it only prevents the string from being
* entered into the translation lists.
It might be better to modify elog.c so that errmsg_internal really
doesn't call gettext. This would require kluging up EVALUATE_MESSAGE()
a bit, so I'm not quite sure which is cleaner. Thoughts?
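The trailing-space hack works because gettext looks a message up by its exact msgid and returns the input unchanged when nothing matches. A minimal sketch of that lookup rule, using an invented one-entry catalog and a hypothetical toy_gettext() rather than the real library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_entry
{
	const char *msgid;		/* untranslated source string */
	const char *msgstr;		/* its translation */
};

/* Hypothetical catalog; the translation text is only a placeholder. */
static const struct toy_entry toy_catalog[] = {
	{"character has no equivalent", "<translated text>"},
};

/* Return the translation on an exact msgid match, else the input itself. */
static const char *
toy_gettext(const char *msgid)
{
	size_t		i;

	assert(msgid != NULL);

	for (i = 0; i < sizeof(toy_catalog) / sizeof(toy_catalog[0]); i++)
		if (strcmp(toy_catalog[i].msgid, msgid) == 0)
			return toy_catalog[i].msgstr;
	return msgid;
}
```

Here toy_gettext("character has no equivalent") returns the placeholder translation, while the same string with a space appended misses the catalog and comes back untranslated. The alternative discussed above, making errmsg_internal skip the lookup entirely, would not depend on such a spelling difference.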
regards, tom lane
Index: src/backend/utils/error/elog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/error/elog.c,v
retrieving revision 1.208
diff -c -r1.208 elog.c
*** src/backend/utils/error/elog.c 17 Oct 2008 22:56:16 -0000 1.208
--- src/backend/utils/error/elog.c 27 Oct 2008 16:16:28 -0000
***************
*** 149,154 ****
--- 149,169 ----
static void setup_formatted_log_time(void);
static void setup_formatted_start_time(void);
+
+ /*
+ * in_error_recursion_trouble --- are we at risk of infinite error recursion?
+ *
+ * This function exists to provide common control of various fallback steps
+ * that we take if we think we are facing infinite error recursion. See the
+ * callers for details.
+ */
+ bool
+ in_error_recursion_trouble(void)
+ {
+ /* Pull the plug if recurse more than once */
+ return (recursion_depth > 2);
+ }
+
/*
* errstart --- begin an error-reporting cycle
*
***************
*** 261,272 ****
MemoryContextReset(ErrorContext);
/*
! * If we recurse more than once, the problem might be something broken
* in a context traceback routine. Abandon them too. We also abandon
* attempting to print the error statement (which, if long, could
* itself be the source of the recursive failure).
*/
! if (recursion_depth > 2)
{
error_context_stack = NULL;
debug_query_string = NULL;
--- 276,287 ----
MemoryContextReset(ErrorContext);
/*
! * Infinite error recursion might be due to something broken
* in a context traceback routine. Abandon them too. We also abandon
* attempting to print the error statement (which, if long, could
* itself be the source of the recursive failure).
*/
! if (in_error_recursion_trouble())
{
error_context_stack = NULL;
debug_query_string = NULL;
***************
*** 2408,2413 ****
--- 2423,2432 ----
/*
* error_severity --- get localized string representing elevel
+ *
+ * Note: in an error recursion situation, we stop localizing the tags
+ * for ERROR and above. This is necessary because the problem might be
+ * failure to convert one of these strings to the client encoding.
*/
static const char *
error_severity(int elevel)
***************
*** 2437,2449 ****
prefix = _("WARNING");
break;
case ERROR:
! prefix = _("ERROR");
break;
case FATAL:
! prefix = _("FATAL");
break;
case PANIC:
! prefix = _("PANIC");
break;
default:
prefix = "???";
--- 2456,2477 ----
prefix = _("WARNING");
break;
case ERROR:
! if (in_error_recursion_trouble())
! prefix = "ERROR";
! else
! prefix = _("ERROR");
break;
case FATAL:
! if (in_error_recursion_trouble())
! prefix = "FATAL";
! else
! prefix = _("FATAL");
break;
case PANIC:
! if (in_error_recursion_trouble())
! prefix = "PANIC";
! else
! prefix = _("PANIC");
break;
default:
prefix = "???";
Index: src/backend/utils/mb/wchar.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/mb/wchar.c,v
retrieving revision 1.66
diff -c -r1.66 wchar.c
*** src/backend/utils/mb/wchar.c 15 Nov 2007 21:14:40 -0000 1.66
--- src/backend/utils/mb/wchar.c 27 Oct 2008 16:16:28 -0000
***************
*** 1567,1578 ****
for (j = 0; j < jlimit; j++)
p += sprintf(p, "%02x", (unsigned char) mbstr[j]);
! ereport(ERROR,
! (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
! errmsg("character 0x%s of encoding \"%s\" has no equivalent in \"%s\"",
! buf,
! pg_enc2name_tbl[src_encoding].name,
! pg_enc2name_tbl[dest_encoding].name)));
}
#endif
--- 1567,1595 ----
for (j = 0; j < jlimit; j++)
p += sprintf(p, "%02x", (unsigned char) mbstr[j]);
! /*
! * In an error recursion situation, don't try to translate the message.
! * This gets us out of trouble if the problem is failure to convert
! * the translated message to the client encoding. Note that we have to
! * spell the message slightly differently, which we do by sticking a
! * space on the end --- using errmsg_internal() doesn't actually keep
! * elog.c from calling gettext, it only prevents the string from being
! * entered into the translation lists.
! */
! if (in_error_recursion_trouble())
! ereport(ERROR,
! (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
! errmsg_internal("character 0x%s of encoding \"%s\" has no equivalent in \"%s\" ",
! buf,
! pg_enc2name_tbl[src_encoding].name,
! pg_enc2name_tbl[dest_encoding].name)));
! else
! ereport(ERROR,
! (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
! errmsg("character 0x%s of encoding \"%s\" has no equivalent in \"%s\"",
! buf,
! pg_enc2name_tbl[src_encoding].name,
! pg_enc2name_tbl[dest_encoding].name)));
}
#endif
Index: src/include/utils/elog.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/elog.h,v
retrieving revision 1.96
diff -c -r1.96 elog.h
*** src/include/utils/elog.h 9 Oct 2008 22:22:31 -0000 1.96
--- src/include/utils/elog.h 27 Oct 2008 16:16:28 -0000
***************
*** 324,329 ****
--- 324,330 ----
/* Other exported functions */
extern void DebugFileOpen(void);
extern char *unpack_sql_state(int sql_state);
+ extern bool in_error_recursion_trouble(void);
#ifdef HAVE_SYSLOG
extern void set_syslog_parameters(const char *ident, int facility);
Tom Lane escribió: > It might be better to modify elog.c so that errmsg_internal really > doesn't call gettext. This would require kluging up EVALUATE_MESSAGE() > a bit, so I'm not quite sure which is cleaner. Thoughts? I think we document somewhere that a translator can add errmsg_internal to the list of gettext triggers in nls.mk, so I agree that we should making sure it doesn't call gettext at all, just to be sure some overly zealous translator gets past the barrier. (Also, actually adding errmsg_internal is pretty pointless, because the translated messages would be lost when the catalog is passed through our new translation status web system). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> It might be better to modify elog.c so that errmsg_internal really
>> doesn't call gettext. This would require kluging up EVALUATE_MESSAGE()
>> a bit, so I'm not quite sure which is cleaner. Thoughts?
> I think we document somewhere that a translator can add errmsg_internal
> to the list of gettext triggers in nls.mk, so I agree that we should
> make sure it doesn't call gettext at all, just in case some overly
> zealous translator gets past the barrier.
Yeah, I agree --- will do it that way.
regards, tom lane