Re: chr() function leads to OOM / killed connection with 8.1, 8.2
| От | Heikki Linnakangas |
|---|---|
| Тема | Re: chr() function leads to OOM / killed connection with 8.1, 8.2 |
| Дата | |
| Msg-id | 469FC5B8.3050609@enterprisedb.com обсуждение исходный текст |
| Ответ на | Re: chr() function leads to OOM / killed connection with 8.1, 8.2 (Tom Lane <tgl@sss.pgh.pa.us>) |
| Ответы |
Re: chr() function leads to OOM / killed connection with 8.1, 8.2
|
| Список | pgsql-bugs |
Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: >> Tom Lane wrote: >>> I can reproduce an out-of-memory condition (basically, replace() is >>> going into an infinite loop because of the invalid input) but I'm >>> not seeing any crash. > >> replace_text reads past the end of source string, byte by byte (or >> character by character, not sure), and eventually tries to read from an >> invalid address which causes a segfault. It happens here when start_posn >> == 367368. > > Hm, must be memory-layout-dependent. On mine, the output string buffer > is growing fast enough to ensure there's still RAM to read, up till the > kernel says no more. > > Anyway the problem is that pg_utf2wchar_with_len silently drops the > trailing incomplete character in its input, causing text_position_next > to think the pattern is empty, causing an infinite loop because > curr_posn never advances. replace_text already tried to guard against > empty pattern, but it doesn't know about this case. > > What I intend to do to fix this is to modify the users of > text_position_next to believe the string lengths saved by > text_position_setup, rather than using TEXTLEN() to compute > the lengths. This will effectively make replace_text and > friends consistently act as though the partial character isn't there. > > In the long run it might be better to make pg_utf2wchar_with_len throw > an error for bad input, but I'm quite unsure of the consequences of > that, in view of the existing comment "not ours to throw error". > Anyway such a potentially-significant behavioral change doesn't seem > like a good idea to back-patch. (We seem to have this bug in one form > or another clear back to 7.3...) I agree we should do the above changes for the sake of robustness, but isn't the real problem here that chr function can return invalid byte sequences? That was actually discussed a while back (starting at http://archives.postgresql.org/pgsql-hackers/2007-04/msg00010.php), but that was inconclusive. IMHO chr should at the very least not return invalid byte sequences. Limiting it to ascii range is not a bad idea either, though that might break some applications. Is there any other known loopholes to get invalid data in the database? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-bugs по дате отправления: