Re: BUG #15420: Server crash. Segmentation fault when parsing xml file
From | Sergey Mirvoda |
---|---|
Subject | Re: BUG #15420: Server crash. Segmentation fault when parsing xml file |
Date | |
Msg-id | CALkWAriUN-6GsYyURvAB5f5+HsDbb_bx1YgsXMjs0xsMvCd-xQ@mail.gmail.com |
In response to | Re: BUG #15420: Server crash. Segmentation fault when parsing xml file (Andrew Gierth <andrew@tao11.riddles.org.uk>) |
Responses | Re: BUG #15420: Server crash. Segmentation fault when parsing xml file |
List | pgsql-bugs |
On Fri, Oct 5, 2018 at 10:08 AM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>> "Andrey" == Andrey Borodin <x4mmm@yandex-team.ru> writes:
>> You're sure about that libxml2 version? I can reproduce a crash on
>> 2.9.4, but have as yet failed to do so on 2.9.7 (fails with an error
>> message instead)
Andrey> You are right, there was default 2.9.4 from OS, and 2.9.4 from
Andrey> brew was not used.
Andrey> x4mmm-osx:pgsql x4mmm$ xmllint --version
Andrey> xmllint: using libxml version 20904
I have a complete diagnosis of why it crashes on 2.9.4, and I can see
why it does not crash the same way on 2.9.7, but I would not bet
anything on 2.9.7 not having some comparable issue.
What happens on 2.9.4 is this (this is all inside libxml2):
- at some point when parsing an element tag, the code decides to raise
a fatal error and call xmlHaltParser
- xmlHaltParser works by resetting the input buffer's "base" and "cur"
pointers to point to a literal "" in the code (thus, a null byte)
- xmlParseStartTag2 detects that input->base has changed, and assumes
that this is because the buffer got reallocated; in the process of
dealing with this, it resets input->cur to input->base + cur where
"cur" is a local variable holding the previous offset in the buffer
(which is now of course nonsense, so input->cur points into the
weeds)
- something later tries to access the byte at *input->cur and likely
crashes (depending on many random factors, including load addresses
of shared libraries and where in the buffer the original error was
detected)
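The sequence above can be sketched as a small toy model. This is not libxml2 code: the toy_input struct, its len field, and every toy_* function are hypothetical names made up for illustration, and only the base/cur behaviour described above is modelled. The sketch compares offsets instead of forming the bad pointer, so it shows the stale offset without actually crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the parser input; not the libxml2 struct. */
typedef struct {
    const char *base;  /* start of the input buffer */
    const char *cur;   /* current read position */
    size_t len;        /* readable bytes starting at base (incl. NUL) */
} toy_input;

/* Models the literal "" that the halt routine points base and cur at. */
static const char toy_halted[] = "";

static void toy_halt(toy_input *in)
{
    in->base = toy_halted;
    in->cur = toy_halted;
    in->len = sizeof toy_halted;  /* one byte: just the NUL */
}

/*
 * Returns the offset from in->base at which reading would resume.
 * Models the 2.9.4-style fixup: a changed base is taken to mean the
 * buffer was reallocated, so the previously saved offset is reused.
 */
static size_t toy_resume_offset(toy_input *in, int fatal_error)
{
    const char *old_base = in->base;
    size_t cur = (size_t)(in->cur - in->base);  /* saved offset */

    if (fatal_error)
        toy_halt(in);  /* swaps base and cur for the "" sentinel */

    if (in->base != old_base)
        return cur;    /* stale offset applied to the new base */
    return (size_t)(in->cur - in->base);
}

static int toy_offset_is_readable(const toy_input *in, size_t off)
{
    return off < in->len;
}

/* Returns 1 if the resumed position lies outside the halted buffer. */
static int toy_demo(void)
{
    static const char doc[] = "<a><b attr='x'";
    toy_input in = { doc, doc + 7, sizeof doc };
    size_t off = toy_resume_offset(&in, 1);

    assert(in.base == toy_halted);
    printf("resume offset %zu, readable bytes %zu\n", off, in.len);
    return !toy_offset_is_readable(&in, off);
}
```

Calling toy_demo() from main reports a resume offset of 7 against a buffer with 1 readable byte. In the real parser that offset is added to the pointer and later dereferenced, which is where the crash would come from.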
Between 2.9.4 and 2.9.7 xmlParseStartTag2 was changed to handle buffer
reallocations differently so it doesn't fail the same way (it no longer
tries to modify input->cur). But there are so many ways that this error
path can screw itself up that I honestly would not trust it for one
second.
--
Andrew (irc:RhodiumToad)
Sorry for top-posting and for the spelling; T9 and mobile Gmail are not very usable.
Some notes.
If I set xmloption to document,
this code works as expected:
postgres=# select d::xml from convert_from(pg_read_binary_file('EGRUL_FULL_2018-01-01_X.XML'),'windows-1251') g(d);
....
postgres=# select xml_is_well_formed(d) from convert_from(pg_read_binary_file('EGRUL_FULL_2018-01-01_X.XML'),'windows-1251') g(d);
xml_is_well_formed
--------------------
t
(1 row)
postgres=# select xml_is_well_formed(d) from convert_from(pg_read_binary_file('EGRUL_FULL_2018-01-01_X.XML'),'windows-1251') g(d);
xml_is_well_formed
--------------------
t
(1 row)
But all other XML functions still crash the server,
for example:
postgres=# select xpath_exists('//СвЮЛ'::text,d::xml) from convert_from(pg_read_binary_file('egrul/EGRUL_FULL_2018-01-01_X.XML'),'windows-1251') g(d);
--
Regards, Sergey Mirvoda