Re: [PATCH] Add pretty-printed XML output option
From | Jim Jones |
---|---|
Subject | Re: [PATCH] Add pretty-printed XML output option |
Date | |
Msg-id | abd25443-ef6d-7b8a-c593-a2a991d3e5ce@uni-muenster.de |
In reply to | Re: [PATCH] Add pretty-printed XML output option (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: [PATCH] Add pretty-printed XML output option |
List | pgsql-hackers |
On 14.03.23 18:40, Tom Lane wrote:
> Jim Jones <jim.jones@uni-muenster.de> writes:
>> [ v22-0001-Add-pretty-printed-XML-output-option.patch ]
> I poked at this for awhile and ran into a problem that I'm not sure
> how to solve: it misbehaves for input with embedded DOCTYPE.
>
> regression=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' as text indent);
>  xmlserialize
> --------------
>  <!DOCTYPE a>+
>  <a></a>     +
>
> (1 row)

The issue was the flag XML_SAVE_NO_EMPTY. It was forcing empty elements
to be serialized as start-end tag pairs. Removing it did the trick ...

postgres=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' AS text INDENT);
 xmlserialize
--------------
 <!DOCTYPE a>+
 <a/>        +

(1 row)

... but as a side effect, empty start-end tag pairs will now be
serialized as empty elements:

postgres=# SELECT xmlserialize(CONTENT '<foo><bar></bar></foo>' AS text INDENT);
 xmlserialize
--------------
 <foo>       +
   <bar/>    +
 </foo>
(1 row)

This seems to be the standard behavior of other XML indent tools
(including Oracle).

> regression=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' as text indent);
>  xmlserialize
> --------------
>
> (1 row)
>
> The bad result for CONTENT is because xml_parse() decides to
> parse_as_document, but xmlserialize_indent has no idea that happened
> and tries to use the content_nodes list anyway. I don't especially
> care for the laissez faire "maybe we'll set *content_nodes and maybe
> we won't" API you adopted for xml_parse, which seems to be contributing
> to the mess. We could pass back more info so that xmlserialize_indent
> knows what really happened.

I added a new (nullable) parameter to the xml_parse function that
returns the actual XmlOptionType used to parse the XML data. Now
xmlserialize_indent knows how the data was really parsed:

postgres=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' AS text INDENT);
 xmlserialize
--------------
 <!DOCTYPE a>+
 <a/>        +

(1 row)

I added test cases for these queries.

v23 attached.

Thanks!

Best, Jim
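The nullable out-parameter described in the message can be illustrated with a small standalone sketch. This is not the code from the v23 patch: parse_sketch and serialize_path_sketch are hypothetical names, the enum is a local stand-in for XmlOptionType, and the DOCTYPE check is a deliberately naive prefix test rather than real XML parsing. It only shows the shape of the idea, namely that the parser reports which mode it actually used and the caller branches on that report instead of on the requested mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the two parse modes discussed in the thread. */
typedef enum
{
    XMLOPTION_DOCUMENT,
    XMLOPTION_CONTENT
} XmlOptionType;

/*
 * Hypothetical stand-in for a parser with a nullable out-parameter.
 * Assumption for illustration: input requested as CONTENT that begins
 * with a DOCTYPE declaration is parsed as a document instead.
 * Returns false for empty input, true otherwise.
 */
bool
parse_sketch(const char *data, XmlOptionType requested,
             XmlOptionType *parsed_as)
{
    XmlOptionType actual = requested;

    assert(data != NULL);
    if (data[0] == '\0')
        return false;

    /* Naive prefix test; a real parser would skip the XML declaration etc. */
    if (requested == XMLOPTION_CONTENT &&
        strncmp(data, "<!DOCTYPE", 9) == 0)
        actual = XMLOPTION_DOCUMENT;

    /* Callers that do not care about the actual mode may pass NULL. */
    if (parsed_as != NULL)
        *parsed_as = actual;

    return true;
}

/*
 * Hypothetical caller: picks the serialization path from the mode the
 * parser reports, not from the mode that was requested.
 */
const char *
serialize_path_sketch(const char *data, XmlOptionType requested)
{
    XmlOptionType parsed_as;

    if (!parse_sketch(data, requested, &parsed_as))
        return "error";

    return (parsed_as == XMLOPTION_DOCUMENT) ? "document" : "content";
}
```

Keeping the parameter nullable means existing callers that have no use for the actual mode can simply pass NULL and stay unchanged.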
Attachments