Discussion: BUG #15342: pg_dump - XML with mixed content types generates invalid backup file

BUG #15342: pg_dump - XML with mixed content types generates invalid backup file

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15342
Logged by:          Ryan Lambert
Email address:      ryan@rustprooflabs.com
PostgreSQL version: 9.6.7
Operating system:   Ubuntu 16; Ubuntu 18; Raspbian (Pi)
Description:

Greetings!  

It seems that `pg_dump` is unable to produce reliable database backups
when the data includes specific combinations of XML.  The following SQL Fiddle
creates a table with three rows of XML data.  The first row, "Document, no
DOCTYPE", is the only one of the three that will always load from a backup
made with `pg_dump`.  I've tried this on a few minor versions of 9.6 and 9.5.

http://sqlfiddle.com/#!17/78a83/1/0

The second row added includes a DOCTYPE declaration in the XML.  Restoring
this row from pg_dump will fail unless you add `SET XML OPTION DOCUMENT;`.
Trying to restore the pg_dump file without adding `SET XML OPTION DOCUMENT`
returns:

```
ERROR:  invalid XML content
DETAIL:  line 2: StartTag: invalid element name
  <!DOCTYPE document SYSTEM "subjects.dtd">
   ^
CONTEXT:  COPY xml_doc, line 2, column data: "<?xml version="1.0"
standalone="no"?>
  <!DOCTYPE document SYSTEM "subjects.dtd">
  <document>
    <..."
```

The third row restores with the default setting but fails if `SET XML OPTION
DOCUMENT;` is set.

```
ERROR:  invalid XML document
DETAIL:  line 1: Start tag expected, '<' not found
abc<foo>bar</foo><bar>foo</bar>
^
CONTEXT:  COPY xml_doc, line 3, column data:
"abc<foo>bar</foo><bar>foo</bar>"
```

So it seems that if you have XML data that includes <!DOCTYPE> and other XML
that is just fragments... pg_dump won't work without manual tinkering and
headaches.
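
To make the conflict concrete, here is a rough model of the two modes written in Python rather than PostgreSQL. It is only a sketch of the behaviour described above, not the server's actual parser: `parses_as_document`, `parses_as_content`, and the three sample strings are made up for illustration (the real rows are in the fiddle), and the rule "CONTENT rejects any DOCTYPE" is taken from the errors shown in this report.

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0"?>.
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>")


def parses_as_document(value: str) -> bool:
    """Model of DOCUMENT mode: the value must be one well-formed document."""
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False


def parses_as_content(value: str) -> bool:
    """Model of CONTENT mode as reported here: fragments are accepted,
    but a DOCTYPE declaration is rejected (assumption based on the errors above)."""
    if "<!DOCTYPE" in value:
        return False
    body = XML_DECL.sub("", value, count=1)
    try:
        # Wrapping in a dummy root lets text and sibling elements parse as a fragment.
        ET.fromstring("<wrapper>" + body + "</wrapper>")
        return True
    except ET.ParseError:
        return False


# Hypothetical stand-ins for the three rows; not the exact fiddle data.
SAMPLES = {
    "document, no DOCTYPE": "<document><title>plain</title></document>",
    "document with DOCTYPE": (
        '<?xml version="1.0" standalone="no"?>\n'
        '<!DOCTYPE document SYSTEM "subjects.dtd">\n'
        "<document><title>typed</title></document>"
    ),
    "fragment": "abc<foo>bar</foo><bar>foo</bar>",
}

for label, value in SAMPLES.items():
    print(label, "DOCUMENT:", parses_as_document(value),
          "CONTENT:", parses_as_content(value))
```

Under this model only the first sample passes both checks, which matches what I see on restore: a table holding the second and third kind of row together cannot be loaded with a single xmloption value.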

The specific data I use that is hanging me up is the QGIS layer style data
(stored in `public.layer_styles`).


Re: BUG #15342: pg_dump - XML with mixed content types generates invalid backup file

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> It seems that `pg_dump` is unable to provide a reliable database backups
> that include specific combinations of XML data.  The following SQL Fiddle
> creates a table with three rows of XML data.  The first row, "Document, no
> DOCTYPE" is the only row of the three that will always load from a backup
> from `pg_dump`.  I've tried this one a few sub-versions of 9.6 and 9.5.  

Hm.  So there are two problems here: pg_dump neglects to force a safe
value of xmloption for the restore step, plus there doesn't seem to be
a safe value for it to force :-(.  The first part of that is trivial
to fix, the second perhaps not so much.  However, the fine manual quoth
(in 8.13 XML Type)

    SET xmloption TO { DOCUMENT | CONTENT };

    The default is CONTENT, so all forms of XML data are allowed.

which makes it seem that the CONTENT setting was intended to work for
this.  Perhaps somebody just got overenthusiastic about throwing errors?

            regards, tom lane