Re: XML Issue with DTDs
From | Florian Pflug |
---|---|
Subject | Re: XML Issue with DTDs |
Date | |
Msg-id | AE499D25-0910-4CFD-AF98-D6103918495E@phlo.org |
In response to | Re: XML Issue with DTDs (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Dec23, 2013, at 03:45 , Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 20, 2013 at 8:16 PM, Florian Pflug <fgp@phlo.org> wrote:
>> On Dec20, 2013, at 18:52 , Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Thu, Dec 19, 2013 at 6:40 PM, Florian Pflug <fgp@phlo.org> wrote:
>>>> Solving this seems a bit messy, unfortunately. First, I think we need to
>>>> have some XMLOPTION value which is a superset of all the others - otherwise,
>>>> dump & restore won't work reliably. That means either allowing DTDs if
>>>> XMLOPTION is CONTENT, or inventing a third XMLOPTION, say ANY.
>>>
>>> Or we can just decide that it was a bug that this was ever allowed,
>>> and if you upgrade to $FIXEDVERSION you'll need to sanitize your data.
>>> This is roughly what we did with encoding checks.
>>
>> What exactly do you suggest we outlaw?
>
> <!DOCTYPE> anywhere but at the beginning.

I think we're talking past one another here. Fixing XMLCONCAT/XMLAGG to not
produce XML values which are neither valid DOCUMENTS nor valid CONTENT fixes
*one* part of the problem. The other part of the problem is that since not
every DOCUMENT is valid CONTENT (because CONTENT forbids DTDs) and not every
CONTENT is a valid DOCUMENT (because DOCUMENT forbids multiple root nodes),
it's impossible to set XMLOPTION to a value which accepts *all* valid XML
values. That breaks pg_dump/pg_restore.

To fix this, we must provide a way to insert XML data which accepts both
DOCUMENTS and CONTENT, and not only one or the other. Due to the way COPY
works, we cannot call a special conversion function, so we must modify the
input functions.

My initial thought was to simply allow XML values which are CONTENT, not
DOCUMENTS, to contain a DTD (at the beginning), thus making CONTENT a superset
of DOCUMENT. But I've since realized that the 2003 standard explicitly
constrains CONTENT to *not* contain a DTD.

The only other option that I can see is to invent a third, non-standard
XMLOPTION value, ANY. ANY would accept anything accepted by either DOCUMENT or
CONTENT, but no more than that.

best regards,
Florian Pflug
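
The superset argument in this message can be illustrated with a small toy model. The following is a minimal Python sketch, assuming only the two rules stated above (CONTENT forbids DTDs, DOCUMENT forbids multiple root nodes). The function names `is_content`, `is_document` and `is_any` are hypothetical, the DTD detection is a crude regular expression that assumes no internal subset, and none of this reflects how PostgreSQL's XML input functions are actually implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helpers: toy rules taken from the message, not from PostgreSQL.
_LEADING_DTD = re.compile(r"\s*<!DOCTYPE\s[^>]*>")  # assumes no internal subset


def _parse_fragment(text):
    """Parse text as a sequence of nodes by wrapping it in a dummy root."""
    if "<!DOCTYPE" in text:
        return None  # in this model a DTD is never allowed inside the sequence
    try:
        return ET.fromstring("<wrapper>" + text + "</wrapper>")
    except ET.ParseError:
        return None


def is_content(text):
    """CONTENT: any well-formed node sequence, but no DTD."""
    return _parse_fragment(text) is not None


def is_document(text):
    """DOCUMENT: optional DTD at the start, then exactly one root element."""
    match = _LEADING_DTD.match(text)
    body = text[match.end():] if match else text
    wrapper = _parse_fragment(body)
    if wrapper is None or len(wrapper) != 1:
        return False
    # No character data is allowed outside the single root element.
    stray = (wrapper.text or "") + (wrapper[0].tail or "")
    return stray.strip() == ""


def is_any(text):
    """Proposed ANY: the union of DOCUMENT and CONTENT, nothing more."""
    return is_document(text) or is_content(text)


with_dtd = '<!DOCTYPE note SYSTEM "note.dtd"><note>hi</note>'
two_roots = "<a/><b/>"
```

In this model `with_dtd` passes only `is_document` and `two_roots` passes only `is_content`, so neither existing setting could reload a dump containing both values, while `is_any` accepts both and still rejects malformed input.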