Re: PostgreSQL vs SQL/XML Standards
From | Chapman Flack |
---|---|
Subject | Re: PostgreSQL vs SQL/XML Standards |
Date | |
Msg-id | 3e8eab9e-7289-6c23-5e2c-153cccea2257@anastigmatix.net |
In reply to | Re: PostgreSQL vs SQL/XML Standards (Chapman Flack <chap@anastigmatix.net>) |
Responses |
Re: PostgreSQL vs SQL/XML Standards
Re: PostgreSQL vs SQL/XML Standards |
List | pgsql-hackers |
[Resending to list so commitfest app will see it; the list blocked this
message the first time on a mail reputation issue. Sorry for the
duplication. I've removed the individual cc:s from this message.]

On 02/05/19 23:16, Chapman Flack wrote:
> I wonder whether, given the move to next CF, it makes sense to change
> the title of the CF entry from "XMLTABLE" to, more generically, XML
> improvements, and get one or two more small changes in:

Interpreting the crickets as approval, I have changed the title of the
CF entry, and the status back to Needs Review, with these patches attached:

xmltable-xpath-result-processing-bugfix-6.patch
xmltable-xmlexists-passing-mechanisms-3.patch
xml-functions-type-docfix-2.patch
xml-content-2006-1.patch

That last one is new, and everything is rebased (onto 068503c).

xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
output for the no-libxml case that was left out of -5.

xml-functions-type-docfix-2.patch removes one more sentence I had meant to
remove[1] but forgotten to.

xml-content-2006-1.patch does this:

> - get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to
> succeed even for content with DTDs, so that the content subtype really
> does fully include the document subtype, aligning it with the SQL:2006+
> standard. I think this would be a simple patch that I can deliver early
> this month, and Tom found reports where the current behavior already
> bites people in pg_restore. Its only effect would be to allow a currently-
> failing case to succeed (and stop biting people).

It works as suggested in [2], just by intercepting the error if a
parse-as-content trips over a DTD, and retrying as a parse-as-document.
While that has a certain hacky smell, it also has the advantage of handling
what's probably an uncommon edge case in a way that adds no upfront cost.
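
As a rough illustration of the retry idea only (the patch itself is C code in xml.c built on libxml2, and is not reproduced here), a minimal Python sketch might look like the following. The function name `parse_xml_content` and the wrapper-element trick used to emulate a content parse are assumptions made for this example. Unlike the patch, which retries only when the content parse trips over a DTD, the sketch retries on any content-parse failure.

```python
import xml.etree.ElementTree as ET


def parse_xml_content(text: str) -> str:
    """Check that `text` is acceptable as XML content.

    Returns "content" if it parses as a content fragment, or "document"
    if that fails but it parses as a complete document (for example,
    because it starts with a DTD). Raises ET.ParseError if both fail.
    """
    try:
        # Emulate parse-as-content (an assumption of this sketch): wrap the
        # fragment in a dummy element so that text nodes and multiple
        # top-level elements are accepted. A DOCTYPE inside an element is
        # not well-formed, so input carrying a DTD fails here.
        ET.fromstring("<wrapper>" + text + "</wrapper>")
        return "content"
    except ET.ParseError:
        # Common inputs never reach this branch, so the fallback adds no
        # cost to the ordinary case.
        pass

    # Retry as parse-as-document; any error here propagates to the caller.
    ET.fromstring(text)
    return "document"
```

With this sketch, `parse_xml_content("<a/>text<b/>")` takes the first path, while `parse_xml_content("<!DOCTYPE a><a/>")` fails the content parse and succeeds on the retry.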
(Other, 'tidier' approaches could involve evaluating a regex first to
decide how to parse--I believe everything that's allowed ahead of a DTD
makes a regular language--but that would add cycles to every parse.)

In xml.c one does find the following comment:

 * TODO maybe libxml2's xmlreader is better? (do not construct DOM,
 * yet do not use SAX - see xmlreader.c)

and yes, I think a complete rewrite of xml_parse along those lines would
probably be a substantial win (why construct an internal DOM just to
confirm that the input is parsable, then throw it away?). But that would
be a more involved rewrite that I'm not volunteering to do. This patch is
a quick way to get the desired behavior given the current implementation.

-Chap

[1] https://www.postgresql.org/message-id/5C4A94A5.8010402%40anastigmatix.net
[2] https://www.postgresql.org/message-id/5C4BDBFF.6040905%40anastigmatix.net