Re: PostgreSQL vs SQL/XML Standards

From	Chapman Flack
Subject	Re: PostgreSQL vs SQL/XML Standards
Date	February 11, 2019, 15:51:25
Msg-id	3e8eab9e-7289-6c23-5e2c-153cccea2257@anastigmatix.net
In reply to	Re: PostgreSQL vs SQL/XML Standards (Chapman Flack <chap@anastigmatix.net>)
Responses	Re: PostgreSQL vs SQL/XML Standards Re: PostgreSQL vs SQL/XML Standards
List	pgsql-hackers

[Resending to list so commitfest app will see it; the list blocked
this message the first time on a mail reputation issue. Sorry for
the duplication. I've removed the individual cc:s from this message.]

On 02/05/19 23:16, Chapman Flack wrote:
> I wonder whether, given the move to next CF, it makes sense to change
> the title of the CF entry from "XMLTABLE" to, more generically, XML
> improvements, and get one or two more small changes in:

Interpreting the crickets as approval, I have changed the title of the
CF entry, and the status back to Needs Review, with these patches
attached:

xmltable-xpath-result-processing-bugfix-6.patch
xmltable-xmlexists-passing-mechanisms-3.patch
xml-functions-type-docfix-2.patch
xml-content-2006-1.patch

That last one is new, and everything is rebased (onto 068503c).

xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
output for the no-libxml case that was left out of -5.

xml-functions-type-docfix-2.patch removes one more sentence I had meant
to remove[1] but had forgotten to.

xml-content-2006-1.patch does this:

> - get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to
>   succeed even for content with DTDs, so that the content subtype really
>   does fully include the document subtype, aligning it with the SQL:2006+
>   standard. I think this would be a simple patch that I can deliver early
>   this month, and Tom found reports where the current behavior already
>   bites people in pg_restore. Its only effect would be to allow a currently-
>   failing case to succeed (and stop biting people).

It works as suggested in [2], just by intercepting the error if a
parse-as-content trips over a DTD, and retrying as a parse-as-document.
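
The retry strategy can be sketched outside PostgreSQL. The following is a
minimal illustration in Python, not the patch's C code: it uses the standard
library's ElementTree parser in place of libxml2, and the helper name
parse_content_with_fallback is hypothetical. One difference to note: the
sketch retries on any parse error, whereas the patch as described retries
only when the content parse trips over a DTD.

```python
import xml.etree.ElementTree as ET


def parse_content_with_fallback(text):
    """Return 'content' or 'document', whichever parse succeeded.

    Hypothetical helper for illustration; this is not PostgreSQL's xml_parse.
    """
    try:
        # Parse as content: wrap the input in a dummy root element so that
        # several top-level elements and bare text are accepted.
        ET.fromstring("<dummy>" + text + "</dummy>")
        return "content"
    except ET.ParseError:
        # A DTD is not legal inside an element, so the wrapped parse fails.
        # Retry the unwrapped input as a document; if this also fails, the
        # error propagates to the caller.
        ET.fromstring(text)
        return "document"
```

The common case (content without a DTD) pays for one parse only; the second
parse happens just on the failure path.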

While that has a certain hacky smell, it also has the advantage of
handling what's probably an uncommon edge case in a way that adds no
upfront cost. (Other, 'tidier' approaches could involve evaluating a
regex first to decide how to parse--I believe everything that's allowed
ahead of a DTD makes a regular language--but that would add cycles to
every parse.)
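
For comparison, the regex pre-check mentioned above might look roughly like
the following Python sketch. The pattern and the name starts_with_dtd are
assumptions made for illustration; nothing like this is in the patch. It
accepts whitespace, comments, and processing instructions (the XML
declaration has the same shape as a processing instruction) ahead of
"<!DOCTYPE", and is slightly more permissive than the XML grammar in that it
does not require the XML declaration to come first.

```python
import re

# Everything that may precede a DTD, followed by the DTD's opening token.
# Each branch is written so that it cannot run past its own terminator,
# which keeps matching linear and avoids false positives.
_DOCTYPE_AHEAD = re.compile(
    r"""\A
    (?: [ \t\r\n]                    # one whitespace character
      | <!--(?:[^-]|-[^-])*-->       # comment
      | <\?(?:[^?]|\?(?!>))*\?>      # processing instruction or XML declaration
    )*
    <!DOCTYPE
    """,
    re.VERBOSE,
)


def starts_with_dtd(text):
    """Return True if only prolog material precedes a DOCTYPE declaration."""
    return _DOCTYPE_AHEAD.match(text) is not None
```

Running such a check on every input is the upfront cost that the retry
approach avoids.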

In xml.c one does find the following comment:

 * TODO maybe libxml2's xmlreader is better? (do not construct DOM,
 * yet do not use SAX - see xmlreader.c)

and yes, I think a complete rewrite of xml_parse along those lines would
probably be a substantial win (why construct an internal DOM just to confirm
that the input is parsable, then throw it away?). But that would be a more
involved rewrite that I'm not volunteering to do.
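
To illustrate the point about not building a DOM: a well-formedness check
needs only a parser that reports errors, not a tree. The sketch below uses
Python's expat binding as a stand-in. This is a swapped-in technique: expat
is an event-driven (SAX-style) parser, not libxml2's xmlreader pull API, and
the function name is hypothetical.

```python
import xml.parsers.expat


def is_well_formed_document(text):
    """Check well-formedness with a streaming parser; no tree is built."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # No handlers are registered, so parse events are simply discarded.
        parser.Parse(text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False
```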

This patch is a quick way to get the desired behavior given the current
implementation.

-Chap

[1]
https://www.postgresql.org/message-id/5C4A94A5.8010402%40anastigmatix.net
[2]
https://www.postgresql.org/message-id/5C4BDBFF.6040905%40anastigmatix.net
