Re: Review of patch Bugfix for XPATH() if expression returns a scalar value

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: Review of patch Bugfix for XPATH() if expression returns a scalar value
Дата
Msg-id 6A2C047F-7EE4-430D-9813-A1B215D81D56@phlo.org
обсуждение исходный текст
Ответ на Review of patch Bugfix for XPATH() if expression returns a scalar value  (Radosław Smogura <rsmogura@softperience.eu>)
Ответы Re: Review of patch Bugfix for XPATH() if expression returns a scalar value  (Radosław Smogura <rsmogura@softperience.eu>)
Список pgsql-hackers
On Jun29, 2011, at 19:57 , Radosław Smogura wrote:
> This is review of patch
> https://commitfest.postgresql.org/action/patch_view?id=565
> "Bugfix for XPATH() if expression returns a scalar value"
>
> SELECT XMLELEMENT(name root, XMLATTRIBUTES(foo.namespace AS sth)) FROM (SELECT
> (XPATH('namespace-uri(/*)', x))[1] AS namespace FROM (VALUES (XMLELEMENT(name
> "root", XMLATTRIBUTES('<n' AS xmlns, '<v' AS value),'<t'))) v(x)) as foo;
>
>       xmlelement
> -------------------------
> <root sth="&lt;n"/>
>
>     It's clearly visible that value from attribute is "<n", not "<". Every
> parser will read this as "<n" which is not-natural and will require form
> consumer/developer to de-escape this on his side - roughly speaking this will
> be reported as serious bug.

There's a problem there, no doubt, but your proposed solution just moves the
problem around.

Here's your query, reformatted to be more legible

SELECT XMLELEMENT(name root,            XMLATTRIBUTES(foo.namespace AS sth) )
FROM ( SELECT    (XPATH('namespace-uri(/*)', x))[1] AS namespace FROM (VALUES (XMLELEMENT(name "root",
XMLATTRIBUTES('<n'AS xmlns,                             '<v' AS value),'<t') ) 
) v(x)) as foo;

What happens is that the XPATH expression selects the xmlns
attribute with the value '<n'. To be well-formed xml, that value
must be escaped, so what is actually returned by the XPATH
call is '<n'. The XMLATTRIBUTES() function, however, doesn't
distinguish between input of type XML and input of type TEXT,
so it figures it has to represent the *textual* value '<n'
in xml, and produces '&lt;n'.

Not escaping textual values returned by XPATH isn't a solution,
though. For example, assume someone inserts the value returned
by the XPATH() call in your query into a column of type XML.
If XPATH() returned '<n' literally instead of escaping it,
the column would contain non-well-formed XML in a column of type
XML. Not pretty, and pretty fatal during your next dump/reload
cycle.

Or, if you want another example, simply remove the XMLATTRIBUTES
call and use the value returned by XPATH as a child node directly.

SELECT XMLELEMENT(name root,            foo.namespace )
FROM ( SELECT    (XPATH('namespace-uri(/*)', x))[1] AS namespace FROM (VALUES (XMLELEMENT(name "root",
XMLATTRIBUTES('<n'AS xmlns,                             '<v' AS value),'<t') ) 
) v(x)) as foo;
    xmlelement
--------------------<root><n</root>

Note that this correct! The <root> node contains a text node
representing the textual value '<n'. If XPATH() hadn't return
the value escaped, the query above would have produces  <root><n</root>

which is obviously wrong. It works in this case because XMLELEMENT
is smart enough to distinguish between child nodes gives as TEXT
values (those are escaped) and child nodes provided as XML values
(those are inserted unchanged).

XMLATTRIBUTES, however, doesn't make that distinction, and simply
escaped all attributes values independent of their type. Now, the
reason it does that is probably that *not* all XML values are
well-formed attribute values without additional escaping. For example,
you cannot have literal '<' and '>' in attribute values. So if
XMLATTRIBUTES, like XMLELEMENT never escaped values which are
already of type XML, the following piece of code
 XMLELEMENT(name "a", XMLATTRIBUTES('<node/>'::xml as v))

would produce
 <a v="<node/>"/>

which, alas, is not well-formed :-(

The correct solution, I believe, is to teach XMLATTRIBUTES to
only escape those characters which are invalid in attribute
values but valid in XML (Probably only '<' and '>' but I'll
check). If there are no objects, I'll prepare a separate patch
for that. I don't want to include it in this patch because it's
really a distinct issue (even modifies a different function).

best regards,
Florian Pflug






В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: serializable installcheck
Следующее
От: Florian Pflug
Дата:
Сообщение: Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected