Re: Encoding problems in PostgreSQL with XML data
From | Andrew Dunstan |
---|---|
Subject | Re: Encoding problems in PostgreSQL with XML data |
Date | |
Msg-id | 3FFF129E.6020109@dunslane.net |
In reply to | Re: Encoding problems in PostgreSQL with XML data ("Merlin Moncure" <merlin.moncure@rcsonline.com>) |
Responses |
Re: Encoding problems in PostgreSQL with XML data
|
List | pgsql-hackers |
Perhaps the document should be stored in canonical form. See
http://www.w3.org/TR/xml-c14n

I think I agree with Rod's opinion elsewhere in this thread. I guess the
"philosophical" question is this: If 2 XML documents with different
encodings have the same canonical form, or perhaps produce the same DOM,
are they equivalent? Merlin appears to want to say "no", and I think I
want to say "yes".

cheers

andrew

Merlin Moncure wrote:

>Peter Eisentraut wrote:
>
>>The central problem I have is this: How do we deal with the fact that
>>an XML datum carries its own encoding information?
>
>Maybe I am misunderstanding your question, but IMO postgres should be
>treating xml documents as if they were binary data, unless the server
>takes on the role of a parser, in which case it should handle
>unspecified/unknown encodings just like a normal xml parser would (and
>this does *not* include changing the encoding!).
>
>According to me, an XML parser should not change one bit of a document,
>because that is not a 'parse', but a 'transformation'.
>
>>Rewriting the <?xml?> declaration seems like a workable solution, but it
>>would break the transparency of the client/server encoding conversion.
>>Also, some people might dislike that their documents are being changed
>>as they are stored.
>
>Right, your example begs the question: why does the server care what the
>encoding of the documents is (perhaps indexing)? XML validation is a
>standardized operation which the server (or psql, I suppose) can
>subcontract out to another application.
>
>Just a side thought: what if the xml encoding type was built into the
>domain type itself?
>create domain xml_utf8 ...
>Which allows casting, etc. which is more natural than an implicit
>transformation.
>
>Regards,
>Merlin
>
>---------------------------(end of broadcast)---------------------------
>TIP 8: explain analyze is your friend
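
The question of whether two documents with different encodings share a canonical form can be tried out directly. The following is a minimal sketch, not anything proposed in the thread: it assumes Python 3.8 or later, whose `xml.etree.ElementTree.canonicalize` implements C14N 2.0 rather than the Canonical XML 1.0 document linked above, and the sample `<note>` document and the `canonical_form` helper are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the declared encoding (and therefore
# the bytes on disk) differs between the two variants built below.
DOC = '<?xml version="1.0" encoding="{enc}"?><note lang="fr">caf\u00e9</note>'

utf8_bytes = DOC.format(enc="UTF-8").encode("utf-8")
latin1_bytes = DOC.format(enc="ISO-8859-1").encode("iso-8859-1")


def canonical_form(raw: bytes) -> str:
    """Parse raw XML bytes and return their C14N 2.0 serialization as text."""
    # The parser reads the encoding from the <?xml?> declaration itself;
    # the canonical output carries no declaration at all.
    return ET.canonicalize(from_file=io.BytesIO(raw))


same_bytes = utf8_bytes == latin1_bytes
same_canonical = canonical_form(utf8_bytes) == canonical_form(latin1_bytes)

print("byte-identical:     ", same_bytes)
print("canonical-identical:", same_canonical)
```

Under this sketch the two inputs differ at the byte level yet canonicalize to the same text, which is the "yes" reading of equivalence. Treating the document as opaque binary data, as Merlin suggests, corresponds to comparing `utf8_bytes` and `latin1_bytes` directly.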