Re: Native XML
From | Andrew Dunstan |
---|---|
Subject | Re: Native XML |
Date | |
Msg-id | 4D6D4D15.9060206@dunslane.net |
In response to | Re: Native XML ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Responses | Re: Native XML |
List | pgsql-hackers |
On 03/01/2011 02:15 PM, Kevin Grittner wrote:
>
>>> Given that there were similar issues for other hierarchical data
>>> types, perhaps we need something similar to tsvector, but for
>>> hierarchical data. The extra layer of abstraction might not cost
>>> much when used for XML compared to the possible benefit with
>>> other data. It seems likely to be a very nice fit with GiST
>>> indexes.
>>>
>>> So under this idea, you would always have the text (or maybe byte
>>> array?) version of the XML, and you could "shard" it to a
>>> separate column for fast searches.
>
>> Tsearch should be able to handle XML now. It certainly knows how
>> to recognize XML tags.
>
> I apparently didn't express myself very well, since you seem to have
> *completely* missed my point. I know we can do tsearch2 searches
> against XML, or JSON, or YAML, or (insert next week's new favorite
> format here). What we can't currently do efficiently is search for
> particular values in some particular place in the hierarchy of a
> document. I've had loads of fun approximating it with regular
> expressions, but some days I'd like life to be easier.
>
> What I was arguing for is a new type which would represent the
> structure in a fashion which was independent of the particular text
> format and was efficient to traverse hierarchically. Done right,
> that would map well to GiST. Although, thinking about that some
> more, perhaps there would be a way to create a GiST index suitable
> for that straight from the XML text, and avoid the sharded column.
> A GiST index actually seems pretty close to what such a structure
> would look like anyway....
>

I probably didn't read your suggestion closely enough.

I think hierarchical data really only scratches the surface of the problem. It would be nice to be able to specify all sorts of context for searches:

  * foo after bar
  * foo near bar
  * foo and bar in the same paragraph
  * foo as a parent/child/ancestor/descendent/sibling/cousin of bar

cheers

andrew
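
For concreteness, here is a rough sketch of what a few of the search contexts listed above might mean when evaluated against a parsed document. It is only an illustration of the semantics, using Python's standard `xml.etree.ElementTree`; the sample document, the `<p>` paragraph element, and the helper names (`same_paragraph`, `after`, `near`, `has_descendant`) are all assumptions made for this example, and it treats "foo" and "bar" as words for the first three predicates and as element tags for the last. It is a linear scan, so it says nothing about how a GiST index would answer such queries efficiently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """<doc>
  <section>
    <p>foo comes before bar here</p>
    <p>only foo here</p>
  </section>
  <section>
    <p>bar alone</p>
  </section>
</doc>"""


def words(elem):
    # Whitespace-separated words in the element's text, descendants included.
    return "".join(elem.itertext()).split()


def same_paragraph(root, a, b, tag="p"):
    # Paragraph elements whose text contains both words.
    return [p for p in root.iter(tag) if a in words(p) and b in words(p)]


def after(root, a, b, tag="p"):
    # Paragraphs in which some occurrence of `a` comes after an occurrence of `b`.
    hits = []
    for p in root.iter(tag):
        w = words(p)
        if b in w and a in w[w.index(b) + 1:]:
            hits.append(p)
    return hits


def near(root, a, b, distance, tag="p"):
    # Paragraphs in which `a` and `b` occur at most `distance` words apart.
    hits = []
    for p in root.iter(tag):
        w = words(p)
        pos_a = [i for i, x in enumerate(w) if x == a]
        pos_b = [i for i, x in enumerate(w) if x == b]
        if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
            hits.append(p)
    return hits


def has_descendant(root, ancestor_tag, descendant_tag):
    # Elements tagged `ancestor_tag` with at least one strict descendant
    # tagged `descendant_tag`.
    return [e for e in root.iter(ancestor_tag)
            if any(d is not e for d in e.iter(descendant_tag))]


root = ET.fromstring(DOC)
```

With `root` parsed as above, `same_paragraph(root, "foo", "bar")` picks out paragraphs containing both words, and `has_descendant(root, "section", "p")` picks out sections that contain a paragraph. Sibling and cousin relationships would need parent links, which ElementTree does not keep, so they are left out of this sketch.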