Re: fulltext parser strange behave
From | Andrew Dunstan |
---|---|
Subject | Re: fulltext parser strange behave |
Date | |
Msg-id | 4739FE1A.3090508@dunslane.net |
In response to | Re: fulltext parser strange behave (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> I've just been looking at the state machine in wparser_def.c. I think
>> the processing for entities is also a few bob short in the pound. It
>> recognises decimal numeric character references, but not hexadecimal
>> numeric character references. That's fairly silly since the HTML spec
>> specifically says the latter are "particularly useful". The rules for
>> named entities are also deficient w.r.t. digits, just like the case of
>> tags that Tom noticed. This isn't academic: HTML features a number of
>> named entities with digits in the name (sup2, frac14 for example).
>
>> In XML at least, legal names are defined by the following rules from the
>> spec:
>> ...
>> [A-Za-z:_][A-Za-z0-9:_.-]*
>
>> I suggest we use that or something very close to it as the rule for
>> names in these patterns.
>
> No objections here. Who wants to patch wparser_def?

I can get to it some time in the next week; I'm rather snowed under right now.

BTW, I'm also suspicious of the clause that allows <?xml ... it appears that it will allow <?xfoo and <?XFOO also, which seems quite odd, especially the latter.

cheers

andrew
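
To make the proposed rule concrete, here is a minimal Python sketch of the three kinds of entity reference discussed above: named entities using the quoted name pattern, decimal character references, and hexadecimal character references. It is only an illustration, not code from wparser_def.c, which is a hand-written C state machine. The identifiers `NAME`, `NAMED_ENTITY`, `DECIMAL_REF`, `HEX_REF` and `classify_entity` are hypothetical, and accepting an uppercase `X` in hex references is an assumption made here (XML itself allows only lowercase `x`).

```python
import re

# ASCII-only name rule quoted in the message above.
NAME = r"[A-Za-z:_][A-Za-z0-9:_.-]*"

# Hypothetical patterns for illustration; not taken from wparser_def.c.
NAMED_ENTITY = re.compile(rf"&{NAME};")          # e.g. &sup2; or &frac14;
DECIMAL_REF = re.compile(r"&#[0-9]+;")           # e.g. &#229;
# Assumption: accept both 'x' and 'X'; XML permits only lowercase 'x'.
HEX_REF = re.compile(r"&#[xX][0-9A-Fa-f]+;")     # e.g. &#xE5;


def classify_entity(text: str) -> str:
    """Return "named", "decimal" or "hex" if text is exactly one
    entity reference of that kind, otherwise "none"."""
    if NAMED_ENTITY.fullmatch(text):
        return "named"
    if DECIMAL_REF.fullmatch(text):
        return "decimal"
    if HEX_REF.fullmatch(text):
        return "hex"
    return "none"
```

With this rule, names containing digits such as `&sup2;` and `&frac14;` are accepted as named entities, while a name that starts with a digit (for example `&1abc;`) is rejected.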