Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores
From | Tom Lane |
---|---|
Subject | Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores |
Date | |
Msg-id | 20643.1268443116@sss.pgh.pa.us |
In reply to | Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores
Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores |
List | pgsql-hackers |
Bruce Momjian <bruce@momjian.us> writes:
> Well, I think the big question is whether we need to honor RFC 5322
> (http://www.rfc-editor.org/rfc/rfc5322.txt). Wikipedia says these are
> all valid characters:

> http://en.wikipedia.org/wiki/E-mail_address

> * Uppercase and lowercase English letters (a-z, A-Z)
> * Digits 0 to 9
> * Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
> * Character . (dot, period, full stop) provided that it is not the
> first or last character, and provided also that it does not appear two
> or more times consecutively.

That's an awful lot of special characters. For the RFC's purposes,
it's not hard to be flexible because in an email message there is
external context telling where to expect an address. I think if we
tried to allow all of those in email addresses in tsearch, we'd have
"email addresses" gobbling up a whole lot of adjacent text, to nobody's
benefit.

I can see the case for adding "+" because that's fairly common as
Alvaro notes, but I think we should be very circumspect about going
farther.

regards, tom lane
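
For illustration only, here is a minimal Python sketch of the "gobbling" concern described above. It is not the tsearch parser, which is a hand-written state machine rather than a regular expression; the two patterns, the `first_email` helper, and the sample text are all hypothetical. The conservative pattern assumes a local part limited to letters, digits, dot, underscore, hyphen, and "+"; the permissive one accepts every special character from the list quoted above.

```python
import re

# Hypothetical patterns for illustration; the real tsearch parser is a
# state machine and does not use these expressions.
DOMAIN = r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"

# Assumed conservative local part: letters, digits, dot, underscore, plus, hyphen.
CONSERVATIVE = re.compile(r"[A-Za-z0-9._+-]+@" + DOMAIN)

# Permissive local part: all special characters from the quoted list, plus dot.
PERMISSIVE = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~.-]+@" + DOMAIN)


def first_email(pattern, text):
    """Return the first substring the pattern treats as an email address, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


# Free text where an address is preceded by punctuation-heavy content and
# there is no external context saying where the address begins.
text = "rate=$5/hour|contact=bob_smith+pg@example.com for details"

print(first_email(CONSERVATIVE, text))  # bob_smith+pg@example.com
print(first_email(PERMISSIVE, text))    # rate=$5/hour|contact=bob_smith+pg@example.com
```

With the permissive character set, everything from the start of the line up to the "@" is swallowed into the "address", because "=", "$", "/" and "|" all count as valid local-part characters. Adding only "+" (and, per the original bug report, "_") keeps the token boundary where a reader would expect it.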