BUG #6375: tsearch does not recognize all valid emails
From | valgog@gmail.com |
---|---|
Subject | BUG #6375: tsearch does not recognize all valid emails |
Date | |
Msg-id | E1Ri8il-0008Ct-9p@wrigleys.postgresql.org |
Responses | Re: BUG #6375: tsearch does not recognize all valid emails |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 6375
Logged by: Valentine Gogichashvili
Email address: valgog@gmail.com
PostgreSQL version: 9.1.1
Operating system: Debian 4.4.5-8
Description:

Hello,

The default tsearch parser does not recognize all valid email addresses; it tokenizes them as text, splitting them into tokens.

For example:

    postgres=# select to_tsquery('simple', 'normal@email.com' );
         to_tsquery
    ────────────────────
     'normal@email.com'
    (1 row)

Here it behaves correctly.

    postgres=# select to_tsquery('simple', '-still-normal@email.com' );
            to_tsquery
    ──────────────────────────
     'still-normal@email.com'
    (1 row)

Here it trims the '-' from the beginning of the email. This is not correct, but the query will at least find that email.

    postgres=# select to_tsquery('simple', '-not-normal-with-dash-@email.com' );
                                    to_tsquery
    ───────────────────────────────────────────────────────────────────────────
     'not-normal-with-dash' & 'not' & 'normal' & 'with' & 'dash' & 'email.com'
    (1 row)

This is a real problem, because it leads to matching emails that are not the same address but are "super-sets" of it.

Other valid email characters that are not treated correctly include at least '+' and '.'.

With my best regards,

-- Valentine Gogichashvili
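The report's premise, that addresses with a leading or trailing '-', or with '+' and '.' in the local part, are valid, can be checked against the unquoted "dot-atom" form defined in RFC 5322. The following is a minimal sketch of such a check, not part of PostgreSQL or of the original report: the helper name `is_dot_atom_email` is hypothetical, quoted local parts are ignored, and the domain check is a deliberately simplified assumption.

```python
import re

# "atext" characters allowed in an unquoted local part (RFC 5322, section 3.2.3).
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# dot-atom: one or more runs of atext separated by single dots.
LOCAL_PART = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")

# Simplified domain check (an assumption of this sketch, not the full RFC grammar):
# at least two dot-separated labels made of letters, digits and hyphens.
DOMAIN = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def is_dot_atom_email(address: str) -> bool:
    """Return True if address has a dot-atom local part and a plausible domain."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return False  # no '@' at all
    return bool(LOCAL_PART.fullmatch(local)) and bool(DOMAIN.fullmatch(domain))


if __name__ == "__main__":
    # The three addresses from the report, plus one using '+' and '.'.
    for addr in (
        "normal@email.com",
        "-still-normal@email.com",
        "-not-normal-with-dash-@email.com",
        "first.last+tag@email.com",
    ):
        print(addr, is_dot_atom_email(addr))
```

Under these assumptions all three addresses from the report pass the check, which supports the view that the parser, not the input, is at fault.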