Re: How to find double entries
From | Craig Ringer |
---|---|
Subject | Re: How to find double entries |
Date | |
Msg-id | 48062003.3050409@postnewspapers.com.au |
In response to | Re: How to find double entries (Vivek Khera <vivek@khera.org>) |
List | pgsql-sql |
Vivek Khera wrote:
>
> On Apr 15, 2008, at 11:23 PM, Tom Lane wrote:
>> What's really a duplicate sounds like a judgment call here, so you
>> probably shouldn't even think of automating it completely.
>
> I did a consulting gig about 10 years ago for a company that made
> software to normalize street addresses and names. Literally dozens of
> people worked there, and that was their primary software product. It is
> definitely not a trivial task, as the rules can be extremely complex.

From what little I've personally seen of others' address handling, some (many/most?) people who blindly advocate full normalisation of addresses either:

(a) only care about a rather restricted set of address types ("ordinary residential addresses in <my country>", though that can be bad enough); or

(b) don't know how horrible addressing is ... yet ... and are going to find out soon when their highly normalized addressing schema proves incapable of representing some address they've just been presented with.

Most probably fall into the second category.

Overly strict addressing, without the associated fairly extreme development effort to get it even vaguely right, seems to lead to users working around the broken addressing schema by entering bogus data.

Personally I'm content to provide lots of space for user-formatted addresses, only breaking out separate fields for the post code (Australian only), the city/suburb, the state, and the country - all stored as strings. The only DB level validation is a rule preventing the entry of invalid & undefined postcodes for Australian addresses, and preventing the entry of invalid Australian states. The app is used almost entirely with Australian addresses, and there's a definitive, up to date list of Australian post codes available from the postal services, so it's worth a little more checking to protect against basic typos and misunderstandings.
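As a rough illustration of the storage policy described above (free-form address text, a few string fields, and database-level checks that apply only to Australian rows), something like the following might do. It is a sketch under assumptions, not the schema of the application discussed: the table names `au_postcode` and `address`, the column names, the helper functions, and the sample rows are all hypothetical, and SQLite (via Python's `sqlite3`) stands in for PostgreSQL so that the example is self-contained. In PostgreSQL the same idea would more likely be expressed with a trigger or rule against the imported postcode list.

```python
import sqlite3

# Hypothetical schema: names and layout are illustrative only.
SCHEMA = """
CREATE TABLE au_postcode (
    postcode TEXT NOT NULL,
    suburb   TEXT NOT NULL,
    state    TEXT NOT NULL,
    PRIMARY KEY (postcode, suburb)
);

CREATE TABLE address (
    id        INTEGER PRIMARY KEY,
    free_text TEXT NOT NULL,  -- user-formatted address lines, stored as entered
    suburb    TEXT,
    state     TEXT,
    postcode  TEXT,
    country   TEXT NOT NULL,
    -- Only Australian rows are restricted to known state/territory codes.
    CHECK (country <> 'Australia'
           OR state IN ('ACT', 'NSW', 'NT', 'QLD', 'SA', 'TAS', 'VIC', 'WA'))
);

-- Reject Australian rows whose postcode is missing from the reference list.
-- This sketch covers INSERT only; a fuller version would also cover UPDATE.
CREATE TRIGGER address_au_postcode_check
BEFORE INSERT ON address
BEGIN
    SELECT RAISE(ABORT, 'unknown Australian postcode')
    WHERE NEW.country = 'Australia'
      AND NOT EXISTS (SELECT 1 FROM au_postcode
                      WHERE au_postcode.postcode = NEW.postcode);
END;
"""


def open_db():
    """Create an in-memory database holding the schema and two sample postcodes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO au_postcode (postcode, suburb, state) VALUES (?, ?, ?)",
        [("6000", "Perth", "WA"), ("2000", "Sydney", "NSW")],  # sample rows only
    )
    conn.commit()
    return conn


def try_insert(conn, free_text, suburb, state, postcode, country):
    """Return True if the database accepted the row, False if it was rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO address (free_text, suburb, state, postcode, country) "
                "VALUES (?, ?, ?, ?, ?)",
                (free_text, suburb, state, postcode, country),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, an Australian row with a listed postcode and a valid state is accepted, an Australian row with an unlisted postcode or an unknown state is rejected, and a non-Australian row passes with whatever strings the user typed.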
The app provides some more help at the UI level for users, such as automatically filling in the state and suburb if an Australian post code is entered. It'll warn you if you enter an unknown Australian suburb/city for an entry in Australia. For everything else I leave it to the user and to possible later validation and reporting.

I've had good results with this policy when working with other apps that need to handle addressing information, and I've had some truly horrible experiences with apps that try to be too strict in their address checking.

--
Craig Ringer
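The UI-level help mentioned in the message (auto-filling from a post code, warning rather than rejecting) could look something like the sketch below. Everything named here is hypothetical: `AU_POSTCODES` is a tiny made-up sample standing in for the postal service's list, and `AddressForm` and `assist` are illustrative names. Filling in the suburb only when the post code maps to exactly one suburb is an assumption of this sketch, not something the message specifies.

```python
from dataclasses import dataclass, field

# Hypothetical sample of a postcode list: postcode -> (state, suburbs).
AU_POSTCODES = {
    "6000": ("WA", ["Perth"]),
    "2000": ("NSW", ["Sydney", "The Rocks"]),
}


@dataclass
class AddressForm:
    free_text: str = ""
    suburb: str = ""
    state: str = ""
    postcode: str = ""
    country: str = "Australia"
    warnings: list = field(default_factory=list)


def assist(form):
    """Fill in what can be inferred and collect warnings; never reject input."""
    form.warnings.clear()
    if form.country != "Australia":
        return form  # non-Australian addresses are left entirely to the user

    entry = AU_POSTCODES.get(form.postcode)
    if entry is not None:
        state, suburbs = entry
        if not form.state:
            form.state = state
        # Assumption: only auto-fill the suburb when the choice is unambiguous.
        if not form.suburb and len(suburbs) == 1:
            form.suburb = suburbs[0]

    known_suburbs = {
        name.lower() for _, names in AU_POSTCODES.values() for name in names
    }
    if form.suburb and form.suburb.lower() not in known_suburbs:
        form.warnings.append(f"Unknown Australian suburb/city: {form.suburb}")
    return form
```

The point of the design is that `assist` only suggests and warns: the user can still save an address the helper does not recognise, and stricter checks are left to the database rule or to later reporting.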