Re: Duplicate data despite unique constraint
От | Adrian Klaver |
---|---|
Тема | Re: Duplicate data despite unique constraint |
Дата | |
Msg-id | 3481cafa-cf8c-4401-722f-38eed301f9bd@aklaver.com обсуждение исходный текст |
Ответ на | Duplicate data despite unique constraint (Jonas Tehler <jonas@tehler.se>) |
Ответы |
Re: Duplicate data despite unique constraint
|
Список | pgsql-general |
On 09/02/2016 04:32 AM, Jonas Tehler wrote: > > Hi, > > We’re running Postgresql 9.4.5 on Amazon RDS. One of our tables > looks something like this: > > CREATE TABLE users > ( > ... > email character varying(128) NOT NULL, > ... > CONSTRAINT users_email_key UNIQUE (email) > ) > > Despite this we have rows with very similar email values. I discovered > the problem when I tried to add a column and got the following error: > > ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR: could not > create unique index "users_email_key" > DETAIL: Key (email)=(xxx@yyy.com <mailto:xxx@yyy.com>) is duplicated. > : ALTER TABLE "users" ADD COLUMN "client_branch" character varying(255) > DEFAULT ‘beta' > > Now look at this: > > => select email from users where email = 'xxx@yyy.com <mailto:xxx@yyy.com>'; > email > --------------------------- > xxx@yyy.com <mailto:xxx@yyy.com> > (1 row) > > => select email from users where email LIKE 'xxx@yyy.com > <mailto:xxx@yyy.com>'; > email > --------------------------- > xxx@yyy.com <mailto:xxx@yyy.com> > xxx@yyy.com <mailto:xxx@yyy.com> > (2 rows) > > I have tried to compare the binary data in various ways, email::bytes, > md5(email), encode(email::bytea, 'hex’), char_length(email) and it all > looks the same for both rows. > > Any suggestions how I can discover the difference between the values and > how they could have been added without triggering the constraint? I know > that the values were added after the constraint was added. select ctid, email from users where email LIKE 'xxx@yyy.com'; https://www.postgresql.org/docs/9.5/static/ddl-system-columns.html "ctid The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier. The OID, or even better a user-defined serial number, should be used to identify logical rows. " Best guess is the INDEX on the column is corrupted and needs to be reindexed: https://www.postgresql.org/docs/9.5/static/sql-reindex.html > > The data was added from a Ruby on Rails app that also has unique > constraints on the email field and validation on the email format. > > / Jonas > > -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: