Re: [GENERAL] is (not) distinct from
От | Adrian Klaver |
---|---|
Тема | Re: [GENERAL] is (not) distinct from |
Дата | |
Msg-id | ccbfafd3-2f26-8f11-4a0b-ed7d4723c29e@aklaver.com обсуждение исходный текст |
Ответ на | Re: [GENERAL] is (not) distinct from (Johann Spies <johann.spies@gmail.com>) |
Список | pgsql-general |
On 03/07/2017 01:22 AM, Johann Spies wrote: > Thanks (again Adrian) and Tom. > > The situation was that I had a table with 731million records which I > wanted to copy into a partitioned one where there was a unique > constraint on the fields used in my query. > > The "backup" table was the single one. > > While inserting into the partitioned table from the backup one, several > (about 120000) records failed to insert. > > I wanted to find out which records were involved and found that some had > "'" characters in the values which broke some of the functions used to > do some calculations.. > > As there were fields that might have null values I have tried the "is > not distinct from". > > Both sides of the query had primary keys and I did not use group by. > That was why I used "distinct". Would it not be easier to use a LEFT JOIN between the original table(backup) and the new table: https://www.postgresql.org/docs/9.6/static/sql-select.html join_type "LEFT OUTER JOIN returns all rows in the qualified Cartesian product (i.e., all combined rows that pass its join condition), plus one copy of each row in the left-hand table for which there was no right-hand row that passed the join condition. This left-hand row is extended to the full width of the joined table by inserting null values for the right-hand columns. Note that only the JOIN clause's own condition is considered while deciding which rows have matches. Outer conditions are applied afterwards." So something like: SELECT a.pk FROM original_table AS a LEFT JOIN --The OUTER is not required new_table AS b ON a.pk = b.pk WHERE b.pk IS NULL That would show all the rows in the original table that where not transferred over. > > Anyhow in the end, I made some progress with a modified query: > > where > s.citing_article = A.citing_article > and > s.cited_article != A.cited_article > and > s.pubyear is distinct from A.pubyear > and > s.year_cited is distinct from A.year_cited > and > s.cited_author is distinct from A.cited_author > and > regexp_replace(s.cited_title, $$'$$, $$''$$,'g') is distinct from > regexp_replace(A.cited_title, $$'$$, $$''$$,'g') > and > regexp_replace(s.cited_work, $$'$$, $$''$$,'g') is distinct > from regexp_replace(A.cited_work, $$'$$, $$''$$,'g') > and > s.doi is distinct from A.doi > > Regards. > Johann > > -- > Because experiencing your loyal love is better than life itself, > my lips will praise you. (Psalm 63:3) -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: