Re: how to make duplicate finding query faster?
From | Sachin Kumar
Subject | Re: how to make duplicate finding query faster?
Date |
Msg-id | CALg-PKB92uV1v_R2JTLsr27xUdJaEru-b=Frig_YC=jy9L3X6A@mail.gmail.com
In reply to | Re: how to make duplicate finding query faster? (Scott Ribe <scott_ribe@elevated-dev.com>)
Responses | Re: how to make duplicate finding query faster?
List | pgsql-admin
Hi Scott,

Yes, I am checking one by one because my goal is to fail the whole upload if there is any duplicate entry, and to inform the user that their file contains a duplicate.
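
Since any single duplicate should fail the whole file, the per-row exists() round trips are the bottleneck; one set-based check can cover all 600k rows. A minimal sketch, assuming the Card_Bank model and ACCOUNT_NUMBER field from the snippet quoted below; the CSV path and layout are illustrative:

import csv

from myapp.models import Card_Bank  # hypothetical app path

def upload_is_clean(csv_path):
    """Return True only if the CSV has no duplicates, within itself or vs. the DB."""
    with open(csv_path, newline="") as f:
        card_numbers = [row[0] for row in csv.reader(f)]

    # Duplicates within the file itself also fail the upload.
    if len(card_numbers) != len(set(card_numbers)):
        return False

    # One indexed query instead of one query per row. For very large files,
    # chunk the list (e.g. 10k values per IN clause) or stage it in a temp table.
    return not Card_Bank.objects.filter(ACCOUNT_NUMBER__in=card_numbers).exists()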
Regards
Sachin

On Wed, Dec 30, 2020 at 6:43 PM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> On Dec 30, 2020, at 12:36 AM, Sachin Kumar <sachinkumaras@gmail.com> wrote:
>
>> Hi All,
>>
>> I am uploading data into PostgreSQL from a CSV file and checking whether any value already exists in the DB; if so, it should return a duplicate error. I am using the query below:
>>
>> if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
>>     flag = 2
>> else:
>>     flag = 1
>>
>> It is taking too much time; the CSV contains 600k cards.
>>
>> Kindly help me make the query faster.
>>
>> I am using Python, Django & PostgreSQL.
>> --
>>
>> Best Regards,
>> Sachin Kumar
>
> Are you checking one-by-one because your goal is not to fail the whole upload that contains the duplicates, but rather to skip only the duplicates?
>
> If that's the case, I think you'd be better off copying the CSV straight into a temp table, using a join to delete the duplicates from it, inserting the remainder into the target table, and finally dropping the temp table.
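
For reference, a minimal sketch of that temp-table flow using psycopg2 (the driver Django uses for PostgreSQL); the card_bank(account_number) table, the CSV layout, and the DSN are illustrative assumptions, not details from the thread:

import psycopg2

def load_skipping_duplicates(csv_path, dsn="dbname=cards"):
    # The connection context manager wraps everything in one transaction.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Stage the CSV: one COPY is far faster than 600k single-row checks.
        cur.execute("CREATE TEMP TABLE staging (account_number text) ON COMMIT DROP")
        with open(csv_path) as f:
            cur.copy_expert("COPY staging (account_number) FROM STDIN WITH (FORMAT csv)", f)
        # Join-delete the rows that already exist in the target table.
        cur.execute("""
            DELETE FROM staging s
            USING card_bank c
            WHERE c.account_number = s.account_number
        """)
        # Insert what is left; DISTINCT also drops duplicates within the file.
        cur.execute("""
            INSERT INTO card_bank (account_number)
            SELECT DISTINCT account_number FROM staging
        """)
        # ON COMMIT DROP removes the temp table automatically at commit.

Staging with COPY turns 600k round trips into three statements, which is where the speedup comes from.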
Best Regards,
Sachin Kumar