Re: identifying duplicates in table with redundancies
От | Oliveiros d'Azevedo Cristina |
---|---|
Тема | Re: identifying duplicates in table with redundancies |
Дата | |
Msg-id | 547760C5672F498ABD458B4B277DE370@marktestcr.marktest.pt обсуждение исходный текст |
Ответ на | identifying duplicates in table with redundancies (Tarlika Elisabeth Schmitz <postgresql@numerixtechnology.de>) |
Список | pgsql-sql |
> > Hallo Andreas, > I reduced the problem to the innermost query: > > 1) SELECT DISTINCT trainer_id, trainer_name FROM student > This results in a sequential table scan. Execution time 7500ms. > > 2) I created an INDEX ON (trainer_id, trainer_name). Then I had an index > scan instead, which still cost 7000ms. > > 3) Next, I changed from DISTINCT to GROUP BY: > SELECT trainer_id, trainer_name FROM student > GROUP BY trainer_id, trainer_name > This resulted in an index scan @ 6750ms > > 4) I filtered out NULL trainer_ids > WHERE trainer_id IS NOT NULL > Amazingly, this resulted in a sequential table scan, which only took > 1300ms!! > > Please, explain (pun not intended)! How can this be. Only 11000/250000 > rows have a null trainer_id. > That's an impressive improvement... Personally I have no idea what caused it, specially when you say it was sequential :-| Warmed caches ? Best, Oliver
В списке pgsql-sql по дате отправления: