Re: Configuration Advice

Поиск

Список

Период

Сортировка

От	Arjen van der Meijden
Тема	Re: Configuration Advice
Дата	18 января 2007 г. 20:10:20
Msg-id	45AFE234.70301@tweakers.net обсуждение исходный текст
Ответ на	Re: Configuration Advice ("Jeremy Haile" <jhaile@fastmail.fm>)
Ответы	Re: Configuration Advice (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-performance

Дерево обсуждения

On 18-1-2007 18:28 Jeremy Haile wrote:
>> I once had a query which would operate on a recordlist and
>> see whether there were any gaps larger than 1 between consecutive
>> primary keys.
>
> Would you mind sharing the query you described?  I am attempting to do
> something similar now.


Well it was over a year ago, so I don't know what I did back then. But
since it was a query adjusted from what I did in MySQL there where no
subqueries involved, I think it was something like this:
select a.id, min(b.id)
  from
   members a
   join members b on a.id < b.id
   left join members c on a.id +1 = c.id
where c.id IS NULL
group by a.id;

Or rewriting it to this one halves the execution time though:

select a.id, min(b.id)
  from
   members a
   left join members c on a.id +1 = c.id
   join members b on a.id < b.id
where c.id IS NULL
group by a.id;

Although this query seems to be much faster with 150k records:

select aid, bid
from
(select a.id as aid, (select min(b.id) from members b where b.id > a.id)
as bid
  from
   members a
group by a.id) as foo
where bid > aid+1;

The first one takes about 16 seconds on my system with PG 8.2, the
second about 1.8 second. But back then the list was much shorter, so it
can have been the first one or a variant on that. On MySQL the first
takes much more than the 16 seconds PostgreSQL uses, and after editting
this e-mail it still isn't finished... The second one made EXPLAIN hang
in my 5.0.32-bk, so I didn't try that for real.

Best regards,

Arjen

PS, In case any of the planner-hackers are reading, here are the plans
of the first two queries, just to see if something can be done to
decrease the differences between them. The main differences seems to be
that groupaggregate vs the hashaggregate?

  GroupAggregate  (cost=34144.16..35144.38 rows=50011 width=8) (actual
time=17653.401..23881.320 rows=71 loops=1)
    ->  Sort  (cost=34144.16..34269.19 rows=50011 width=8) (actual
time=17519.274..21423.128 rows=7210521 loops=1)
          Sort Key: a.id
          ->  Nested Loop  (cost=11011.41..30240.81 rows=50011 width=8)
(actual time=184.412..10945.189 rows=7210521 loops=1)
                ->  Hash Left Join  (cost=11011.41..28739.98 rows=1
width=4) (actual time=184.384..1452.467 rows=72 loops=1)
                      Hash Cond: ((a.id + 1) = c.id)
                      Filter: (c.id IS NULL)
                      ->  Seq Scan on members a  (cost=0.00..9903.33
rows=150033 width=4) (actual time=0.009..71.463 rows=150033 loops=1)
                      ->  Hash  (cost=9903.33..9903.33 rows=150033
width=4) (actual time=146.040..146.040 rows=150033 loops=1)
                            ->  Seq Scan on members c
(cost=0.00..9903.33 rows=150033 width=4) (actual time=0.002..77.066
rows=150033 loops=1)
                ->  Index Scan using members_pkey on members b
(cost=0.00..875.69 rows=50011 width=4) (actual time=0.025..78.971
rows=100146 loops=72)
                      Index Cond: (a.id < b.id)
  Total runtime: 23882.511 ms
(13 rows)

  HashAggregate  (cost=30240.82..30240.83 rows=1 width=8) (actual
time=12870.440..12870.504 rows=71 loops=1)
    ->  Nested Loop  (cost=11011.41..30240.81 rows=1 width=8) (actual
time=168.658..9466.644 rows=7210521 loops=1)
          ->  Hash Left Join  (cost=11011.41..28739.98 rows=1 width=4)
(actual time=168.630..865.690 rows=72 loops=1)
                Hash Cond: ((a.id + 1) = c.id)
                Filter: (c.id IS NULL)
                ->  Seq Scan on members a  (cost=0.00..9903.33
rows=150033 width=4) (actual time=0.012..70.612 rows=150033 loops=1)
                ->  Hash  (cost=9903.33..9903.33 rows=150033 width=4)
(actual time=140.432..140.432 rows=150033 loops=1)
                      ->  Seq Scan on members c  (cost=0.00..9903.33
rows=150033 width=4) (actual time=0.003..76.709 rows=150033 loops=1)
          ->  Index Scan using members_pkey on members b
(cost=0.00..875.69 rows=50011 width=4) (actual time=0.023..73.317
rows=100146 loops=72)
                Index Cond: (a.id < b.id)
  Total runtime: 12870.756 ms
(11 rows)

В списке pgsql-performance по дате отправления:

Предыдущее

От: Scott Marlowe
Дата: 18 января 2007 г., 20:04:21
Сообщение: Re: Postgres and really huge tables

Следующее

От: Tom Lane
Дата: 18 января 2007 г., 20:30:24
Сообщение: Re: Autoanalyze settings with zero scale factor

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Configuration Advice

Предыдущее

Следующее