Re: Out of memory error during large hashagg
От | Casey Duncan |
---|---|
Тема | Re: Out of memory error during large hashagg |
Дата | |
Msg-id | 8E8D36A0-61E4-46D1-BE1B-789E41D6CD4A@pandora.com обсуждение исходный текст |
Ответ на | Re: Out of memory error during large hashagg (Simon Riggs <simon@2ndquadrant.com>) |
Список | pgsql-bugs |
On Sep 19, 2006, at 1:51 AM, Simon Riggs wrote: > On Mon, 2006-09-18 at 14:08 -0700, Casey Duncan wrote: >> I've reported variants of this in the past, but this case is entirely >> repeatable. >> >> Executing this query: >> >> select st_id, min(seed_id) as "initial_seed_id", count(*) as >> "seed_count" >> from seed group by st_id; >> >> The query plan and table stats are: >> >> QUERY PLAN >> --------------------------------------------------------------------- >> -- >> HashAggregate (cost=1362694.83..1365164.68 rows=164656 width=16) >> -> Seq Scan on seed (cost=0.00..964065.62 rows=53150562 >> width=16) >> >> >> relname | relpages | reltuples >> ---------+----------+------------- >> seed | 428880 | 5.26984e+07 >> > > The hashagg doesn't yet have scroll to disk capability, so a bad > estimation of ndistinct will cause this to fail (at any release). This > is a known issue for certain distributions of data only. The > workaround > is the one you know about already: enable_hashagg = off Ok, is that something for the TODO list? I took a glance and didn't see it. > > I'm interested in collecting info on the distribution of data. > Can you post: > > select tablename, attname, n_distinct from pg_stats > where attname = 'st_id'; tablename | attname | n_distinct --------------+---------+------------ st | st_id | -1 seed | st_id | 164656 feed | st_id | 14250 book | st_id | 14856 legacy_st | st_id | -1 (5 rows) I ran analyze after this, but the results were roughly the same. > select count(distinct st_id) from seed; count ---------- 40418083 (1 row) Looks a tad bit different than the above ;^) > and also the table definition, including the PK Table "public.seed" Column | Type | Modifiers --------------+-----------------------------+--------------- seed_id | bigint | not null mc_id | character varying(20) | st_id | bigint | date_created | timestamp without time zone | default now() Indexes: "seed_pkey" PRIMARY KEY, btree (seed_id) "seed_st_mc_id_idx" UNIQUE, btree (mc_id, st_id) "seed_mc_id_idx" btree (mc_id) "seed_st_id" btree (st_id) Foreign-key constraints: "seed_st_id_fkey" FOREIGN KEY (st_id) REFERENCES st(st_id) ON DELETE RESTRICT Thanks -Casey
В списке pgsql-bugs по дате отправления: