Обсуждение: Parallel Scan Bug: invalid attnum: 0
running pg 9.6.1 on AWS
max_parallel_workers_per_gather = 4
This only occurs in some schemas, not all. The affected schemas have
approx 1 million rows in the table. Much larger schemas (65 million rows)
and smaller schemas(100K rows) do not have the problem.
table definition:
CREATE TABLE ro_summaries (
day date NOT NULL,
entry_firm varchar(15) NOT NULL,
route_firm varchar(15) NOT NULL,
order_type varchar(3) NOT NULL,
symbol varchar(15) NOT NULL,
bucket_id int4 NOT NULL,
entry_status varchar(2) NOT NULL,
covered_order bool NOT NULL,
orders_sent int4 NOT NULL DEFAULT 0,
shares_sent int4 NOT NULL DEFAULT 0,
orders_pos int4 NOT NULL DEFAULT 0,
orders_zero int4 NOT NULL DEFAULT 0,
orders_neg int4 NOT NULL DEFAULT 0,
orders_canceled int4 NOT NULL DEFAULT 0,
orders_null int4 NOT NULL DEFAULT 0,
shares_pos int4 NOT NULL DEFAULT 0,
shares_zero int4 NOT NULL DEFAULT 0,
shares_neg int4 NOT NULL DEFAULT 0,
shares_canceled int4 NOT NULL DEFAULT 0,
shares_null int4 NOT NULL DEFAULT 0,
pi_pos numeric NOT NULL DEFAULT 0,
pi_neg numeric NOT NULL DEFAULT 0,
efq numeric NULL,
effective_spread numeric NULL,
quoted_spread numeric NULL,
realized_spread numeric NULL,
speed numeric NULL,
part1_shares int4 NULL,
trades_total int4 NULL,
orders_filled int4 NULL DEFAULT 0,
efq_shares int4 NULL,
exchange varchar(10) NULL,
notional_value numeric NULL
)
WITHOUT OIDS
TABLESPACE pg_default
GO
CREATE UNIQUE INDEX rosum_ux01
ON ro_summaries USING btree (day date_ops, order_type text_ops, bucket_id
int4_ops, route_firm text_ops, entry_firm text_ops, symbol text_ops,
entry_status text_ops, covered_order bool_ops)
view definition:
CREATE OR REPLACE VIEW ro_summaries_v
AS
SELECT ro_summaries.day,
'S3'::character varying AS source_code,
ro_summaries.entry_firm,
ro_summaries.route_firm,
ro_summaries.order_type,
ro_summaries.symbol,
ro_summaries.bucket_id,
ro_summaries.entry_status,
ro_summaries.covered_order,
ro_summaries.orders_sent,
ro_summaries.shares_sent,
(((ro_summaries.shares_pos + ro_summaries.shares_neg) +
ro_summaries.shares_zero) + ro_summaries.shares_null) AS shares_total,
(ro_summaries.shares_pos + ro_summaries.shares_zero) AS
shares_atorbetter,
ro_summaries.shares_pos,
ro_summaries.shares_neg,
ro_summaries.trades_total,
ro_summaries.pi_pos,
ro_summaries.pi_neg,
ro_summaries.efq,
ro_summaries.effective_spread,
ro_summaries.quoted_spread,
ro_summaries.realized_spread,
ro_summaries.speed,
ro_summaries.part1_shares,
ro_summaries.orders_filled,
ro_summaries.efq_shares,
ro_summaries.exchange,
ro_summaries.notional_value
FROM msco_data.ro_summaries;
offending query:
select
count(*) OVER() AS full_count,
source_code,
sum(shares_atorbetter) shares_atorbetter,
'All ' day_display, 'All ' entry_firm, 'All ' route_firm, 'All ' symbol,
'All ' order_type,
0 bucket_id,
case when sum(shares_total) > 0 then
(sum(shares_atorbetter)::numeric/sum(shares_total)::numeric)*100 end
at_or_better,
case WHEN sum((shares_total)) > 0 THEN
(sum(shares_pos)::numeric/sum((shares_total))::numeric)::numeric*100 END
pi_percent, coalesce(sum(pi_pos),0) pi_amount,
case WHEN sum(shares_pos) > 0 THEN (sum(pi_pos)::numeric /
sum(shares_pos)::numeric)::numeric ELSE 0 END pi_per_unit,
case WHEN sum((shares_total)) > 0 THEN
(sum(shares_neg)::numeric/sum((shares_total))::numeric)::numeric*100 END
slippage_percent,
coalesce(sum(pi_neg),0) slippage_amount,
case WHEN sum((efq_shares)) > 0 THEN
sum((efq_shares)*effective_spread)::numeric/sum((efq_shares))::numeric END
effective_spread,
case WHEN sum((efq_shares)) > 0 THEN
sum((efq_shares)*quoted_spread)::numeric/sum((efq_shares))::numeric END
quoted_spread,
case WHEN sum((efq_shares)) > 0 THEN
sum((efq_shares)*realized_spread)::numeric/sum((efq_shares))::numeric END
realized_spread,
case when sum((efq_shares)*quoted_spread) > 0 THEN
(sum((efq_shares)*effective_spread)::numeric/sum((efq_shares)*quoted_spread)::numeric)
* 100 END EFQ,
sum(trades_total) trades, sum((shares_total)) shares,
sum(shares_pos) shares_pos, sum(shares_neg) shares_neg,
sum(orders_sent) orders, sum(shares_sent) ordered_shares,
sum(pi_pos + pi_neg) variance,
case when sum((shares_total)) > 0 THEN sum(pi_pos + pi_neg) /
sum((shares_total))::numeric END avg_variance,
case when sum(orders_sent) > 0 THEN sum(shares_sent) / sum(orders_sent) END
avg_shares_order,
case when sum(shares_sent) > 0 then
least((sum((shares_total))::numeric/sum(shares_sent)::numeric)::numeric*100,100)
END pct_shares_executed,
sum(notional_value) notional_value,
sum(orders_filled) orders_filled,
case when sum(part1_shares) > 0 then sum(part1_shares * speed)::numeric /
sum(part1_shares)::numeric end speed
from ro_summaries_v
where day = '2016-11-07'
group by source_code
ORDER BY day_display desc NULLS LAST
LIMIT 25 OFFSET 0;
Commenting out source_code (and the group by) allows the query to run.
Alternately, commenting out several of the case statements also allows it
to run.
In the meantime I've set max_parallel_workers_per_gather = 0 and the query
works fine.
Please let me know if you need any more information.
Steve Randall <srandall@s3.com> writes:
> running pg 9.6.1 on AWS
> max_parallel_workers_per_gather = 4
> This only occurs in some schemas, not all. The affected schemas have
> approx 1 million rows in the table. Much larger schemas (65 million rows)
> and smaller schemas(100K rows) do not have the problem.
FWIW, I failed to duplicate this from the information provided.
Can you get us EXPLAIN output for the troublesome case, or does it
fail before printing the EXPLAIN output? How about a stack trace
from the errfinish call?
regards, tom lane
I have been able to reproduce the error in my development environment. EXPLAIN output is attached. EXPLAIN ANALYZE fails with the error mentioned.
I have to do a little configuring to get the stack trace logs, but it will be a couple of weeks before they are ready. Tomorrow I head to San Francisco for a few days off prior to pgConf.
On Wed, Nov 9, 2016 at 12:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Steve Randall <srandall@s3.com> writes:
> running pg 9.6.1 on AWS
> max_parallel_workers_per_gather = 4
> This only occurs in some schemas, not all. The affected schemas have
> approx 1 million rows in the table. Much larger schemas (65 million rows)
> and smaller schemas(100K rows) do not have the problem.
FWIW, I failed to duplicate this from the information provided.
Can you get us EXPLAIN output for the troublesome case, or does it
fail before printing the EXPLAIN output? How about a stack trace
from the errfinish call?
regards, tom lane
Вложения
Steve Randall <srandall@s3.com> writes:
> I have been able to reproduce the error in my development environment.
> EXPLAIN output is attached. EXPLAIN ANALYZE fails with the error
> mentioned.
Ah, I've duplicated it. My previous attempt to reverse-engineer your
test case was not selecting a partial-aggregation plan.
Immediate impression is that the logic for planning partial grouped
aggregation did not get the zero-sort-keys case right. In my hands
it produces
TRAP: BadArgument("!(nkeys > 0)", File: "tuplesort.c", Line: 763)
2016-11-10 10:44:26.955 EST [14415] LOG: server process (PID 15726) was t=
erminated by signal 6: Aborted
but in a non-assert build of course you'd get some other misbehavior.
regards, tom lane
I wrote: > Immediate impression is that the logic for planning partial grouped > aggregation did not get the zero-sort-keys case right. Yeah: after flattening your view, the planner was left with GROUP BY 'S3'::character varying, which is a no-op, but it mistakenly inserted a no-op Sort into the plan to sort by that. One-liner fix is here if you need it right away: https://git.postgresql.org/gitweb/?p=3Dpostgresql.git;a=3Dcommitdiff;h=3D7= defc3b97a31537547053946808a83e7234d1b61 regards, tom lane
Thank you. On Thu, Nov 10, 2016 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > Immediate impression is that the logic for planning partial grouped > > aggregation did not get the zero-sort-keys case right. > > Yeah: after flattening your view, the planner was left with GROUP BY > 'S3'::character varying, which is a no-op, but it mistakenly inserted > a no-op Sort into the plan to sort by that. One-liner fix is here > if you need it right away: > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > 7defc3b97a31537547053946808a83e7234d1b61 > > > regards, tom lane >