> Sure, but that doesn't make the patch correct. The patch proposes
> that, when parallelism is in use, a subquery scan will produce fewer rows
> than when parallelism is not in use, and that's 100% false. Compare
> this with the case of a parallel sequential scan. If a table contains
> 1000 rows, and we scan it with a regular Seq Scan, the Seq Scan will
> return 1000 rows. But if we scan it with a Parallel Seq Scan using
> say 4 workers, the number of rows returned in each worker will be
> substantially less than 1000, because 1000 is now the *total* number
> of rows to be returned across *all* processes, and what we need is the
> number of rows returned in *each* process.
For now, the function cost_subqueryscan always uses the *total* row count,
even for a parallel path. For example:
Gather (rows=30000)
  Workers Planned: 2
  ->  Subquery Scan (rows=30000) -- *total* rows, should equal the subpath's rows
        ->  Parallel Seq Scan (rows=10000)
Maybe this code:
    /* Mark the path with the correct row estimate */
    if (param_info)
        path->path.rows = param_info->ppi_rows;
    else
        path->path.rows = baserel->rows;
should change to:
    /* Mark the path with the correct row estimate */
    if (path->path.parallel_workers > 0)
        path->path.rows = path->subpath->rows;
    else if (param_info)
        path->path.rows = param_info->ppi_rows;
    else
        path->path.rows = baserel->rows;
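
To make the intent concrete, here is a small standalone sketch of that
three-way choice. It is only a toy model, not planner code: ToyPath,
ToySubqueryScan and toy_set_rows are hypothetical names, and a negative
ppi_rows is my own stand-in for "param_info is NULL".

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner path node. */
typedef struct ToyPath
{
    int     parallel_workers;   /* > 0 means a partial (per-worker) path */
    double  rows;               /* estimated rows produced by this path */
} ToyPath;

/* Hypothetical stand-in for a subquery scan path and its child. */
typedef struct ToySubqueryScan
{
    ToyPath     path;
    ToyPath    *subpath;
} ToySubqueryScan;

/*
 * Pick the row estimate the way the proposed change would.
 * Assumption: ppi_rows < 0 stands for "no param_info".
 */
static void
toy_set_rows(ToySubqueryScan *scan, double ppi_rows, double baserel_rows)
{
    if (scan->path.parallel_workers > 0)
        scan->path.rows = scan->subpath->rows;  /* per-worker estimate */
    else if (ppi_rows >= 0)
        scan->path.rows = ppi_rows;             /* parameterized estimate */
    else
        scan->path.rows = baserel_rows;         /* total estimate */
}
```

With the numbers from the plan above (2 workers, subpath rows=10000,
baserel rows=30000), the Subquery Scan would be marked with 10000 rows
instead of 30000. Note that the sketch copies the subpath estimate
unchanged, so it does not model any filtering done by the Subquery Scan
itself.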