Hi all, I’m working on an FDW that would benefit greatly from parallel foreign scan. I have implemented the callbacks
described here: https://www.postgresql.org/docs/devel/fdw-callbacks.html#FDW-CALLBACKS-PARALLEL and I see a big
improvement in certain plans.
My problem is that I can’t seem to get a parallel foreign scan in a query that does not contain an aggregate.
For example:
SELECT count(*) FROM foreign table;
Gives me a parallel scan, but
SELECT * FROM foreign table;
Does not.
I’ve been fiddling with the costing GUCs, foreign scan row estimates, and foreign scan cost estimates - I can force the
cost of a partial path to be much lower than a sequential foreign scan, but no luck.
Any troubleshooting advice?
A second, related question: how can I find the actual number of workers chosen for my ForeignScan? At the moment, I am
looking at ParallelContext->nworkers (inside the InitializeDSMForeignScan() callback) because that seems to be the
first callback that might provide the worker count. I need the *actual* worker count in order to distribute my
workload evenly. I can’t use the usual trick of having each worker grab the next available chunk, because I have to
avoid seek operations on compressed data. In other words, it is a great advantage for each worker to read contiguous
chunks of data; seeking to another part of the file is prohibitively expensive.
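
To make the requirement concrete, here is a minimal sketch of the kind of static, contiguous split described above.
It is plain C with no PostgreSQL dependencies. The names ChunkRange and contiguous_chunk_range() are hypothetical and
not part of any FDW API. It assumes the file is divided into total_chunks equally expensive chunks and that the number
of participants is already known, which is exactly the value in question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: half-open range [first, first + count) of chunk numbers. */
typedef struct ChunkRange
{
    uint64_t first;   /* index of the first chunk assigned to this participant */
    uint64_t count;   /* number of consecutive chunks assigned */
} ChunkRange;

/*
 * Hypothetical helper: split total_chunks into nparticipants contiguous
 * ranges whose sizes differ by at most one, and return the range for the
 * given zero-based participant number.
 */
ChunkRange
contiguous_chunk_range(uint64_t total_chunks, uint32_t nparticipants,
                       uint32_t participant)
{
    ChunkRange  r;
    uint64_t    base;
    uint64_t    extra;

    assert(nparticipants > 0);
    assert(participant < nparticipants);

    base = total_chunks / nparticipants;    /* chunks every participant gets */
    extra = total_chunks % nparticipants;   /* leftover chunks to hand out */

    /* The first "extra" participants each take one leftover chunk. */
    r.count = base + (participant < extra ? 1 : 0);
    r.first = (uint64_t) participant * base +
              (participant < extra ? participant : extra);
    return r;
}
```

With 10 chunks and 3 participants this yields the ranges [0, 4), [4, 7) and [7, 10), so each participant reads one
contiguous region and never has to seek. The split only balances if nparticipants matches the number of processes that
actually run the scan; if fewer start than planned, some ranges are never read.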
Thanks for all help.
— Korry