Making the vacuum possible to choose between two data representations sounds good. I implemented the patch that changes dead tuple representation to bitmap before. I will measure the performance of bitmap representation again and post them.
Sounds great! I haven't seen your patch, but what I would suggest is to compute page density (D) = relpages/(dead+live tuples) and experiment with bitmap of sizes of D to 2D bits per page. May I also suggest that instead of putting in efforts in implementing the overflow area, just count how many dead TIDs would fall under overflow area for a given choice of bitmap size.
It might be a good idea to experiment with different vacuum scale factor, varying between 2% to 20% (may be 2, 5, 10, 20). You can probably run a longish pgbench test on a large table and then save the data directory for repeated experiments, although I'm not sure if pgbench will be a good choice because HOT will prevent accumulation of dead pointers, in which case you may try adding another index on abalance column.
It'll be worth measuring memory consumption of both representations as well as performance implications on index vacuum. I don't expect to see any major difference in either heap scans.