Re: A population of population counts
От | David Rowley |
---|---|
Тема | Re: A population of population counts |
Дата | |
Msg-id | CAKJS1f9yQ-V1U3=RoQk-cHqy670wGyTZFu82nm67S7wZ7SZtXw@mail.gmail.com обсуждение исходный текст |
Ответ на | A population of population counts (Thomas Munro <thomas.munro@enterprisedb.com>) |
Ответы |
Re: A population of population counts
|
Список | pgsql-hackers |
On 7 May 2016 at 12:41, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > Hi > > I noticed that we have three "number_of_ones" tables under contrib and > two under src, and some new specially masked variants for visibility > maps. > > Would it be an improvement if we just defined one table with external > linkage, and accessed it via a macros/functions popcount_uint8, and > wider versions _uint32, popcount_array(data, len) that sum the > popcounts of their component bytes? > > Then there would be less duplication, and future opportunities to use > fancy built-ins/assembler instructions/vectorisation in one central > place, and to work in larger sizes than bytes. > > Perhaps we could also get rid of the new special masked popcount > tables by masking the value we look up instead, eg walk through the > page calling popcount_uint64(value & FROZEN_BITMASK_64). > > As for the rightmost_one_pos table in bitmapset.c, couldn't the > various bms_XXX functions just use ffs(n) - 1 and work word-at-a-time? > That generates a bsf instruction at -O2 on this machine. > > The micro-optimisation opportunities probably don't matter, but I > wondered if it might at least be interesting to delete a bunch of > code, and re-use a standard interface for something that apparently > several modules need to do. I remember looking at GCC's __builtin_popcount() about 6 months ago with thoughts about using it for bms_num_members(). I did see performance improvements from micro-benchmarks to compare it to the number_of_ones[] array, and saw a small improvement. Likely the improvements I did see with those would actually have been better in a real world case, since not having a number_of_ones[] array in a CPU cache might be more useful for a workload with a more contended cache. I'd like to see us using those functions, when they're available and falling back on the array when they're not. Sounds like that would just be a new configure test. Perhaps a good home for some shared code would be numutils.c. I see the functions there declared in builtins.h which I see used in contrib/spi. I think it could all be made to work quite nicely with types like bitmapword just with some preprocessor trickery. I'd say go for it. Sounds like a like these would be some nice reusable functions that we already have a suitable home for. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления: