On Sat, Dec 16, 2023 at 4:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We actually noticed this or a closely-related problem before [1]
> and briefly discussed the possibility of rearranging the generated
> code to make it less indigestible to clang. But there was no concrete
> idea about what to do specifically, and the thread slid off the radar
> screen.
I've never paid attention to the output of -ftime-report before but
this difference stands out pretty clearly with clang16:
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
 201.2266 ( 99.6%)   0.0074 ( 99.3%)  201.2341 ( 99.6%)  207.1308 ( 99.6%)  SLPVectorizerPass
The equivalent line for clang15 is:
   3.0979 ( 64.8%)   0.0000 (  0.0%)    3.0979 ( 64.8%)    3.0996 ( 64.8%)  SLPVectorizerPass
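To put a number on that difference, here is a rough sketch that parses the two rows and compares the user times. parse_row is a hypothetical helper, not part of any tool, and the row layout (four "seconds ( percent%)" pairs followed by the pass name) is assumed from the output shown here.

```python
import re

# One "seconds ( percent%)" pair, as it appears in the rows quoted above.
PAIR = re.compile(r"(\d+\.\d+)\s*\(\s*(\d+\.\d+)%\)")

def parse_row(line):
    """Return (user, system, user+system, wall) seconds from one report row."""
    return tuple(float(secs) for secs, _pct in PAIR.findall(line))

clang16 = ("201.2266 ( 99.6%) 0.0074 ( 99.3%) 201.2341 ( 99.6%) "
           "207.1308 ( 99.6%) SLPVectorizerPass")
clang15 = ("3.0979 ( 64.8%) 0.0000 ( 0.0%) 3.0979 ( 64.8%) "
           "3.0996 ( 64.8%) SLPVectorizerPass")

user16 = parse_row(clang16)[0]
user15 = parse_row(clang15)[0]
slowdown = user16 / user15

print(f"SLPVectorizerPass user time: about {slowdown:.0f}x slower in clang16")
```

With the figures above, that works out to roughly a 65x increase in user time for this one pass.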
The thing Andres showed in that other thread looked like this (though in
my output it has grown a "#2"). It accounts for much of the time in 15, but
"only" goes up by a couple of seconds in 16, so it's not our primary problem:
   9.1890 ( 73.1%)   0.0396 ( 23.9%)    9.2286 ( 72.4%)    9.6586 ( 72.9%)  Greedy Register Allocator #2