Thread: contrib/sepgsql regression tests have been broken for months
I tried to run contrib/sepgsql's regression tests today, and was
rather astonished when they failed. Investigating, there are
some context lines like "LINE 1: ALTER TABLE regtest_table_4
ALTER COLUMN y TYPE float;" in the test output that were not there
before. A bit of bisecting showed that the change happened with

65281391a937293db7fa747be218def0e9794550 is the first bad commit
commit 65281391a937293db7fa747be218def0e9794550 (HEAD)
Author: Michael Paquier <michael@paquier.xyz>
Date: Mon Jan 27 13:51:23 2025 +0900

    Print out error position for some ALTER TABLE ALTER COLUMN type

So, okay, that's a perfectly respectable thing to do, and I can't
really fault Michael or Jian for not having tested its effects on
sepgsql. But how come it took this long to notice?
I think that rhinoceros is the only BF member testing with
--with-selinux. Looking at its logs, it is running the sepgsql tests
(as a custom module) in v17 and before, but not in v18 or HEAD.
I suppose that this is a consequence of trying to rely on the
TAP-test infrastructure that was installed by aeb8ea361 (just a few
days before the aforesaid change, as luck would have it). That TAP
test does work for me, but it does not run on rhinoceros because
(1) there's no --enable-tap-tests in its configure command, and
(2) it doesn't set up environment variable PG_TEST_EXTRA to include
"sepgsql".
Anyway, we seem to need the attached in v18 and HEAD,
and we really ought to get BF coverage going again.
regards, tom lane
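
The two preconditions listed in the message above can be checked mechanically. The following is a minimal sketch, assuming the configure arguments and the PG_TEST_EXTRA value are available as plain strings; the function name sepgsql_tap_would_run is hypothetical and is not part of PostgreSQL or the buildfarm client.

```shell
#!/bin/sh
# Hypothetical helper: report whether the contrib/sepgsql TAP test would
# be run, given the configure arguments and the PG_TEST_EXTRA value.
# Both conditions named in the message must hold.
sepgsql_tap_would_run() {
    configure_args=$1
    pg_test_extra=$2

    # Condition 1: TAP tests must be enabled at configure time.
    case " $configure_args " in
        *" --enable-tap-tests "*) ;;
        *) echo "no: --enable-tap-tests missing"; return 1 ;;
    esac

    # Condition 2: PG_TEST_EXTRA is a whitespace-separated list that
    # must contain the word "sepgsql".
    for item in $pg_test_extra; do
        if [ "$item" = "sepgsql" ]; then
            echo "yes"
            return 0
        fi
    done
    echo "no: sepgsql not in PG_TEST_EXTRA"
    return 1
}
```

For example, `sepgsql_tap_would_run "--with-selinux" "sepgsql"` reports the missing configure switch, which matches the first of the two gaps described for rhinoceros.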
diff --git a/contrib/sepgsql/expected/ddl.out b/contrib/sepgsql/expected/ddl.out
index 7e8deae4f93..accb903f5ce 100644
--- a/contrib/sepgsql/expected/ddl.out
+++ b/contrib/sepgsql/expected/ddl.out
@@ -304,6 +304,8 @@ ALTER TABLE regtest_table_4 ALTER COLUMN y TYPE float;
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=unconfined_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="regtest_schema" permissive=0
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="public" permissive=0
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0
+LINE 1: ALTER TABLE regtest_table_4 ALTER COLUMN y TYPE float;
+ ^
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0
 LOG: SELinux: allowed { setattr } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=unconfined_u:object_r:sepgsql_table_t:s0 tclass=db_column name="regtest_schema.regtest_table_4.y" permissive=0
 LOG: SELinux: allowed { execute } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_proc_exec_t:s0 tclass=db_procedure name="pg_catalog.float8(integer)" permissive=0
@@ -388,7 +390,11 @@ ALTER TABLE regtest_ptable_4 ALTER COLUMN y TYPE float;
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=unconfined_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="regtest_schema" permissive=0
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="public" permissive=0
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0
+LINE 1: ALTER TABLE regtest_ptable_4 ALTER COLUMN y TYPE float;
+ ^
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0
+LINE 1: ALTER TABLE regtest_ptable_4 ALTER COLUMN y TYPE float;
+ ^
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0
 LOG: SELinux: allowed { setattr } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=unconfined_u:object_r:sepgsql_table_t:s0 tclass=db_column name="regtest_schema.regtest_ptable_4.y" permissive=0
 LOG: SELinux: allowed { search } scontext=unconfined_u:unconfined_r:sepgsql_regtest_superuser_t:s0 tcontext=system_u:object_r:sepgsql_schema_t:s0 tclass=db_schema name="pg_catalog" permissive=0

On Thu, Oct 23, 2025 at 05:36:01PM -0400, Tom Lane wrote:
> So, okay, that's a perfectly respectable thing to do, and I can't
> really fault Michael or Jian for not having tested its effects on
> sepgsql. But how come it took this long to notice?

Oops. I would have taken care of that should I have known... I check
the compilation of sepgsql, but the tests require dependencies that
are too heavy so it has always been some copy-paste operation when the
buildfarm got angry on this test suite (combined with hopes to not
break the tests).

> I think that rhinoceros is the only BF member testing with
> --with-selinux.

I recall so, yes.

> Anyway, we seem to need the attached in v18 and HEAD,
> and we really ought to get BF coverage going again.

Thanks for sending a patch (and 0758111f5d35)!
--
Michael

On 10/24/25 17:36, Tom Lane wrote:
> I think that rhinoceros is the only BF member testing with
> --with-selinux.

Yes, as far as I know anyway.

> Looking at its logs, it is running the sepgsql tests (as a custom module) in
> v17 and before, but not in v18 or HEAD. I suppose that this is a consequence
> of trying to rely on the TAP-test infrastructure that was installed by
> aeb8ea361 (just a few days before the aforesaid change, as luck would have
> it). That TAP test does work for me, but it does not run on rhinoceros
> because (1) there's no --enable-tap-tests in its configure command, and (2)
> it doesn't set up environment variable PG_TEST_EXTRA to include "sepgsql".
>
> Anyway, we seem to need the attached in v18 and HEAD,
> and we really ought to get BF coverage going again.

I will make those changes (and hope nothing breaks).

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

On 10/24/25 07:49, Joe Conway wrote:
> On 10/24/25 17:36, Tom Lane wrote:
>> Anyway, we seem to need the attached in v18 and HEAD,
>> and we really ought to get BF coverage going again.
>
> I will make those changes (and hope nothing breaks).

And of course they did break :-(

Rhino is still running on RHEL 7.9 and it seems that needed perl RPMs are
no longer in the yum repo. I will need some time to sort it out, but am on
the road through the weekend, so it might be until sometime next week when
I get enough 'round tuits' to get it resolved.

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

On 2025-Oct-23, Tom Lane wrote:

> I think that rhinoceros is the only BF member testing with
> --with-selinux.

As I recall, there is/was one other animal doing it, but I don't
remember which one it is.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

On 10/24/25 08:28, Joe Conway wrote:
> On 10/24/25 07:49, Joe Conway wrote:
>> On 10/24/25 17:36, Tom Lane wrote:
>>> Anyway, we seem to need the attached in v18 and HEAD,
>>> and we really ought to get BF coverage going again.
>>
>> I will make those changes (and hope nothing breaks).
>
> And of course they did break :-(
>
> Rhino is still running on RHEL 7.9 and it seems that needed perl RPMs are no
> longer in the yum repo. I will need some time to sort it out, but am on the road
> through the weekend, so it might be until sometime next week when I get enough
> 'round tuits' to get it resolved.

I predicted complaints about desupporting openssl < 1.1.1, just did not
anticipate they would be from me ;-P

https://www.postgresql.org/message-id/flat/9a9a43e7-f1b7-4d77-b3df-9138ecfc6f6b%40joeconway.com#278b38135345257044aa1cd41b8dda83

Since 18+ requires openssl 1.1.1+ and CentOS 7.9 has openssl 1.0.2, and
rhino is still on CentOS 7.9, rhino does not build with openssl.

However, with the tap tests enabled now, I am seeing this:

cat ./build-farm-root/HEAD/rhinoceros.lastrun-logs/configure.log
8<--------
...
with_selinux='yes'
with_ssl='no'
with_system_tzdata=''
...
8<--------

and in src/test/modules/Makefile:
8<--------
...
ifeq ($(with_ssl),openssl)
SUBDIRS += ssl_passphrase_callback
else
ALWAYS_SUBDIRS += ssl_passphrase_callback
endif
8<--------

yet:
8<--------
./run_build.pl --nosend --nostatus --verbose --force --keepall
Tue Oct 28 13:47:01 2025: buildfarm run for rhinoceros:HEAD starting
rhinoceros:HEAD [13:47:01] checking out source ...
...
rhinoceros:HEAD [13:55:00] running bin checks ...
rhinoceros:HEAD [14:08:09] running make misc checks ...
Branch: HEAD
Stage ssl_passphrase_callbackCheck failed with status 2
8<--------

and:
8<--------
tail -n23 /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/tmp_install/log/install.log|head -n3
ccache gcc -std=gnu11 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -fPIC -fvisibility=hidden -I. -I. -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o ssl_passphrase_func.o ssl_passphrase_func.c
ssl_passphrase_func.c:29:23: error: unknown type name ‘SSL_CTX’
 static void set_rot13(SSL_CTX *context, bool isServerStart);
8<--------

I am not understanding why ssl_passphrase_callbackCheck is being run at
all, but that is currently where I am stuck ¯\_(ツ)_/¯.

For the moment I have disabled the cron job on rhino. I guess I need to
up the urgency on getting the OS upgraded to something supported...

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
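
The Makefile conditional quoted in the message above can be restated as a small sketch. It assumes configure.log contains a line of the form with_ssl='...' as shown there; the function names get_with_ssl and ssl_module_list are hypothetical and are not part of the buildfarm client or the PostgreSQL build system.

```shell
#!/bin/sh
# Hypothetical helper: pull the with_ssl value out of configure.log-style
# text read from stdin (lines such as: with_ssl='no').
get_with_ssl() {
    sed -n "s/^with_ssl='\(.*\)'\$/\1/p" | head -n 1
}

# Mirror of the quoted Makefile conditional: the module is added to
# SUBDIRS only when with_ssl is "openssl", otherwise to ALWAYS_SUBDIRS.
ssl_module_list() {
    if [ "$1" = "openssl" ]; then
        echo "SUBDIRS"
    else
        echo "ALWAYS_SUBDIRS"
    fi
}

# Example usage against a real log (path as quoted in the message):
#   ssl_module_list "$(get_with_ssl < ./build-farm-root/HEAD/rhinoceros.lastrun-logs/configure.log)"
```

With the with_ssl='no' value from rhino's log, the sketch selects ALWAYS_SUBDIRS, so the Makefile itself does not put the module in SUBDIRS on that host.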

On 10/25/25 11:29, Álvaro Herrera wrote:
> On 2025-Oct-23, Tom Lane wrote:
>
>> I think that rhinoceros is the only BF member testing with
>> --with-selinux.
>
> As I recall, there is/was one other animal doing it, but I don't
> remember which one it is.

I could be wrong, but I am not aware of there ever being another animal
doing selinux testing.

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

On Tue, Oct 28, 2025 at 2:47 PM Joe Conway <mail@joeconway.com> wrote:
> I am not understanding why ssl_passphrase_callbackCheck is being run at
> all, but that is currently where I am stuck ¯\_(ツ)_/¯.

Old buildfarm client (REL_11). SSL tests aren't skipped until at least
12, it looks like. (I ran into similar problems with OAuth recently.)

--Jacob

On 10/28/25 18:20, Jacob Champion wrote:
> On Tue, Oct 28, 2025 at 2:47 PM Joe Conway <mail@joeconway.com> wrote:
>> I am not understanding why ssl_passphrase_callbackCheck is being run at
>> all, but that is currently where I am stuck ¯\_(ツ)_/¯.
>
> Old buildfarm client (REL_11). SSL tests aren't skipped until at least
> 12, it looks like. (I ran into similar problems with OAuth recently.)
>
> --Jacob

Thanks -- will give that a try!

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

On 10/28/25 18:49, Joe Conway wrote:
> On 10/28/25 18:20, Jacob Champion wrote:
>> On Tue, Oct 28, 2025 at 2:47 PM Joe Conway <mail@joeconway.com> wrote:
>>> I am not understanding why ssl_passphrase_callbackCheck is being run at
>>> all, but that is currently where I am stuck ¯\_(ツ)_/¯.
>>
>> Old buildfarm client (REL_11). SSL tests aren't skipped until at least
>> 12, it looks like. (I ran into similar problems with OAuth recently.)
>>
>> --Jacob
>
> Thanks -- will give that a try!

Ok so that worked to an extent, and I have now reenabled rhino buildfarm
cron job.

Everything < 18 passes.

18 and HEAD get this failure:
-----------------------------
# +++ tap check in contrib/sepgsql +++
#
# The SELinux boolean 'sepgsql_regression_test_mode' must be
# turned on in order to enable the rules necessary to run the
# regression tests.
#
# You can turn on this variable using the following commands:
#
# $ sudo setsebool sepgsql_regression_test_mode on
#
# For security reasons, it is suggested that you turn off this
# variable when regression testing is complete and the associated
# rules are no longer needed.
-----------------------------

I believe the boolean is flipped in the 17 and below tests by the test
script. How is that supposed to happen for the tap tests?

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

Joe Conway <mail@joeconway.com> writes:
> Everything < 18 passes.
Hmm ...
> 18 and HEAD get this failure:
> # The SELinux boolean 'sepgsql_regression_test_mode' must be
> # turned on in order to enable the rules necessary to run the
> # regression tests.
> I believe the boolean is flipped in the 17 and below tests by the test
> script. How is that supposed to happen for the tap tests?
I'd be quite astonished if the buildfarm were trying to do anything
that requires root privilege, and even more astonished if it were
succeeding. I think the expectation is that you turned on
sepgsql_regression_test_mode manually before enabling this buildfarm
test. I don't understand how < 18 would be passing if it weren't on,
so the likely bet is that the test_sepgsql script is mistaken about
how it's checking that. Said script does work for me, but maybe
RHEL7's getsebool output is different from later versions?
regards, tom lane
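
The question about getsebool output can be made concrete with a small parser. This is only a sketch under the assumption that getsebool prints one line of the form "<name> --> on" or "<name> --> off"; any other shape (for instance extra trailing text on some versions) is reported as unknown. The function name sebool_state is hypothetical and does not appear in test_sepgsql or in the TAP test.

```shell
#!/bin/sh
# Hypothetical helper: extract the state from one line of getsebool
# output, assumed to look like "<name> --> on" or "<name> --> off".
sebool_state() {
    line=$1
    case "$line" in
        *" --> on")  echo "on" ;;
        *" --> off") echo "off" ;;
        *)           echo "unknown"; return 1 ;;
    esac
}

# Example usage on a host with the SELinux tools installed (not run here):
#   sebool_state "$(getsebool sepgsql_regression_test_mode)"
```

Treating unexpected output as "unknown" rather than "off" keeps a format difference between OS releases from being mistaken for a disabled boolean.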
I wrote:
> ... I think the expectation is that you turned on
> sepgsql_regression_test_mode manually before enabling this buildfarm
> test. I don't understand how < 18 would be passing if it weren't on,
> so the likely bet is that the test_sepgsql script is mistaken about
> how it's checking that. Said script does work for me, but maybe
> RHEL7's getsebool output is different from later versions?
No, wait ... test_sepgsql is what we were using before, but 18 and
HEAD should be running contrib/sepgsql/t/001_sepgsql.pl. And
rhino's HEAD run does reflect that:
[15:04:03.343](0.010s) # checking selinux environment
[15:04:03.343](0.000s) # checking for matchpathcon
[15:04:03.353](0.010s) # checking for runcon
[15:04:03.361](0.008s) # checking for sestatus
[15:04:03.366](0.005s) # checking current user domain
[15:04:03.372](0.006s) # current user domain is 'unconfined_t'
[15:04:03.373](0.000s) # checking selinux operating mode
[15:04:03.380](0.007s) # current operating mode is 'enforcing'
[15:04:03.380](0.000s) # checking for sepgsql-regtest policy
[15:04:03.387](0.007s) # checking whether policy is enabled
[15:04:03.391](0.005s) # sepgsql_regression_test_mode is 'off'
[15:04:03.392](0.000s) #
# The SELinux boolean 'sepgsql_regression_test_mode' must be
# turned on in order to enable the rules necessary to run the
# regression tests.
I poked around in the buildfarm client and was surprised to
find that the old TestSepgsql.pm module does in fact expect
to have sudo privileges, and it seems to install, enable,
and eventually remove the sepgsql-regtest kernel module.
I thought we were trying to get rid of that requirement though.
(For sure, you won't ever see me running the buildfarm
client under a sudo-capable account.) I think the new idea
is to leave the module installed and active, which is kind
of problematic if we want to also use TestSepgsql.pm in the
back branches.
I also don't quite understand how 001_sepgsql.pl's "checking for
sepgsql-regtest policy" test is passing if the previous
TestSepgsql.pm run removed that module ...
regards, tom lane

On 10/29/25 19:36, Tom Lane wrote:
> I poked around in the buildfarm client and was surprised to
> find that the old TestSepgsql.pm module does in fact expect
> to have sudo privileges, and it seems to install, enable,
> and eventually remove the sepgsql-regtest kernel module.
>
> I thought we were trying to get rid of that requirement though.
> (For sure, you won't ever see me running the buildfarm
> client under a sudo-capable account.) I think the new idea
> is to leave the module installed and active, which is kind
> of problematic if we want to also use TestSepgsql.pm in the
> back branches.
>
> I also don't quite understand how 001_sepgsql.pl's "checking for
> sepgsql-regtest policy" test is passing if the previous
> TestSepgsql.pm run removed that module ...

I suppose one solution is to create a new buildfarm animal and then use
rhino only for <= pg17 and <rhino-prime> only for pg18+

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

Joe Conway <mail@joeconway.com> writes:
> On 10/29/25 19:36, Tom Lane wrote:
>> ... I think the new idea
>> is to leave the module installed and active, which is kind
>> of problematic if we want to also use TestSepgsql.pm in the
>> back branches.
> I suppose one solution is to create a new buildfarm animal and then use
> rhino only for <= pg17 and <rhino-prime> only for pg18+
It looks like it'd work to revert to using TestSepgsql.pm in all
branches. You'd need to remove 'sepgsql' from PG_TEST_EXTRA so
that v18/master don't try to run the conflicting 001_sepgsql.pl
tests.
We need a better idea about how the new test method can coexist
with the old one, but for now I'd just like rhino to be running
one or the other successfully ...
regards, tom lane
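
Dropping one entry from PG_TEST_EXTRA while keeping the rest can be sketched as follows. The helper name remove_word is hypothetical; the only assumption taken from PostgreSQL itself is that PG_TEST_EXTRA is a whitespace-separated list.

```shell
#!/bin/sh
# Hypothetical helper: print a whitespace-separated list with every
# occurrence of one word removed, preserving the order of the rest.
remove_word() {
    word=$1
    list=$2
    out=""
    for item in $list; do
        if [ "$item" != "$word" ]; then
            # Append with a single separating space once out is non-empty.
            out="${out:+$out }$item"
        fi
    done
    echo "$out"
}

# Example usage (not run here):
#   PG_TEST_EXTRA=$(remove_word sepgsql "$PG_TEST_EXTRA")
```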

On 10/31/25 15:48, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> On 10/29/25 19:36, Tom Lane wrote:
>>> ... I think the new idea
>>> is to leave the module installed and active, which is kind
>>> of problematic if we want to also use TestSepgsql.pm in the
>>> back branches.
>
>> I suppose one solution is to create a new buildfarm animal and then use
>> rhino only for <= pg17 and <rhino-prime> only for pg18+
>
> It looks like it'd work to revert to using TestSepgsql.pm in all
> branches. You'd need to remove 'sepgsql' from PG_TEST_EXTRA so
> that v18/master don't try to run the conflicting 001_sepgsql.pl
> tests.

I'm not clear on whether this involves just undoing the recent changes I
made, or if the buildfarm code itself needs changes.

FWIW I did this part:
> remove 'sepgsql' from PG_TEST_EXTRA

But what about "--enable-tap-tests"? That makes the full buildfarm cycle
across all branches take something like 2.5 hours. I removed that part
for now at least.

> We need a better idea about how the new test method can coexist
> with the old one, but for now I'd just like rhino to be running
> one or the other successfully ...

ok, back to the original configuration (although now with the latest
buildfarm client)

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com

On 2025-11-01 Sa 8:50 AM, Joe Conway wrote:
> On 10/31/25 15:48, Tom Lane wrote:
>> Joe Conway <mail@joeconway.com> writes:
>>> On 10/29/25 19:36, Tom Lane wrote:
>>>> ... I think the new idea
>>>> is to leave the module installed and active, which is kind
>>>> of problematic if we want to also use TestSepgsql.pm in the
>>>> back branches.
>>
>>> I suppose one solution is to create a new buildfarm animal and then
>>> use rhino only for <= pg17 and <rhino-prime> only for pg18+
>>
>> It looks like it'd work to revert to using TestSepgsql.pm in all
>> branches. You'd need to remove 'sepgsql' from PG_TEST_EXTRA so
>> that v18/master don't try to run the conflicting 001_sepgsql.pl
>> tests.
>
> I'm not clear on whether this involves just undoing the recent changes
> I made, or if the buildfarm code itself needs changes.
>
> FWIW I did this part:
>> remove 'sepgsql' from PG_TEST_EXTRA
>
> But what about "--enable-tap-tests"? That makes the full buildfarm
> cycle across all branches take something like 2.5 hours. I removed
> that part for now at least.
>
>> We need a better idea about how the new test method can coexist
>> with the old one, but for now I'd just like rhino to be running
>> one or the other successfully ...
>
> ok, back to the original configuration (although now with the latest
> buildfarm client)

Can we just disable the cleanup stage of the buildfarm module? Then the
semodule would not be uninstalled.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com