action #180002
closed
openQA-in-openQA test fails in dashboard with 403 Forbidden size:S
0%
Description
Observation
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install_nginx@64bit-2G fails in dashboard
Reproducible
Fails since (at least) Build :TW.35862 (current job)
Expected result
Last good: :TW.35861 (or more recent)
Further details
Always latest result in this scenario: latest
Suggestions
- The nginx error resembles selinux errors we were seeing recently. Maybe a regression or a new case we didn't cover so far?
- Investigate what else changed
- Ensure this reproduces, and also check other tests
- nginx tests consistently fail, others pass
- Bring back the original selinux workaround as a mitigation
Rollback steps
Files
Updated by okurz about 1 month ago
- Tags set to reactive work, alert, openqa-in-openqa
Updated by emiler about 1 month ago
- Status changed from New to In Progress
- Assignee set to emiler
Updated by openqa_review about 1 month ago
- Due date set to 2025-04-22
Setting due date based on mean cycle time of SUSE QE Tools
Updated by livdywan about 1 month ago
Mentioned in the daily. This looks to be selinux related. @emiler is looking into a fix.
Updated by emiler about 1 month ago · Edited
If we need the test to pass immediately, we can temporarily re-apply the original workaround, which is disabling SELinux for httpd_t with:
assert_script_run('semanage permissive -a httpd_t');
Updated by tinita about 1 month ago
emiler wrote in #note-6:
If we need the test to pass immediately, we can temporarily re-apply the original workaround, which is disabling SELinux for httpd_t with:
assert_script_run('semanage permissive -a httpd_t');
I think that would be good.
Updated by emiler about 1 month ago · Edited
The workaround doesn't seem to work: https://openqa.opensuse.org/tests/4981786
I tried setting both of the following options in the prepare module:
assert_script_run('semanage permissive -a httpd_t');
assert_script_run('setenforce 0');
The issue probably lies somewhere else.
Updated by livdywan about 1 month ago · Edited
- File clipboard-202504101059-bp2jd.png added
- Priority changed from High to Urgent
Let's make sure we can offer help as needed to make the tests work today.
Maybe @dheidler can provide some help, having looked into #178822 before?
Updated by livdywan about 1 month ago
- Related to action #178822: openQA in openQA tests failing with unreachable webUI, possibly due to SELinux size:S added
Updated by gpuliti about 1 month ago
- Subject changed from openQA-in-openQA test fails in dashboard with 403 Forbidden to openQA-in-openQA test fails in dashboard with 403 Forbidden size:S
- Description updated (diff)
Updated by okurz about 1 month ago
emiler wrote in #note-8:
The workaround doesn't seem to work: https://openqa.opensuse.org/tests/4981786
I tried setting both of the following options in the prepare module:
assert_script_run('semanage permissive -a httpd_t');
assert_script_run('setenforce 0');
The issue probably lies somewhere else.
https://github.com/realcharmer/os-autoinst-distri-openQA/blob/d452191341b7dda6c626b9f20e7190c812581c6c/tests/install/prepare.pm#L11 is too early because https://openqa.opensuse.org/tests/4981786/modules/apparmor/steps/1/src does a reboot which undoes the changes.
Updated by dheidler about 1 month ago · Edited
If this problem is selinux related, it should work if you run the test with the current master version of os-autoinst-distri-openQA:
https://github.com/realcharmer/os-autoinst-distri-openQA/blob/master/tests/install/prepare.pm
There we set a boolean which is persistent and should survive a reboot:
# SELinux: allow web proxy to connect to openQA backend
assert_script_run('semanage boolean -m -1 httpd_can_network_connect');
Updated by dheidler about 1 month ago
But I don't think this is related to selinux.
When I setenforce 1 on my local system and do NOT semanage boolean -m -1 httpd_can_network_connect (or semanage boolean -m -0 httpd_can_network_connect) I will get 502 from nginx - not 403.
Updated by dheidler about 1 month ago
I mean it could still be selinux but not the connection from nginx to openqa-webui.
Updated by emiler about 1 month ago
dheidler wrote in #note-13:
There we set a boolean which is persistent and should survive a reboot:
# SELinux: allow web proxy to connect to openQA backend
assert_script_run('semanage boolean -m -1 httpd_can_network_connect');
At first I assumed Nginx would need a different/additional permission. Setting setenforce 0 does not help either. The issue comes from somewhere else.
Is it acceptable to exclude this test module for the time being, if I don't solve the issue today?
Updated by livdywan about 1 month ago
- Description updated (diff)
- Priority changed from Urgent to High
https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/235
Note that this only handles 403 errors on nginx as @emiler pointed out.
Updated by livdywan about 1 month ago
- Copied to action #180785: openQA-in-openQA tests on o3 are not being scheduled added
Updated by emiler 27 days ago · Edited
- Status changed from In Progress to Feedback
Here are some results of my debugging. The openQA UI is accessible via http://127.0.0.1, but not through http://localhost. This is due to the default configuration in /etc/nginx/vhosts.d/openqa.conf, which uses the default_server keyword in the listen statement. The issue is that the nginx.conf configuration takes precedence and instead tries to serve static files from /srv/www/htdocs when the request comes from http://localhost.
Possible solutions:
- Change the default nginx.conf configuration and remove the default_server option.
- Modify the test to access the openQA web UI through http://127.0.0.1 instead of http://localhost.
- ?
This is also mentioned in https://open.qa/docs/#_nginx_proxy.
Updated by okurz 17 days ago
- Due date deleted (2025-04-22)
- Status changed from Resolved to Workable
I think we can do better and should continue:
- https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/openqa/webui/dashboard.pm#L19 also uses "firefox http://localhost" to show the local openQA instance so this either will fail soon or is at least inconsistent
- We should never suggest to users to work with IPv4-only addresses. We need to do better. Please find a way to make this work with "localhost"
Updated by openqa_review 12 days ago
- Due date set to 2025-05-14
Setting due date based on mean cycle time of SUSE QE Tools
Updated by emiler 12 days ago · Edited
In order to cover all addresses, meaning http://127.0.0.1, http://[::1] and http://localhost, we have to edit the default Nginx configuration. It should be simple: run sed on the config to modify the default_server option and reload Nginx. This approach is mentioned in the documentation, so it's not doing anything crazy. Consequently, we should also test all the addresses, not just one.
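A rough sketch of how checking all three addresses could look in plain shell. The helper names http_status and check_addresses are hypothetical and not part of the test distribution, and treating anything other than HTTP 200 as a failure is an assumption about what the dashboard should return.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from os-autoinst-distri-openQA.

# Print the HTTP status code for one URL ("000" if the connection fails).
# --globoff keeps curl from interpreting the brackets in http://[::1].
http_status() {
    curl --silent --globoff --output /dev/null --max-time 10 \
        --write-out '%{http_code}' "$1"
}

# Probe every given URL and report its status code.
# Returns non-zero if any URL answers with something other than 200
# (assumption: the dashboard is expected to answer with 200).
check_addresses() {
    failed=0
    for url in "$@"; do
        code=$(http_status "$url") || true
        echo "$url -> $code"
        [ "$code" = "200" ] || failed=1
    done
    return "$failed"
}
```

It could be called as check_addresses http://127.0.0.1 'http://[::1]' http://localhost, so that a 403 on only one of the addresses becomes visible in the output instead of being masked by the others.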
Updated by szarate 12 days ago
cross linking (as it also shows on TW snapshots): https://bugzilla.opensuse.org/show_bug.cgi?id=1240733
Updated by emiler 10 days ago · Edited
Actually, my change only fixes our internal tests.
The changes done in the PR are just following what's mentioned in the documentation (https://open.qa/docs/#_nginx_proxy), which is a valid approach, but this step would have to be done in other tests as well.
Perhaps a better solution would be to change the default nginx configuration directly upstream, in particular do this: https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/240/files#diff-d348c4980f5b9277fb54ad564c972f1e385994e21b1027270ea1dde689b7c8dcR31
Not sure though how/if we should change the default configuration distributed by packages, as this step has to be done after installing nginx in /etc/nginx/nginx.conf.
Updated by emiler 10 days ago
- Status changed from Workable to In Progress
Ok, I've taken a completely wrong approach.
The root cause of the regression is this: https://build.opensuse.org/request/show/1265447#diff_2_n107
Our regex for nginx configuration was explicitly expecting two spaces. I'll revert my previous patch of the internal test and improve the regex.
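For illustration only, a whitespace-tolerant substitution might look like the sketch below. The ticket does not show the actual expression, so this assumes the directive being edited is a listen line carrying default_server; the sample configuration and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample; the nginx.conf shipped by the package may differ.
sample_conf() {
    printf '%s\n' \
        'server {' \
        '    listen 80 default_server;' \
        '    listen       [::]:80   default_server;' \
        '    server_name  localhost;' \
        '}'
}

# Drop "default_server" from listen directives. Every gap is matched with
# [[:space:]]+ so the expression keeps working when the packaged config
# changes the number of spaces or switches to tabs.
strip_default_server() {
    sed -E 's/^([[:space:]]*listen[[:space:]]+[^;[:space:]]+)[[:space:]]+default_server[[:space:]]*;/\1;/'
}

sample_conf | strip_default_server
```

The point of the sketch is only the [[:space:]]+ matching: an expression hard-coding two spaces would silently stop matching after a reformatting of the packaged file, which is the kind of regression described here.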
Updated by emiler 7 days ago
- Status changed from Feedback to Resolved
Both merged, test passing: https://openqa.opensuse.org/tests/5039342