action #150908
closed
o3 "Unable to fetch build results" and "Internal server error" on some pages size:M
0%
Description
Observation
From https://suse.slack.com/archives/C02CANHLANP/p1700046210637459
(Dominique Leuenberger) Seems O3 went back into the same fail state as we had seen yesterday:
Unable to fetch build results
and https://openqa.opensuse.org/tests/3727497 says "Internal server error" but gives no further details. Couldn't find anything obvious in journalctl -u openqa-webui
Rollback actions
- Re-enable openqa-auto-update on o3 again
Updated by livdywan about 1 year ago
https://status.opensuse.org/ says it's all working. I assume that needs to be fixed?
Updated by tinita about 1 year ago
- Status changed from New to In Progress
- Assignee set to tinita
Updated by tinita about 1 year ago
From /var/log/openqa:
[2023-11-15T11:11:50.988143Z] [debug] [pid:28037] Updating seen of worker 982 from worker_status (free)
[2023-11-15T11:11:50.995202Z] [warn] [pid:28037] Unable to verify whether worker 982 runs its job(s) as expected: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist
LINE 1: ...result, me.reason, me.clone_id, me.blocked_by_id, me.backend...
^ [for Statement "SELECT me.id, me.result_dir, me.archived, me.state, me.priority, me.result, me.reason, me.clone_id, me.blocked_by_id, me.backend_info, me.TEST, me.DISTRI, me.VERSION, me.FLAVOR, me.ARCH, me.BUILD, me.MACHINE, me.group_id, me.assigned_worker_id, me.t_started, me.t_finished, me.logs_present, me.passed_module_count, me.failed_module_count, me.softfailed_module_count, me.skipped_module_count, me.externally_skipped_module_count, me.scheduled_product_id, me.result_size, me.t_created, me.t_updated FROM jobs me WHERE ( ( me.assigned_worker_id = ? AND t_finished IS NULL ) ) ORDER BY t_created DESC" with ParamValues: 1='982'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Workers.pm line 245
[2023-11-15T11:11:51.094791Z] [debug] [pid:28037] Updating seen of worker 896 from worker_status (free)
[2023-11-15T11:11:51.102115Z] [warn] [pid:28037] Unable to verify whether worker 896 runs its job(s) as expected: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist
LINE 1: ...result, me.reason, me.clone_id, me.blocked_by_id, me.backend...
^ [for Statement "SELECT me.id, me.result_dir, me.archived, me.state, me.priority, me.result, me.reason, me.clone_id, me.blocked_by_id, me.backend_info, me.TEST, me.DISTRI, me.VERSION, me.FLAVOR, me.ARCH, me.BUILD, me.MACHINE, me.group_id, me.assigned_worker_id, me.t_started, me.t_finished, me.logs_present, me.passed_module_count, me.failed_module_count, me.softfailed_module_count, me.skipped_module_count, me.externally_skipped_module_count, me.scheduled_product_id, me.result_size, me.t_created, me.t_updated FROM jobs me WHERE ( ( me.assigned_worker_id = ? AND t_finished IS NULL ) ) ORDER BY t_created DESC" with ParamValues: 1='896'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Workers.pm line 245
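The warnings above repeat the same PostgreSQL error for each worker. As a rough illustration only, here is a minimal sketch that counts such missing-column errors per column name. It is not part of openQA: the function name `count_missing_columns` is hypothetical, and the sample lines are shortened versions of the log lines above.

```python
import re
from collections import Counter

# Matches the PostgreSQL error text as it appears in the log excerpt above.
MISSING_COLUMN = re.compile(r"ERROR:\s+column (\S+) does not exist")


def count_missing_columns(lines):
    """Count 'column ... does not exist' errors per column name."""
    counts = Counter()
    for line in lines:
        match = MISSING_COLUMN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical input: lines shortened from the excerpt above.
sample = [
    "[2023-11-15T11:11:50.988143Z] [debug] [pid:28037] Updating seen of worker 982 from worker_status (free)",
    "[2023-11-15T11:11:50.995202Z] [warn] [pid:28037] Unable to verify whether worker 982 runs its job(s) as expected: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist",
    "[2023-11-15T11:11:51.102115Z] [warn] [pid:28037] Unable to verify whether worker 896 runs its job(s) as expected: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist",
]
missing_counts = count_missing_columns(sample)
print(missing_counts)
```

A single column name dominating the counts, as here, points at a mismatch between the installed code and the database schema rather than at individual workers.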
Updated by okurz about 1 year ago
- Description updated (diff)
- Priority changed from Immediate to Urgent
Packages were downgraded because the system was unable to reach download.opensuse.org. I tried with
sed -i 's/http:/https:/g' /etc/zypp/repos.d/*.repo && zypper ref
but got the same problem.
Working on this with tina. We installed the cached packages with
zypper --no-refresh in --oldpackage --allow-vendor-change /var/cache/zypp/packages/devel_openQA/x86_64/openQA-*-4.6.1699952945.e6799a9-lp155.6163.1.x86_64.rpm /var/cache/zypp/packages/devel_openQA_Leap/noarch/perl-Mojolicious-9.340.0-lp154.2.1.noarch.rpm
and retriggered the corresponding incomplete jobs with openqa-advanced-retrigger-jobs.
openQA is back in action.
Updated by okurz about 1 year ago
- Related to action #150845: openqaworker-arm22 broken due to packages automatically removed size:M added
Updated by tinita about 1 year ago
We force-merged https://github.com/os-autoinst/openQA/pull/5361, which fixes the autoupdate issue, so after that we shouldn't see unwanted downgrades anymore.
I will monitor package build and publishing and then enable autoupdate again.
Updated by tinita about 1 year ago
New packages arrived at http://download.opensuse.org/repositories/devel:/openQA/15.5/x86_64/
Started the openqa-auto-update service; it's currently updating a lot of packages.
Updated by tinita about 1 year ago
- Status changed from In Progress to Feedback
Finished. I ran systemctl start openqa-auto-update.service,
which ran the autoupdate, but I can't enable the service:
systemctl enable openqa-auto-update.service
The unit files have no installation config (WantedBy=, RequiredBy=, Also=,
Alias= settings in the [Install] section, and DefaultInstance= for template
units). This means they are not meant to be enabled using systemctl.
Maybe we should have disabled/enabled the .timer instead?
The web UI is running fine again.
Updated by mkittler about 1 year ago
Yes, you only need to enable --now the timer unit.
EDIT: I see you have already done that now. So this ticket can supposedly be considered resolved.
Updated by tinita about 1 year ago
I wonder why we didn't get a notification from logwarn. We also had this issue yesterday for about 15 minutes.
[2023-11-14T13:14:52.868112Z] [error] [ss5maBT9dSgc] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist
...
[2023-11-14T13:30:09.810755Z] [error] [7QU0oTvJCIqm] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: ERROR: column me.backend_info does not exist
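To put a number on the "about 15 minutes", here is a minimal sketch that derives the error window from the first and last matching timestamps. It assumes the bracketed UTC timestamp format seen in the lines above; the function name `error_window` and the shortened sample lines are hypothetical.

```python
import re
from datetime import datetime

# Leading "[YYYY-MM-DDTHH:MM:SS.ffffffZ]" timestamp as seen in the log lines.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z\]")


def error_window(lines, needle="does not exist"):
    """Return (first, last, duration) for lines containing needle, or None."""
    stamps = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and needle in line:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f"))
    if not stamps:
        return None
    first, last = min(stamps), max(stamps)
    return first, last, last - first


# Hypothetical input: the two error lines above, shortened.
sample = [
    "[2023-11-14T13:14:52.868112Z] [error] [ss5maBT9dSgc] ERROR: column me.backend_info does not exist",
    "[2023-11-14T13:30:09.810755Z] [error] [7QU0oTvJCIqm] ERROR: column me.backend_info does not exist",
]
first, last, duration = error_window(sample)
print(f"errors from {first} to {last} ({duration})")
```

This only bounds the window by the errors that were logged; the actual outage may have started earlier or ended later.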
Updated by okurz about 1 year ago
- Priority changed from Urgent to High
Reducing prio as the urgency was resolved with all your actions, thanks! So, resolve or do you want to look into the logwarn errors?
Updated by tinita about 1 year ago
I checked with a test logfile whether logwarn would report the lines, and it does. I also don't see any errors from cron in the root mailbox, so currently I'm out of ideas why it didn't report...
Updated by livdywan about 1 year ago
- Subject changed from o3 "Unable to fetch build results" and "Internal server error" on some pages to o3 "Unable to fetch build results" and "Internal server error" on some pages size:M
- Status changed from Feedback to Resolved
Everything works as intended. Awesome!
Updated by tinita about 1 year ago
- Related to action #151013: o3 yielding "502 Bad Gateway" from nginx 2023-11-19, why was the config overwritten? size:M added