action #175710
opencoordination #161414: [epic] Improved salt based infrastructure management
OSD openqa.ini is corrupted, invalid characters, again 2025-01-17
0%
Updated by okurz 14 days ago
- Copied from action #163790: OSD openqa.ini is corrupted, invalid characters size:M added
Updated by tinita 8 days ago · Edited
- Priority changed from Low to High
- Target version changed from Tools - Next to Ready
While looking into #176013 I noticed that the search on https://openqa.suse.de/minion does not allow searching for obs_rsync* tasks. They are simply missing from the select. (Compare https://openqa.opensuse.org/minion )
I looked on osd if there were any config changes.
The openqa.config:
-rw-r--r-- 1 geekotest root 10243 Jan 22 23:54 openqa.ini
The snapshot from Nov 7 is significantly bigger:
-rw-r--r-- 2 martchus root 14262 Nov 7 15:32 openqa.ini
I'm looking at the diff, but the obs_rsync plugin is configured in both files. The diff consists mostly of comment lines.
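Since the diff is mostly comment lines, it can help to compare only the effective settings. The following is a minimal sketch, assuming that comments start with `#` or `;`; the function names `effective_lines` and `config_diff` are hypothetical and not part of openQA or salt.

```python
import difflib


def effective_lines(text, comment_prefixes=("#", ";")):
    """Return stripped lines that are neither blank nor comments."""
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip empty lines and comment lines (assumed prefixes: '#' and ';')
        if not stripped or stripped.startswith(comment_prefixes):
            continue
        result.append(stripped)
    return result


def config_diff(old_text, new_text):
    """Unified diff of two config texts, ignoring comments and blank lines."""
    return list(
        difflib.unified_diff(
            effective_lines(old_text),
            effective_lines(new_text),
            fromfile="snapshot",
            tofile="current",
            lineterm="",
        )
    )
```

When reading the two openqa.ini files for this comparison, opening them with `errors="replace"` keeps invalid bytes from aborting the read, so the corrupted file can still be diffed.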
Updated by tinita 8 days ago
- Related to action #176013: [alert] web UI: Too many Minion job failures alert size:S added
Updated by tinita 8 days ago
I just tried to restart the gru service:
Jan 23 00:08:20 openqa systemd[1]: Stopping The openQA daemon for various background tasks like cleanup and saving needles...
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: State 'stop-sigterm' timed out. Killing.
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: Killing process 13903 (openqa) with signal SIGKILL.
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: Killing process 26956 (openqa) with signal SIGKILL.
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: Main process exited, code=killed, status=9/KILL
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: Failed with result 'timeout'.
Jan 23 00:13:20 openqa systemd[1]: Stopped The openQA daemon for various background tasks like cleanup and saving needles.
Jan 23 00:13:20 openqa systemd[1]: openqa-gru.service: Consumed 20min 30.720s CPU time.
Jan 23 00:13:20 openqa systemd[1]: Started The openQA daemon for various background tasks like cleanup and saving needles.
So it is running again, but the stop timed out after five minutes and systemd had to kill the processes with SIGKILL, so something went wrong.
Updated by okurz 8 days ago
- Related to action #175407: salt state for machine monitor.qe.nue2.suse.org was broken for almost 2 months, nothing was alerting us size:S added
Updated by tinita 8 days ago · Edited
tinita wrote in #note-3:
I looked on osd if there were any config changes.
The openqa.config:
-rw-r--r-- 1 geekotest root 10243 Jan 22 23:54 openqa.ini
I had made a local backup of that file. I have now copied it to my home directory on osd as openqa.ini-2025-01-22T23:54
Updated by nicksinger 7 days ago
I just found https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3701756 which shows another broken file, this time on tumblesle: /etc/zypp/zypp.conf looked like this:
## Configuration file for software management
## /etc/zypp/zypp.conf
##
## Boolean values are 0 1 yes no on off true false
}
[main]
solver.dupAllowVendorChange = True
I removed the stray "}" at the top. Maybe this is also related to "corrupted files".
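A stray line like this could be caught by a simple syntax check. The following is a rough sketch, assuming INI-style syntax with `#` or `;` comments; the function `suspicious_lines` and its rules are hypothetical and deliberately simplistic, not an existing salt or openQA check.

```python
import re

# Assumed INI-style syntax: "[section]" headers and "key = value" pairs
SECTION = re.compile(r"^\[[^\]]+\]$")
KEY_VALUE = re.compile(r"^[^=\s][^=]*=.*$")


def suspicious_lines(text, comment_prefixes=("#", ";")):
    """Return (line_number, line) pairs that do not look like INI syntax."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Flag control or other non-printable characters anywhere (tabs allowed)
        if any(not ch.isprintable() and ch != "\t" for ch in line):
            findings.append((number, line))
            continue
        stripped = line.strip()
        # Blank lines and comments are fine
        if not stripped or stripped.startswith(comment_prefixes):
            continue
        # Section headers and key/value pairs are fine
        if SECTION.match(stripped) or KEY_VALUE.match(stripped):
            continue
        findings.append((number, line))
    return findings
```

Run against the zypp.conf excerpt above, this reports only the line containing the stray "}". Real config files may use syntax this sketch does not cover (for example continuation lines), so findings are hints rather than proof of corruption.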
Updated by tinita 7 days ago
- Related to action #176124: OSD influxdb minion route seemingly returns only a very small number of failed minion jobs, not all added
Updated by tinita 4 days ago
- Related to action #176175: [alert] Grafana failed to start due to corrupted config file added
Updated by okurz 3 days ago
- Copied to action #176250: file corruption in salt controlled config files size:M added
Updated by okurz 3 days ago
- Copied to deleted (action #176250: file corruption in salt controlled config files size:M)
Updated by okurz 3 days ago
- Blocked by action #176250: file corruption in salt controlled config files size:M added