action #102975
closed
Fix missing openqa.o.o data on metrics.o.o size:M
metrics.o.o shows no current data.
Acceptance criteria
- AC1: Current data of o3 can be seen in the metrics
- The API route appears to work, but the graphs contain no data
- Talk to Witold
- Become an openSUSE hero, get access to the network and come up with a fix
Updated by livdywan about 3 years ago
- Subject changed from Fix missing openqa.o.o data on metrics.o.o to Fix missing openqa.o.o data on metrics.o.o size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by kraih about 3 years ago
Spoke with Witold, he will take a look at the Grafana setup.
Updated by openqa_review about 3 years ago
- Due date set to 2021-12-10
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz about 3 years ago
- Status changed from In Progress to Feedback
We are still waiting for Witold to look into this. If there is no update by the start of next week, some of us can try the openSUSE Heroes VPN ourselves for a start.
EDIT: I also tried something myself: I connected to the openSUSE Heroes VPN and could log in as "okurz", but to fix systemd services I would need sudo permissions or the root password. So I asked in !$9TX85dwTKhLgAFGo6Hu3-dw0CpaiFJdCtyoa2l7yU68 for help.
okurz@metrics:/home/okurz> systemctl --failed
● osrt-metrics-access.service loaded failed failed openSUSE Release Tools: metrics - access logs
● osrt-metrics@openSUSE:Factory.service loaded failed failed openSUSE Release Tools: metrics for openSUSE:Factory
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
2 loaded units listed.
okurz@metrics:/home/okurz> systemctl status osrt-metrics-access.service
● osrt-metrics-access.service - openSUSE Release Tools: metrics - access logs
Loaded: loaded (/usr/lib/systemd/system/osrt-metrics-access.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2021-12-03 04:27:49 UTC; 6h ago
TriggeredBy: ● osrt-metrics-access.timer
Process: 2141 ExecStart=/usr/bin/osrt-metrics-access-aggregate (code=exited, status=255/EXCEPTION)
Main PID: 2141 (code=exited, status=255/EXCEPTION)
Warning: some journal files were not opened due to insufficient permissions.
So two services have failed, but I am not allowed to read the logs.
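For whoever picks this up with sufficient permissions, a minimal sketch of how the failed units could be collected and their logs read. The helper name `failed_units` is hypothetical, and the sketch assumes the `systemctl --failed` output format shown above, with the unit name in the first or second column.

```shell
# Hypothetical helper: extract unit names from `systemctl --failed` style
# output read on stdin. Legend and summary lines are skipped because
# neither of their first two fields looks like a unit name.
failed_units() {
    awk '{
        for (i = 1; i <= 2; i++)
            if ($i ~ /\.(service|timer|socket|mount|target)$/) { print $i; next }
    }'
}

# Example use on the host (reading the journal needs sudo or root):
# systemctl --failed | failed_units | while read -r unit; do
#     sudo journalctl -u "$unit" -n 50 --no-pager
# done
```

Parsing the human-readable output is fragile; it is used here only because it matches the session pasted above.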
Updated by okurz about 3 years ago
@kraih over which chat channel can I reach Witold? Is he in ?
Updated by kraih about 3 years ago
okurz wrote:
@kraih over which chat channel can I reach Witold? Is he in ?
You can reach him on Slack. He told me that he won't get around to it before next week though. I'm trying to get some info on how things are set up in the meantime (so I can maybe take a look before that).
Updated by kraih about 3 years ago
Another interesting tidbit: Witold was not the one who set up metrics.o.o; he has just recently reverse engineered and fixed the access metrics. The machine was originally set up by Jimmy Berry, and he did not leave much documentation unfortunately.
Updated by okurz about 3 years ago
- Status changed from Feedback to Resolved
So we fixed it together now. I added the "tools-team" from the os-autoinst GitHub organisation to /etc/grafana/grafana.ini so I could log in using GitHub and look into the panel to understand where it gets its data from. We learned that telegraf is used and that it gets its configuration from /usr/share/openSUSE-release-tools/metrics/telegraf, which is maintained on GitHub, so I created
and edited it locally, then restarted osrt-metrics-telegraf.service
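The restart step can be sketched roughly as follows. The helper names `run` and `restart_and_check` and the `DRY_RUN` variable are hypothetical and only for illustration; the unit name is the one mentioned above. With `DRY_RUN=1` the commands are printed instead of executed, which is handy for reviewing them before touching the production host.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: restart a systemd unit and report whether it is
# active afterwards (`systemctl is-active` exits non-zero if it is not).
restart_and_check() {
    unit=$1
    run sudo systemctl restart "$unit" &&
        run systemctl is-active "$unit"
}

# Example use on the host:
# restart_and_check osrt-metrics-telegraf.service
```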