action #68785
[monitoring] Setup of QA generic monitoring instance
0%
History
#2
Updated by okurz 6 months ago
What I did:
- Connect with virt-manager to qsf-cluster.qa.suse.de
- Configure new machine with virt-install, network install loading from download.opensuse.org/tumbleweed/repo/oss/
- 4 cores, 8GB RAM, name "stats", description "stats (Maintainer: okurz@suse.de)", 40GB new storage, kernel options
autoyast=https://w3.suse.de/~okurz/ay.xml
- content of ay.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE profile>
<profile xmlns="http://www.suse.com/1.0/yast2ns" xmlns:config="http://www.suse.com/1.0/configns">
  <general>
    <mode>
      <confirm config:type="boolean">false</confirm>
    </mode>
  </general>
  <bootloader>
    <global>
      <timeout config:type="integer">0</timeout>
    </global>
  </bootloader>
  <networking>
    <keep_install_network config:type="boolean">true</keep_install_network>
  </networking>
  <software>
    <install_recommended config:type="boolean">true</install_recommended>
    <products config:type="list">
      <product>openSUSE</product>
    </products>
    <packages config:type="list">
      <package>openssh</package>
      <package>sudo</package>
    </packages>
  </software>
  <user_defaults>
    <expire/>
    <group>100</group>
    <groups/>
    <home>/home</home>
    <inactive>-1</inactive>
    <no_groups config:type="boolean">true</no_groups>
    <shell>/bin/bash</shell>
    <skel>/etc/skel</skel>
    <umask>022</umask>
  </user_defaults>
  <users config:type="list">
    <user>
      <username>root</username>
      <user_password>$6$OHtabasWX3LK$dzWQazasWNgjg8h5afcT9ZtQltxDpkiDYZFzMOdg2f2frJ7euW10b4kHVvABPx8KxN4BbChgqja.tiZJ63ks41</user_password>
      <encrypted config:type="boolean">true</encrypted>
    </user>
    <user>
      <authorized_keys config:type="list">
        <authorized_key>ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILAtWUGdPW5LO1rMqVULy0VWKJ4ba+y2uglpi3gaZvuB okurz@linux-28d6.suse</authorized_key>
      </authorized_keys>
      <encrypted config:type="boolean">true</encrypted>
      <fullname>Oliver Kurz</fullname>
      <gid>100</gid>
      <home>/home/okurz</home>
      <home_btrfs_subvolume config:type="boolean">false</home_btrfs_subvolume>
      <shell>/bin/bash</shell>
      <uid>1000</uid>
      <user_password>$6$OHtabasWX3LK$dzWQazasWNgjg8h5afcT9ZtQltxDpkiDYZFzMOdg2f2frJ7euW10b4kHVvABPx8KxN4BbChgqja.tiZJ63ks41</user_password>
      <username>okurz</username>
    </user>
  </users>
  <groups config:type="list">
    <group>
      <groupname>wheel</groupname>
      <userlist>okurz</userlist>
    </group>
  </groups>
  <partitioning config:type="list">
    <drive>
      <initialize config:type="boolean">true</initialize>
      <partitions config:type="list">
        <partition>
          <mount>/</mount>
          <size>max</size>
          <filesystem config:type="symbol">btrfs</filesystem>
        </partition>
        <partition>
          <mount>swap</mount>
          <size>auto</size>
        </partition>
      </partitions>
    </drive>
  </partitioning>
  <ntp-client>
    <ntp_policy>auto</ntp_policy>
    <ntp_servers config:type="list">
      <ntp_server>
        <address>2.opensuse.pool.ntp.org</address>
        <iburst config:type="boolean">true</iburst>
        <offline config:type="boolean">false</offline>
      </ntp_server>
    </ntp_servers>
    <ntp_sync>manual</ntp_sync>
  </ntp-client>
  <scripts>
    <post-scripts config:type="list">
      <script>
        <filename>setup.sh</filename>
        <interpreter>shell</interpreter>
        <debug config:type="boolean">true</debug>
        <source><![CDATA[
echo '%wheel ALL=(ALL) NOPASSWD: ALL' >>/etc/sudoers
echo '0 3 * * 0 root zypper -n dup --replacefiles --auto-agree-with-licenses --force-resolution --download-in-advance' >> /etc/cron.d/auto-update
systemctl enable --now sshd
zypper -n ar -f http://download.opensuse.org/tumbleweed/repo/non-oss/ repo-non-oss
zypper -n ar -f http://download.opensuse.org/tumbleweed/repo/oss/ repo-oss
zypper -n ar -f http://download.opensuse.org/update/tumbleweed/ repo-update
curl -sfL https://get.k3s.io | sh -
]]></source>
      </script>
    </post-scripts>
  </scripts>
  <firewall>
    <enable_firewall config:type="boolean">true</enable_firewall>
    <start_firewall config:type="boolean">true</start_firewall>
    <FW_CONFIGURATIONS_EXT>sshd</FW_CONFIGURATIONS_EXT>
  </firewall>
  <ssh_import>
    <import config:type="boolean">true</import>
    <device>/dev/vda2</device>
  </ssh_import>
  <timezone>
    <hwclock>UTC</hwclock>
    <timezone>Europe/Berlin</timezone>
  </timezone>
</profile>
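The machine parameters from the list above could be passed to virt-install in a single call. The following is only a sketch, not a record of the command that was actually run: the qemu+ssh connection URI and the choice of --location, --extra-args and --metadata are assumptions. By default the script only prints the command; set DRY_RUN=0 to execute it.

```shell
#!/bin/sh
# Sketch only: prints the virt-install call unless DRY_RUN=0 is set.
set -eu

# Print a command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_stats_vm() {
    # The connection URI is an assumption; the values below are taken
    # from the description in the ticket.
    run virt-install \
        --connect "qemu+ssh://qsf-cluster.qa.suse.de/system" \
        --name stats \
        --metadata description="stats (Maintainer: okurz@suse.de)" \
        --vcpus 4 \
        --memory 8192 \
        --disk size=40 \
        --location "http://download.opensuse.org/tumbleweed/repo/oss/" \
        --extra-args "autoyast=https://w3.suse.de/~okurz/ay.xml"
}

create_stats_vm
```

Keeping the call in a script makes the setup repeatable if the machine ever needs to be recreated.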
Machine is accessible now but without a nice DNS name yet. For now it's 1c036.qa.suse.de (stats.qa.suse.de is already used by "snipe-vm", purpose unknown). k8s can be used:
k3s check-config
k3s kubectl get node
k3s kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.4
k3s kubectl get deployments
k3s kubectl get pods
k3s kubectl get events
k3s kubectl config view
k3s kubectl expose deployment hello-node --type=LoadBalancer --port=8080
k3s kubectl get services
curl http://localhost:8080
k3s kubectl delete service hello-node
k3s kubectl delete deployment hello-node
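The smoke test above can also be scripted so that it waits for the deployment instead of polling by hand. This is a minimal sketch under the assumption that the same hello-node names are reused; the timeout and retry values are arbitrary. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
set -eu

# Print a command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

smoke_test() {
    run k3s kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.4
    # Block until the deployment is rolled out or the timeout expires.
    run k3s kubectl rollout status deployment/hello-node --timeout=120s
    run k3s kubectl expose deployment hello-node --type=LoadBalancer --port=8080
    # The load balancer may need a moment, so retry refused connections.
    run curl --fail --silent --retry 5 --retry-delay 2 --retry-connrefused http://localhost:8080
    # Clean up; with "set -e" this is skipped if an earlier step fails.
    run k3s kubectl delete service hello-node
    run k3s kubectl delete deployment hello-node
}

smoke_test
```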
Set up helm on the same machine with curl -s https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
as per https://helm.sh/docs/intro/install/ , configured as per https://rancher.com/docs/k3s/latest/en/cluster-access/ with
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get pods --all-namespaces
helm ls --all-namespaces
Then installed grafana:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release --set admin.password=susetesting bitnami/grafana
Also https://kubeapps.com/ looks fancy. Installed following https://github.com/kubeapps/kubeapps/blob/master/docs/user/getting-started.md as well as rancher with https://rancher.com/docs/rancher/v2.x/en/installation/k8s-install/helm-rancher/ 'cause why not :)
I failed to manually set up the right parameters to make the grafana instance accessible from outside, so I did
helm install mygrafana --set admin.password=susetesting --set service.type=LoadBalancer bitnami/grafana
which makes grafana available on http://1c036.qa.suse.de:3000
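The same two settings could be kept in a values file instead of --set flags, which keeps the password out of the shell history. This is a sketch, not what was done on the machine: the environment variable GRAFANA_ADMIN_PASSWORD and both function names are made up for illustration. The helm call is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the helm call is printed unless DRY_RUN=0 is set.
set -eu

# Print a command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Write the chart values to the file given as $1.
# GRAFANA_ADMIN_PASSWORD is a hypothetical variable; the function aborts
# if it is unset. The password must not contain characters that need
# escaping inside a double-quoted YAML string.
write_grafana_values() {
    (
        umask 077  # the file contains a secret, keep it private
        cat > "$1" <<EOF
admin:
  password: "${GRAFANA_ADMIN_PASSWORD:?set GRAFANA_ADMIN_PASSWORD first}"
service:
  type: LoadBalancer
EOF
    )
}

# Install the release "mygrafana" using the values file given as $1.
install_grafana() {
    run helm install mygrafana -f "$1" bitnami/grafana
}
```

A possible invocation would be `GRAFANA_ADMIN_PASSWORD=... ; export GRAFANA_ADMIN_PASSWORD; write_grafana_values values.yaml && install_grafana values.yaml`.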
I followed https://randy-stad.gitlab.io/posts/2020-01-29-k3s-traefik-dashboard/ to get a nice dashboard for traefik, but so far I have not managed to provide nicer DNS names. But first https://randy-stad.gitlab.io/posts/2020-01-29-k3s-traefik-dashboard/ for the name of the host itself.
#4
Updated by okurz 6 months ago
- Due date deleted (2020-07-24)
We had the QA SLE metrics workshop but szarate+jorauch could not yet show anything on grafana instances. We briefly discussed which instance to use.
szarate, jorauch (added as watchers) I recommend we use both personal instances (that you set up) as well as http://1c036.qa.suse.de:3000/ for experimentation and https://stats.openqa-monitor.qa.suse.de/ as our main production instance where we can ensure proper provisioning using https://gitlab.suse.de/openqa/salt-states-openqa/-/tree/master/openqa/monitoring . We can also configure a CNAME entry for the host to be less openqa-centric. WDYT?
EDIT: Added the CNAME proposal in https://gitlab.suse.de/qa-sle/qanet-configs/-/merge_requests/12
#7
Updated by okurz 6 months ago
- Status changed from Feedback to Workable
- Assignee deleted (okurz)
DNS name merged in https://gitlab.suse.de/qa-sle/qanet-configs/-/merge_requests/12 but I think the grafana instance (or nginx?) does not respond to the new hostname yet.
#10
Updated by okurz 5 months ago
Certificate problems resolved in #69613. https://monitor.qa.suse.de also works now but so far http://monitor.qa.suse.de does not redirect to it. That should be done next.
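Whether the HTTP to HTTPS redirect is in place can be checked from the command line. The following is a sketch with made-up function names; it treats only the status codes 301, 302, 307 and 308 as redirects, which is an assumption about what the web server would send.

```shell
#!/bin/sh
# Sketch only: helper functions to check for an HTTP -> HTTPS redirect.
set -eu

# Succeed if $1 is a redirect status code and $2 is an https:// URL.
is_https_redirect() {
    code=$1
    target=$2
    case "$code" in
        301|302|307|308) ;;
        *) return 1 ;;
    esac
    case "$target" in
        https://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Request only the response head of the URL in $1 and classify it.
# Returns 2 if curl itself fails (e.g. host not reachable).
check_redirect() {
    result=$(curl --silent --head --output /dev/null \
        --write-out '%{http_code} %{redirect_url}' "$1") || return 2
    # result looks like "301 https://host/" or "200 " (no redirect)
    is_https_redirect "${result%% *}" "${result#* }"
}
```

A call such as `check_redirect http://monitor.qa.suse.de && echo "redirects to https"` would then show whether the remaining task is done.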
#11
Updated by okurz 5 months ago
Task from the last metrics workshop meeting: Include a description on the home dashboard. I briefly looked up how the home dashboard can be changed. I found a way to change the content of the main text window but could not save it. Also it seems that the home dashboard cannot come from provisioning directly. We could also include a link to the qa metrics internal wiki page.
#12
Updated by okurz 4 months ago
- Status changed from Workable to In Progress
- Assignee set to okurz
Trying to include the following HTML on the home dashboard:
<div class="text-center dashboard-header">
  <span>Home Dashboard for SUSE QA</span>
  <p>
    Monitoring, Alerting, Trending for SUSE QA. Mainly used by the team <a href="https://progress.opensuse.org/projects/qa/wiki/Wiki#QA-tools-Team-description">SUSE QA Tools</a>.
  </p>
  <p>
    Find the overall status of the openqa.suse.de (OSD) infrastructure on <a href="https://stats.openqa-monitor.qa.suse.de/d/4KkGdvvZk/osd-status-overview?orgId=1">OSD status overview</a>
  </p>
  <p>
    Please find more information on the <a href="https://confluence.suse.com/display/qasle/QA+Metrics">QA Metrics</a> page.
  </p>
</div>
but I realized I can do markdown as well. The header isn't that fancy but still I prefer markdown:
# Home Dashboard for SUSE QA

Monitoring, Alerting, Trending for SUSE QA. Mainly used by the team [SUSE QA Tools](https://progress.opensuse.org/projects/qa/wiki/Wiki#QA-tools-Team-description).

Find the overall status of the openqa.suse.de (OSD) infrastructure on [OSD status overview](https://stats.openqa-monitor.qa.suse.de/d/4KkGdvvZk/osd-status-overview?orgId=1).

Please find more information on the [QA Metrics](https://confluence.suse.com/display/qasle/QA+Metrics) page.
For this I used the preferences of the home dashboard to save a copy as "Home Copy" and changed that copy. I can save that dashboard, however it does not seem to be used as the first entry point.
#13
Updated by okurz 4 months ago
- Status changed from In Progress to Resolved
I needed to "star" the home dashboard and then I could select "Home" as the new dashboard for the organisation in https://stats.openqa-monitor.qa.suse.de/org . I shortened the text because previously only the first three lines or so were shown. This should suffice for now.
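The selection made in the web UI could probably also be scripted against Grafana's HTTP API, which offers an "update current org preferences" call at /api/org/preferences. This is an untested sketch: GRAFANA_URL, GRAFANA_TOKEN and the numeric dashboard id are hypothetical inputs, and a PUT with only homeDashboardId may reset other organisation preferences such as theme and timezone to their defaults.

```shell
#!/bin/sh
# Sketch only: set the organisation home dashboard via the Grafana HTTP API.
set -eu

# Build the JSON body; $1 is the numeric id of the dashboard.
org_prefs_payload() {
    printf '{"homeDashboardId": %d}\n' "$1"
}

# GRAFANA_URL and GRAFANA_TOKEN are hypothetical environment variables:
# the base URL of the instance and an API token with admin rights.
set_home_dashboard() {
    curl --fail --silent --request PUT \
        --header "Authorization: Bearer ${GRAFANA_TOKEN}" \
        --header "Content-Type: application/json" \
        --data "$(org_prefs_payload "$1")" \
        "${GRAFANA_URL}/api/org/preferences"
}
```

Nothing is sent until `set_home_dashboard <id>` is called with both variables set.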
As there is also a complete backup of the grafana database with https://gitlab.suse.de/qa-sle/backup-server-salt/-/blob/master/rsnapshot/rsnapshot.conf#L30 I consider the ticket resolved.