action #27735

[tools][sle][functional][u][hard]Use ttm (totest-manager.py) from http://github.com/openSUSE/osc-plugin-factory/ for SLE

Added by okurz almost 4 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Infrastructure
Target version: SUSE QA - Milestone 17
Start date: 2017-11-14
Due date: 2018-07-31
% Done: 0%
Estimated time:
Difficulty: hard

Description

Motivation

ttm is used very successfully for openSUSE Tumbleweed as well as openSUSE Leap now. Using the same approach should help us with our regular review process even if we do not actually "release" any build with it.

Acceptance criteria

  • AC1: ttm is running and ensured to be kept running for current SLE versions in development, e.g. SLE15 on a production instance
  • AC2: there is visible output to the SLE test reviewers, e.g. comments, IRC message, email, etc.

Tasks

  • Check out http://github.com/openSUSE/osc-plugin-factory/ for SLE and get to know it
  • Add test for SLE
  • Implement check for SLE with correspondingly adapted IBS project paths
  • Add ttm to a production machine
  • Make sure ttm is started on system startup and restarted when it crashes
  • Make sure ttm repo on that machine is updated automatically
  • Optional: Add notification feature, e.g. interconnect with https://github.com/openSUSE/suse_msg

Related issues

Related to openQA Infrastructure - action #37300: Create salt config for openqa-service (Rejected, 2018-06-13)

Related to openQA Project - action #37303: better ttm comment in openqa (New, 2018-06-13)

Related to openQA Project - action #27847: IRC bot with updates about test runs, e.g. if builds are finished (Resolved, 2017-11-17)

History

#1 Updated by okurz almost 4 years ago

  • Status changed from New to In Progress

#2 Updated by okurz over 3 years ago

  • Subject changed from Use ttm (totest-manager.py) from http://github.com/openSUSE/osc-plugin-factory/ for SLE to [tools][sle][functional][hard]Use ttm (totest-manager.py) from http://github.com/openSUSE/osc-plugin-factory/ for SLE
  • Due date set to 2018-04-10
  • Status changed from In Progress to Workable
  • Target version changed from Milestone 14 to Milestone 15

ok, let's do the initial part in functional then

#3 Updated by cwh over 3 years ago

  • Difficulty set to hard

#4 Updated by okurz over 3 years ago

  • Subject changed from [tools][sle][functional][hard]Use ttm (totest-manager.py) from http://github.com/openSUSE/osc-plugin-factory/ for SLE to [tools][sle][functional][u][hard]Use ttm (totest-manager.py) from http://github.com/openSUSE/osc-plugin-factory/ for SLE
  • Due date changed from 2018-04-10 to 2018-04-24

not enough capacity in S14, moving

#5 Updated by mgriessmeier over 3 years ago

  • Due date changed from 2018-04-24 to 2018-05-08
  • Target version changed from Milestone 15 to Milestone 16

#6 Updated by dheidler over 3 years ago

  • Status changed from Workable to In Progress
  • Assignee set to dheidler

#7 Updated by dheidler over 3 years ago

Started implementing AMQP support for ttm results.

#8 Updated by mgriessmeier over 3 years ago

  • Due date changed from 2018-05-08 to 2018-05-22

#9 Updated by dheidler over 3 years ago

TTM command:

./totest-manager.py --project-base SLE --dry --debug --verbose run SUSE:SLE-15:GA

#11 Updated by dheidler over 3 years ago

Merged.

Waiting for AMQP account on rabbit.suse.de

#12 Updated by dheidler over 3 years ago

Deployed on openqa-service machine.

'suse.ttm.build.inprogress':'{"project": "SUSE:SLE-15:GA", "failed_jobs": {"ignored": [], "relevant": []}, "build": "634.1"}'

Currently running with --dry option.
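
For illustration, an event like the one above could be decoded and turned into a one-line notification roughly as follows. This is only a minimal sketch, not the actual ttm or suse_msg code: the helper name `format_ttm_event` and the wording of the resulting line are assumptions. Only the routing key and the JSON payload are taken from the event shown above.

```python
import json


def format_ttm_event(routing_key: str, body: str) -> str:
    """Turn a ttm AMQP event into a short human-readable line.

    Hypothetical helper: assumes routing keys of the form
    'suse.ttm.build.<status>' and a JSON body with the keys
    'project', 'build' and 'failed_jobs', as in the event above.
    """
    # The last dot-separated component of the routing key is the status.
    status = routing_key.rsplit(".", 1)[-1]
    payload = json.loads(body)
    relevant = payload["failed_jobs"]["relevant"]
    line = f"TTM: {payload['project']} build {payload['build']} {status}"
    if relevant:
        # Only report a count; the individual jobs stay in the payload.
        line += f" - {len(relevant)} relevant failed job(s)"
    return line


event_body = (
    '{"project": "SUSE:SLE-15:GA", '
    '"failed_jobs": {"ignored": [], "relevant": []}, "build": "634.1"}'
)
print(format_ttm_event("suse.ttm.build.inprogress", event_body))
```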

#13 Updated by mgriessmeier over 3 years ago

  • Due date changed from 2018-05-22 to 2018-06-05

#14 Updated by dheidler over 3 years ago

The openSUSE ttm is now sending AMQP events to amqp.opensuse.org

#16 Updated by dheidler over 3 years ago

Sidenote: TTM is included in the openSUSE-release-tools-totest-manager RPM from openSUSE-release-tools base pkg.

#17 Updated by mgriessmeier over 3 years ago

  • Due date changed from 2018-06-05 to 2018-06-19
  • Target version changed from Milestone 16 to Milestone 17

#20 Updated by dheidler over 3 years ago

TTM IRC messages are now sent to #openqa-events and #openqa-test.

#21 Updated by dheidler over 3 years ago

totest-manager is running on openqa-service.suse.de

#22 Updated by okurz over 3 years ago

I see the following messages for SLE15 build 665.1 in #openqa-test:

[5 Jun 2018 22:36:26] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 inprogress - not unknown fails yet!
[6 Jun 2018 00:13:05] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 inprogress - not unknown fails yet!
[6 Jun 2018 00:23:31] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746829 https://openqa.suse.de/t1746830
[6 Jun 2018 00:39:12] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746829 https://openqa.suse.de/t1746830
[6 Jun 2018 00:44:26] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746816 https://openqa.suse.de/t1746829 https://openqa.suse.de/t1746…
[6 Jun 2018 01:00:11] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746816 https://openqa.suse.de/t1746…
[6 Jun 2018 01:15:59] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746689 https://openqa.suse.de/t1746…
[6 Jun 2018 01:37:01] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746199 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746680 https://openqa.suse.de/t1746…
[6 Jun 2018 01:52:50] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746138 https://openqa.suse.de/t1746199 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746662 https://openqa.suse.de/t1746…
[6 Jun 2018 01:58:06] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746138 https://openqa.suse.de/t1746199 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746482 https://openqa.suse.de/t1746655 https://openqa.suse.de/t1746…
[6 Jun 2018 02:29:46] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746138 https://openqa.suse.de/t1746199 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746381 https://openqa.suse.de/t1746482 https://openqa.suse.de/t1746…
[6 Jun 2018 03:01:18] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: https://openqa.suse.de/t1746061 https://openqa.suse.de/t1746138 https://openqa.suse.de/t1746199 https://openqa.suse.de/t1746367 https://openqa.suse.de/t1746381 https://openqa.suse.de/t1746406 https://openqa.suse.de/t1746…

So I fear we have the same problem as with tumblesle-release for now: some consider it too spammy. I guess the first message about "inprogress" is fine, but I wonder whether the second one is needed. There also seem to be too many fail messages. Maybe we can just tweak the buffering in the IRC bot to silence more follow-up messages. For tumblesle-release I did the same by reducing the buffer length so much that all messages are considered "the same". The job URLs are not really interesting for a full build result, as most likely there will be too many jobs for anyone to be interested in individual ones. We should maybe just use a short URL pointing to the test overview. E.g. it could look like this:

[6 Jun 2018 03:01:18] [Notice] -hermes to #openqa-test- TTM: SUSE:SLE-15:GA build 665.1 fail - unknown fails: http://s.qa.suse.de/suse_sle-15_ga_665.1-fails

pointing to https://openqa.suse.de/tests/overview?result=failed&result=incomplete&arch=&failed_modules=&build=665.1&distri=sle&groupid=110&version=15#
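
As a rough sketch of how such an overview link could be generated per build instead of listing individual job URLs: the function name `failed_overview_url` is hypothetical, and the sketch assumes that the empty `arch` and `failed_modules` parameters and the trailing `#` from the URL above can be dropped. The remaining query parameters are the ones from that URL.

```python
from urllib.parse import urlencode


def failed_overview_url(host, distri, version, build, groupid):
    """Build an openQA test overview URL showing failed and incomplete jobs.

    Hypothetical helper; the parameter names mirror the query string
    of the overview URL quoted above.
    """
    query = urlencode(
        [
            # 'result' is given twice, so a list of pairs is used
            # instead of a dict.
            ("result", "failed"),
            ("result", "incomplete"),
            ("build", build),
            ("distri", distri),
            ("groupid", groupid),
            ("version", version),
        ]
    )
    return f"https://{host}/tests/overview?{query}"


print(failed_overview_url("openqa.suse.de", "sle", "15", "665.1", 110))
```

A short URL such as the one proposed above would then only need to point at this single link per build.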

#24 Updated by dheidler over 3 years ago

  • Status changed from In Progress to Feedback

ttm is now run via a systemd timer to avoid the memory leak.

waiting for PR merge.

#25 Updated by dheidler over 3 years ago

  • Status changed from Feedback to Resolved

PR merged.

ACs fulfilled.

#26 Updated by okurz over 3 years ago

  • Status changed from Resolved to In Progress

sorry, I don't agree.

First, the service is running on openqa-service.suse.de as you stated. How is that machine managed? We should have at least a wiki page describing the installation and configuration, better some system management repository, e.g. salt.

What would happen if the machine dies? We should think about the whole PLM (product lifecycle management) as we want a solution that continues to run in the future as well.

Second, AC2 mentions "there is visible output to the SLE test reviewers, e.g. comments, IRC message, email, etc.". I argue this is not fulfilled when the bot just runs in #openqa-test so it is not really "visible". Could you elaborate a bit more on the commenting in the corresponding job groups?

Third, by now SLE15 can be considered released but we can care about SLE12-SP4 as well as SLE15-SP1. Do we need to add configuration to cover this? I assume we need to start the corresponding systemd jobs.

#27 Updated by dheidler over 3 years ago

First, the service is running on openqa-service.suse.de as you stated. How is that machine managed? We should have at least a wiki page describing the installation and configuration, better some system management repository, e.g. salt.

This is not in the scope of this ticket.
It should be handled in a separate ticket.
The machine has some config files managed by salt, but this uses some odd suse-it salt repos that I don't understand.

What would happen if the machine dies? We should think about the whole PLM (product lifecycle management) as we want a solution that continues to run in the future as well.

I think that suse-it creates backups of the vm server.

Second, AC2 mentions "there is visible output to the SLE test reviewers, e.g. comments, IRC message, email, etc.". I argue this is not fulfilled when the bot just runs in #openqa-test so it is not really "visible". Could you elaborate a bit more on the commenting in the corresponding job groups?

There are IRC messages as explicitly mentioned in the AC.
Further messages would require modification of ttm which should get addressed in a separate ticket.

Third, by now SLE15 can be considered released but we can care about SLE12-SP4 as well as SLE15-SP1. Do we need to add configuration to cover this? I assume we need to start the corresponding systemd jobs.

Using my own systemd unit file (extra args: --norelease --project-base SLE run) and my systemd timer, it should have been sufficient to do:

systemctl enable --now osrt-totest-manager@SUSE:SLE-12-SP4:GA.timer

But ttm doesn't seem to support SLE-12-SP4 or SLE-15-SP1:

osrt-totest-manager --verbose --norelease --project-base SLE run SUSE:SLE-12-SP4:GA
2018-06-12 10:43:39,196 - t:1012 ERROR Project SUSE:SLE-12-SP4:GA not recognized. Possible values [openSUSE:Leap:15.0, openSUSE:Leap:15.0:Ports, SUSE:SLE-15:GA,
openSUSE:Factory:zSystems, openSUSE:Factory, openSUSE:Leap:15.0:Images, openSUSE:Factory:ARM, openSUSE:Factory:PowerPC]
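
To make the relation between the templated timer and the ttm invocation explicit, here is a small sketch that derives both command lines from an IBS project name. The helper names are hypothetical, and the `osrt-totest-manager@<project>.timer` unit is the custom template described in this comment, not necessarily something shipped with the package. The commands are only printed, not executed.

```python
from typing import List


def ttm_command(project: str) -> List[str]:
    """Command line as used above for a non-releasing SLE run."""
    return [
        "osrt-totest-manager",
        "--verbose",
        "--norelease",
        "--project-base",
        "SLE",
        "run",
        project,
    ]


def enable_timer_command(project: str) -> List[str]:
    """systemctl call enabling the per-project timer instance.

    Assumes the custom template unit 'osrt-totest-manager@.timer'
    mentioned in this comment is installed on the machine.
    """
    return ["systemctl", "enable", "--now", f"osrt-totest-manager@{project}.timer"]


for project in ("SUSE:SLE-12-SP4:GA", "SUSE:SLE-15-SP1:GA"):
    print(" ".join(enable_timer_command(project)))
    print(" ".join(ttm_command(project)))
```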

#28 Updated by dheidler over 3 years ago

  • Related to action #37300: Create salt config for openqa-service added

#29 Updated by dheidler over 3 years ago

#30 Updated by dheidler over 3 years ago

The problem with SLE-12-SP4 is that SUSE:SLE-12-SP4:GA:TEST doesn't exist.

#31 Updated by dheidler over 3 years ago

SUSE:SLE-15-SP1:GA:TEST exists but doesn't contain anything yet.

#32 Updated by okurz over 3 years ago

dheidler wrote:

The problem with SLE-12-SP4 is that SUSE:SLE-12-SP4:GA:TEST doesn't exist.

https://build.suse.de/project/show/SUSE:SLE-12-SP4:GA:TEST exists

#33 Updated by okurz over 3 years ago

dheidler wrote:

SUSE:SLE-15-SP1:GA:TEST exists but doesn't contain anything yet.

ok, is that a problem?

#34 Updated by dheidler over 3 years ago

okurz wrote:

https://build.suse.de/project/show/SUSE:SLE-12-SP4:GA:TEST exists

Then I must have made a typo when I checked the IBS web UI.

#35 Updated by dheidler over 3 years ago

okurz wrote:

SUSE:SLE-15-SP1:GA:TEST exists but doesn't contain anything yet.

ok, is that a problem?

2018-06-15 14:30:06,418 - t2:521 WARNING can't find SUSE:SLE-12-SP4:GA:TEST iso version
2018-06-15 14:30:07,895 - t2:529 INFO current_snapshot None: failed

yes - ttm doesn't get the current snapshot.

#36 Updated by dheidler over 3 years ago

  • Status changed from In Progress to Feedback

Created PR https://github.com/openSUSE/openSUSE-release-tools/pull/1579 without being able to test this.

I think we have to wait for builds in those projects.

#37 Updated by okurz over 3 years ago

  • Target version changed from Milestone 17 to Milestone 17

#38 Updated by okurz over 3 years ago

  • Related to action #27847: IRC bot with updates about test runs, e.g. if builds are finished added

#39 Updated by mgriessmeier over 3 years ago

  • Due date changed from 2018-06-19 to 2018-07-03

#40 Updated by okurz about 3 years ago

  • Parent task set to #37357

#41 Updated by mgriessmeier about 3 years ago

  • Due date changed from 2018-07-03 to 2018-07-31

#42 Updated by okurz about 3 years ago

The last mentioned PR was merged 20 days ago. dheidler, do you plan to continue here yourself or leave it to someone else?

#43 Updated by dheidler about 3 years ago

  • Status changed from Feedback to In Progress

It seems that we do have SLE15-SP1 and SLE12-SP4 builds in the right place by now.

https://github.com/openSUSE/openSUSE-release-tools/pull/1616

#44 Updated by dheidler about 3 years ago

PRs got merged.

Waiting for the RPM to be updated.

#45 Updated by dheidler about 3 years ago

Updated IRC bot to remember the status messages of multiple products.
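
How the bot implements this is not described here. As a minimal sketch of the idea, assuming the bot keeps the last announced message per project and stays silent when the same message arrives again (the class name `StatusMemory` and the build number used for the second product are made up for illustration):

```python
from typing import Dict


class StatusMemory:
    """Remember the last announced message separately for each product."""

    def __init__(self) -> None:
        self._last: Dict[str, str] = {}

    def should_announce(self, project: str, message: str) -> bool:
        """Return True only if the message differs from the previous
        one announced for the same project."""
        if self._last.get(project) == message:
            return False
        self._last[project] = message
        return True


memory = StatusMemory()
# First status for a product is announced, an identical repeat is not.
print(memory.should_announce("SUSE:SLE-15:GA", "build 665.1 inprogress"))
print(memory.should_announce("SUSE:SLE-15:GA", "build 665.1 inprogress"))
# A different product does not overwrite or suppress the first one.
# "0001" is a placeholder build number, not a real build.
print(memory.should_announce("SUSE:SLE-12-SP4:GA", "build 0001 inprogress"))
print(memory.should_announce("SUSE:SLE-15:GA", "build 665.1 inprogress"))
```

With a single shared "last message" slot, interleaved updates from two products would defeat the deduplication, which is presumably why per-product memory was needed.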

#46 Updated by dheidler about 3 years ago

  • Status changed from In Progress to Feedback

Waiting for RPM.

#47 Updated by dheidler about 3 years ago

  • Status changed from Feedback to Resolved

RPM updated on openqa-service.
Now also running tests for SLE15SP1 and SLE12SP4.

#48 Updated by okurz about 3 years ago

Good. Thank you. This story is very important for the whole team to understand, as the idea was that it should help the team. I hope you would like to demo it during the sprint review. We can now try to create more subtickets in the parent epic.
