action #138356
coordination #121720 (closed): [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
coordination #123800: [epic] Provide SUSE QE Tools services running in PRG2 aka. Prg CoLo
coordination #137630: [epic] QE (non-openQA) setup in PRG2
Migration of qam.suse.de to PRG2 size:M
Description
Motivation
See parent
Acceptance criteria
- AC1: Common services supplied from qam.suse.de, including at least teregen as well as dashboard.qam.suse.de, are supplied from PRG2 after migration
Suggestions
- Announce upfront
- Follow https://jira.suse.com/browse/ENGINFRA-3071
- Monitor the situation, in particular gitlab CI pipelines for qem-dashboard and bot-ng
- Adapt tooling where needed
- Coordinate to configure firewall as needed
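To support the monitoring and firewall suggestions above, a basic TCP reachability probe could be run from a GitLab CI job or any other client before and after the migration. This is only a minimal sketch: the function name `is_reachable` and the port 443 are assumptions for illustration and do not come from this ticket or from existing tooling.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and tries each returned address
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike
        return False


if __name__ == "__main__":
    # Hypothetical target list; the ports are assumptions, adjust as needed
    targets = [("qam.suse.de", 443), ("dashboard.qam.suse.de", 443)]
    for host, port in targets:
        state = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")
```

A failed probe does not distinguish between a stale DNS record and a blocked firewall path, so it only narrows down where to look next.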
Rollback actions
- enable pipeline schedules in https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipeline_schedules DONE
Updated by okurz about 1 year ago
- Status changed from New to In Progress
- Priority changed from Normal to High
Met with mmanev about the planned migration of qam2.suse.de. The migration of qam2.suse.de to the PRG2 datacenter is planned for this Wednesday, impacting at least QAM template generation and dashboard.qam.suse.de. Expect the systems to be unavailable during the timeframe 2023-10-25 0700Z-1900Z. The VM has 50G of storage on a single virtual drive so the migration itself will likely not take too long. As the VM is still in an "old" network zone I provided rough requirements to mmanev about necessary inbound+outbound connections. Likely something will be missed and will need to be handled case by case. Expect a Slack thread opened by mmanev for coordination.
Updated by okurz about 1 year ago
https://suse.slack.com/archives/C02CANHLANP/p1698066009284279
@here migration of qam2.suse.de to PRG2 datacenter planned this Wednesday impacting at least QAM template generation and dashboard.qam.suse.de . Expect the systems to be unavailable during the timeframe 2023-10-25 0700Z-1900Z. Further details in https://progress.opensuse.org/issues/138356
Updated by openqa_review about 1 year ago
- Due date set to 2023-11-07
Setting due date based on mean cycle time of SUSE QE Tools
Updated by livdywan about 1 year ago
- Subject changed from Migration of qam.suse.de to PRG2 to Migration of qam.suse.de to PRG2 size:M
Discussed briefly in the estimations. Good as-is.
Updated by okurz about 1 year ago
migration was delayed by SUSE-IT, planned in https://suse.slack.com/archives/C04MDKHQE20/p1698395123650769
(Marko Manev) Migration for qam2.suse.de
(Marko Manev) @Oliver Kurz Would it be OK to migrate this VM on Tuesday 31.10.2023? We were short staffed this week, so we could not get to it. We can also aim on Monday, but I would not want to re-schedule again.
(Oliver Kurz) Can we make it 02.11.2023?
Updated by okurz about 1 year ago
again 2023-11-02 was not possible for SUSE-IT. I am suggesting to follow up today in https://suse.slack.com/archives/C04MDKHQE20/p1699002851906079?thread_ts=1698395123.650769&cid=C04MDKHQE20
Updated by okurz about 1 year ago
- Due date changed from 2023-11-07 to 2023-11-17
special hackweek due-date bump
Updated by okurz about 1 year ago
Migration will commence today as per https://suse.slack.com/archives/C04MDKHQE20/p1699007784006209?thread_ts=1698395123.650769&cid=C04MDKHQE20
Announced again to #eng-testing in https://suse.slack.com/archives/C02CANHLANP/p1699009277427629?thread_ts=1698066009.284279&cid=C02CANHLANP
The migration of qam2.suse.de will commence today at 1230Z in about 90m as per https://suse.slack.com/archives/C04MDKHQE20/p1699007784006209?thread_ts=1698395123.650769&cid=C04MDKHQE20
Updated by jbaier_cz about 1 year ago
The machine is migrated and DNS is changed: https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/4347; there were some minor issues with accessing some of the external resources (rufus, l3support) but that was solved. I still see some issue with template generator, will investigate.
Updated by jbaier_cz about 1 year ago
And now we are again facing #92686, the other side of the NFS mount needs to be updated.
Updated by jbaier_cz about 1 year ago
Issue solved (context for next time: https://suse.slack.com/archives/C029APBKLGK/p1699022608671839). Now we only have problems in the qem-bot pipeline; I guess either the DNS records are not updated yet or there is another firewall problem.
Updated by okurz about 1 year ago
thank you for your good work so far. Next steps pending reaction from SUSE-IT firewall admins, i.e. lhaleplidis to enable communication from gitlab CI runners to qam.suse.de as discussed in https://suse.slack.com/archives/C04MDKHQE20/p1699030210024579?thread_ts=1698395123.650769&cid=C04MDKHQE20
(Oliver Kurz) I read up to here, thanks for the work so far. So I understand NFS mounts are fine. And gitlab runners can't reach qam.suse.de yet so that needs firewall enablement?
(Jiri Novak) yes, i pinged lazaros also in pm
(Lazaros Haleplidis) here sorry on a meeting with US, let me read up
(Lazaros Haleplidis) ok, fixed, can you try once more please?
(Jan Baier) so far I do not see a change
(Jan Baier) problem still persists, see https://gitlab.suse.de/jbaier_cz/ci-test/-/jobs/1954839
(Jan Baier) btw. how can we find out which rules are currently in effect? Is there like a repo with configuration?
Updated by okurz about 1 year ago
- Copied to action #139130: Migration of openqa-service to PRG2 size:M added
Updated by jbaier_cz about 1 year ago
I believe we are good here and we made it just in time before HackWeek :)
Updated by okurz about 1 year ago
- Due date deleted (2023-11-17)
- Status changed from In Progress to Resolved
Agreed, https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs looks good and I found no more related alerts.