action #54137

Upgrade osd to a supported Leap version (from 42.3)

Added by okurz 9 months ago. Updated 7 months ago.

Status: Resolved
Start date: 11/07/2019
Priority: High
Due date:
Assignee: okurz
% Done: 0%
Category: -
Target version: -
Duration:

Description

After o3 has been successfully upgraded to Leap 15.1 we can now do the same for osd with good confidence.


Related issues

Related to openQA Infrastructure - action #55652: ** PROBLEM Service Alert: openqa.suse.de/fs_/var/lib/open... Resolved 16/08/2019
Related to openQA Infrastructure - action #55658: [osd] All jobs at midnight were incomplete and restarted ... Resolved 16/08/2019 31/08/2019
Blocks openQA Project - action #53915: switch "travis_test" aka. "openqa_dev" container to a sup... Resolved 07/09/2019
Copied from openQA Infrastructure - action #43976: bring o3 to same OS (or salt) version as workers, e.g. op... Resolved 15/11/2018 05/07/2019
Copied to openQA Infrastructure - action #55607: Upgrade all OSD workers to a supported OS version (e.g. f... Resolved 11/07/2019

History

#1 Updated by okurz 9 months ago

  • Copied from action #43976: bring o3 to same OS (or salt) version as workers, e.g. openSUSE Leap 15.0 added

#2 Updated by okurz 8 months ago

  • Blocks action #53915: switch "travis_test" aka. "openqa_dev" container to a supported base since Leap 42.3 is EOL added

#3 Updated by okurz 8 months ago

  • Copied to action #55607: Upgrade all OSD workers to a supported OS version (e.g. from Leap 42.3 to 15.1) and consistent for all added

#4 Updated by okurz 8 months ago

  • Related to action #55652: ** PROBLEM Service Alert: openqa.suse.de/fs_/var/lib/openqa is WARNING ** - Cleanup of results+logs added

#5 Updated by okurz 8 months ago

  • Related to action #55658: [osd] All jobs at midnight were incomplete and restarted due to cron-based automatic apache restart at 0:00 added

#6 Updated by okurz 8 months ago

  • Status changed from New to In Progress
  • Assignee set to okurz
  • Conducting backup of / volume. On backup.qa as root:
rsync -aHP --one-file-system --exclude=/tmp/ --exclude=/lost+found/ openqa.suse.de:/ /home/backup/osd/root-complete/
  • Then first setting prios for repos and checking that there are no important updates pending:
zypper mr -p 105 8
zypper mr -p 90 4
zypper mr -p 95 3
  • zypper wants to install mariadb, which does not seem to make sense. Reviewing /var/log/zypp/history showed that mariadb-errormessages was installed, so I uninstalled it. zypper still wants to install mariadb, and I don't know how to find out why a package is being pulled in.

  • Checking all config files already needing an update:

for i in $(find /etc/ -name '*.rpm*') ; do vimdiff ${i%.rpm*} $i; done
find /etc/ -name '*.rpm*' | grep -v 'rpm-utils' | xargs rm
zypper rm $(zypper packages --unneeded| awk '/^i/{ print $5 }' ORS=" ")

I repeated this in a loop until the list came out empty. Afterwards I deleted all kernels except the currently booted one.
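
The `${i%.rpm*}` expansion used above strips the shortest trailing `.rpm*` suffix, so `foo.conf.rpmnew` is paired with `foo.conf`. A minimal sketch of the same pairing that reads file names line by line instead of relying on word splitting. The function names `base_config_path` and `list_rpm_leftovers` are made up for illustration and are not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: map a leftover such as foo.conf.rpmnew or
# foo.conf.rpmsave to the config file it belongs to.
base_config_path() {
    printf '%s\n' "${1%.rpm*}"
}

# Hypothetical helper: print "current<TAB>leftover" for every *.rpm* file
# below the given directory (default /etc), skipping rpm-utils paths like
# the cleanup command above does.
list_rpm_leftovers() {
    find "${1:-/etc}" -type f -name '*.rpm*' ! -path '*rpm-utils*' |
    while IFS= read -r f; do
        printf '%s\t%s\n' "$(base_config_path "$f")" "$f"
    done
}
```

Running `list_rpm_leftovers /etc` before deleting anything gives a reviewable list of which files still need a diff.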

  • Deleted some data from different /home/* folders and sent email to each user when I deleted data. All data is still available in the backup

  • Encountered potentially problematic packages during the upgrade, so I conducted a test upgrade in a chroot:

rsync -aHP backup.qa:/home/backup/osd/root-complete/ poo54137_test_upgrade_of_osd/root-complete/ --exclude=home/*
for i in sys proc dev dev/shm dev/pts run ; do mount -o bind /$i poo54137_test_upgrade_of_osd/root-complete/$i; done
chroot /abuild/poo54137_test_upgrade_of_osd/root-complete/
zypper rl postfix
zypper in postfix
mkdir /tmp
rm /var/log
mkdir -p /var/log
zypper dup
zypper rm $(zypper packages --unneeded| awk '/^i/{ print $5 }' ORS=" ")
zypper dup

This seemingly resulted in a loop: zypper dup wants to install "SUSEConnect drm-kmp-default ft2demos rollback-helper", but afterwards they are reported as "unneeded". This may be a limitation of the chroot test environment, or it may be solvable with enough reruns (see above for the reason for looping).
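
The awk filter above picks the fifth whitespace-separated field, which is the package name only as long as the repository column contains no spaces. A small sketch of a variant that splits on the `|` column separator instead. It assumes the usual `S | Repository | Name | Version | Arch` table layout, and `unneeded_names` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `zypper packages --unneeded` output on stdin and
# print the names of installed packages, one per line.
# Assumes the table layout "S | Repository | Name | Version | Arch".
unneeded_names() {
    awk -F'|' '/^i/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim padding around the Name column
        print $3
    }'
}
```

Usage would be `zypper packages --unneeded | unneeded_names`, which also makes it easy to compare the list between two runs and spot packages that keep reappearing.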

Then conducted upgrade with

cd /etc/zypp/repos.d/
sed -i -e 's/42\.3/$releasever/g' *
sed -i -e 's@suse/@@' *
zypper --releasever=15.1 ref
zypper -n --releasever=15.1 dup --auto-agree-with-licenses --replacefiles --download-in-advance

Everything turned out fine.

EDIT: I now understand what happened on OSD itself: I tried to change the repo files, but the salt recipes put back a version without $releasever. Hence zypper tried to install, or rather keep, packages from those repos, resulting in unresolvable dependencies when trying to upgrade -> https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/133
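
To catch this kind of overwrite early, the repo files can be checked for a hard-coded version right before the upgrade. The following is only a sketch: `use_releasever` and `repos_pinned_to_old_release` are hypothetical helper names, and it is safest to try them on a copy of /etc/zypp/repos.d/ first:

```shell
#!/bin/sh
# Hypothetical helper: replace the hard-coded "42.3" with the literal string
# $releasever in every *.repo file of a directory. The single quotes keep the
# shell from expanding $releasever, so zypper resolves it at run time.
use_releasever() {
    for f in "$1"/*.repo; do
        [ -f "$f" ] || continue
        sed -e 's/42\.3/$releasever/g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}

# Hypothetical helper: list repo files that still mention 42.3, e.g. because
# configuration management wrote the old version back.
repos_pinned_to_old_release() {
    grep -l '42\.3' "$1"/*.repo 2>/dev/null
}
```

If `repos_pinned_to_old_release /etc/zypp/repos.d` prints anything after the sed step, something has rewritten the files and the dup should not be started yet.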

  • Plan, or steps to revisit: update the repos so that we can upgrade with "$releasever":
cd /etc/zypp/repos.d/
sed -i -e 's/42\.3/$releasever/g' *
zypper --releasever=15.1 ref
sudo -u geekotest /opt/openqa-scripts/dump-psql && zypper -n --releasever=15.1 dup --auto-agree-with-licenses --replacefiles --download-in-advance
rpmconfigcheck 
for i in $(cat /var/adm/rpmconfigcheck) ; do vimdiff ${i%.rpm*} $i ; done
for i in $(cat /var/adm/rpmconfigcheck) ; do rm $i ; done
reboot

#7 Updated by coolo 8 months ago

going through rpmconfigcheck now - seems to be running though

#8 Updated by okurz 8 months ago

Main upgrade done. I sent an email to the MLs. There seems to be one problem, first reported by @asmorodskyi: isos post ends up with an error, see the testing channel in Rocket.Chat. Jobs are scheduled though.

Also check_mk seems to have vanished. From the vague comments by coolo I assume he has removed some files or checks or packages. https://thruk.suse.de/thruk/cgi-bin/extinfo.cgi?type=2&host=openqa.suse.de&service=Check_MK&backend=7215e#pnp_th2/1566143151/1566233151/0 is reporting a problem about this.

#9 Updated by kbabioch 8 months ago

After having a look at openqa.suse.de I've noticed that Postfix was removed, which also caused the monitoring check to be removed. Instead Exim was installed (but not properly configured), so there are deferred mails in the queue.

I've now installed Postfix again. For any future updates, please make sure to keep Postfix and double-check the configuration in /etc/sysconfig/mail and /etc/sysconfig/postfix.
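
One way to notice such a replacement right after an upgrade is to compare the installed package names against a short list of packages that must stay. This is a rough sketch and `missing_packages` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read installed package names on stdin (one per line)
# and print every required name given as an argument that is not among them.
missing_packages() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

On the host this could be fed with `rpm -qa --qf '%{NAME}\n' | missing_packages postfix`, where empty output means all required packages are still installed.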

#10 Updated by okurz 7 months ago

@kbabioch Thanks for fixing this.

Another problem was reported by @mgriessmeier: the FTP download of the "suse.ins" file no longer behaves as before, newlines are swallowed. I compared the vsftpd.conf from the backup with the current config and found some differences, mainly:

--- vsftpd_old.conf 2019-08-22 15:30:17.964392658 +0200
+++ vsftpd_new.conf 2019-08-22 15:27:02.620202257 +0200
@@ -169,7 +169,7 @@
 # raw file.
 # ASCII mangling is a horrible feature of the protocol.
 ascii_upload_enable=YES
-ascii_download_enable=YES
+#ascii_download_enable=YES
 #
 # Set to NO if you want to disallow the  PASV  method of obtaining a data
 # connection.

I reverted this one line, restarted the service and verified with mgriessmeier that the downloaded file looks fine again. Looks like the "rpmconfigcheck" pass by coolo went wrong?
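
A check like the following could flag this regression after future config merges. It is a minimal sketch: `option_enabled` is a hypothetical helper, and /etc/vsftpd.conf as the config path is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the option is set to the expected
# value on an uncommented line of the config file.
option_enabled() {
    # $1 = config file, $2 = option name, $3 = expected value
    grep -Eq "^[[:space:]]*$2=$3[[:space:]]*\$" "$1"
}
```

For example, `option_enabled /etc/vsftpd.conf ascii_download_enable YES || echo "ASCII downloads are disabled"` would have reported the commented-out line shown in the diff.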

@coolo what's your status on this one and your further plan?

#11 Updated by coolo 7 months ago

All files we need to touch should be in the salt config - and as such it's a slow process. Did you do this for vsftpd?

#12 Updated by okurz 7 months ago

coolo wrote:

All files we need to touch should be in the salt config - and as such it's a slow process. Did you do this for vsftpd?

I guess that's not an honest question, as you probably would have expected to see a merge request already :) Do you suggest supplying the complete file, or patching the base template or example file?

Done with https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/143

#13 Updated by okurz 7 months ago

  • Status changed from In Progress to Resolved

Merged, everything else looks fine. Considered done.
