action #44078

openQA Project - coordination #103947: [saga][epic] Scale up: Future proof backup of o3+osd

openQA Project - coordination #102710: [epic] Improve our backup

Implement proper backups for o3 size:M

Added by okurz about 3 years ago. Updated about 1 month ago.

Status: Resolved
Priority: Low
Assignee:
Target version:
Start date: 2018-11-20
Due date:
% Done: 0%
Estimated time:

Description

Motivation

We should find a backup space for o3, e.g. 4TB for test data/assets

Acceptance criteria

  • AC1: Test assets are backed up and can be restored
  • AC2: Test results are backed up and can be restored
  • AC3: Screenshots are backed up and can be restored

Suggestions

  • Use storage.qa.suse.de
  • We already have database backups elsewhere
  • Consider snapshots (if available) or rsync

Out of scope

Further details


Related issues

Related to openQA Infrastructure - action #69577: Handle installation of the new "Storage Server" (Resolved, 2020-08-04)

Related to openQA Infrastructure - action #88546: Make use of the new "Storage Server", e.g. complete OSD backup (Resolved)

Related to openQA Infrastructure - action #96269: Define what a "complete OSD backup" should or can include (Resolved, 2021-07-29)

Copied to openQA Infrastructure - action #94015: proper backup for osd (Resolved)

Copied to openQA Infrastructure - action #102713: Implement proper backups for o3 - list of installed packages (Resolved)

History

#2 Updated by okurz almost 3 years ago

As o3 is in a special network I recommend rsnapshot on backup.qa.suse.de syncing the data from o3.

#3 Updated by okurz almost 3 years ago

#4 Updated by okurz almost 3 years ago

  • Status changed from New to In Progress
  • Assignee set to okurz

Talked with tbro. 4TB is a problem. Anything in the range of up to 100GB is no problem to have, using space on old netapp, on request. However, I think then we can just as easily go with backup.qa which is easier for us given that we have full control of that machine and can trigger backups from there as suggested in #44078#note-2

#5 Updated by okurz almost 3 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (okurz)

Unassigning again as preparation for longer absence. I should really not leave tickets assigned to me dangling "In Progress" :)

#7 Updated by okurz almost 3 years ago

Automatic backup for the o3 webui host introduced with https://gitlab.suse.de/okurz/backup-server-salt/tree/master/rsnapshot covering so far /etc and the SQL database dumps. As next steps I recommend to save a bit more if feasible from /var/lib/openqa as well as from the workers which are transactional-servers. Maybe as well just /etc plus an autoyast profile.

#8 Updated by okurz over 1 year ago

  • Priority changed from Normal to Low
  • Target version set to Ready

#9 Updated by okurz over 1 year ago

  • Related to action #69577: Handle installation of the new "Storage Server" added

#10 Updated by okurz over 1 year ago

  • Status changed from Workable to Blocked
  • Assignee set to okurz

waiting for #69577 or #41918 first

#11 Updated by okurz 7 months ago

#12 Updated by okurz 7 months ago

  • Status changed from Blocked to New
  • Assignee deleted (okurz)

storage.qa.suse.de in place and usable

#13 Updated by cdywan 6 months ago

  • Subject changed from proper backup for o3 to Implement proper backups for o3 size:M
  • Description updated (diff)
  • Status changed from New to Workable

#14 Updated by okurz 6 months ago

  • Related to action #88546: Make use of the new "Storage Server", e.g. complete OSD backup added

#15 Updated by okurz 6 months ago

  • Related to action #96269: Define what a "complete OSD backup" should or can include added

#16 Updated by okurz 5 months ago

  • Description updated (diff)

#17 Updated by okurz 5 months ago

  • Description updated (diff)

#18 Updated by okurz 5 months ago

  • Description updated (diff)

#19 Updated by cdywan 5 months ago

I'm taking it now. Discussing it in the weekly I feel like it might be easier to treat it as a spike and propose something, and then see if we're happy with that :-D

#20 Updated by cdywan 5 months ago

  • Assignee set to cdywan

#21 Updated by cdywan 5 months ago

  • AC4: List of installed packages is backed up for reference
$ rpm -qa
[...]
$ zypper in gnome-maps-40.1-1.2.x86_64
[...]
Loading repository data...
Reading installed packages...
'gnome-maps.x86_64 = 40.1-1.2' is already installed.
There is an update candidate 'gnome-maps-40.4-1.1.x86_64' for 'gnome-maps-40.1-1.2.x86_64', but it does not match the specified version, architecture, or repository.
$ rpm -qa > packages.txt; sudo zypper in -D $(sed 's/-[0-9].*//' packages.txt)
[...]
$ zypper -x se --type=package -i
[...]
$ pip3 install --user yq
[...]
$ zypper -x se --type=package -i > packages.xml
$ sudo zypper in -D $(~/.local/bin/xq '.stream|."search-result"|."solvable-list"|.solvable|.[]|."@name"' packages.xml | sed 's/"//g')

Proof of concept using xq to process XML output from zypper seems to work for (re)storing the list of installed packages.
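
For the version-stripping step in the proof of concept, here is a minimal sketch, assuming every entry follows the usual name-version-release.arch layout of `rpm -qa` output. The helper name `nvra_to_name` is hypothetical and not part of the ticket. Since neither the version nor the release of an RPM may contain a hyphen, dropping the last two hyphen-separated fields leaves the package name, even when the name itself contains hyphens.

```shell
# Hypothetical helper: reduce "name-version-release.arch" lines read from
# stdin to the bare package name by removing the last two hyphen-separated
# fields (version and release.arch).
nvra_to_name() {
    sed -E 's/-[^-]+-[^-]+$//'
}

printf '%s\n' gnome-maps-40.1-1.2.x86_64 | nvra_to_name   # prints: gnome-maps
```

If only the names are needed, `rpm -qa --qf '%{NAME}\n'` prints them directly and avoids the parsing step.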

#22 Updated by okurz 5 months ago

I wasn't aware about "yq", sounds great! But why do you care about this? rpm -qa should suffice and we don't want to recover systems from this list, just have it ready as a reference. I see now that in https://progress.opensuse.org/journals/427051/diff?detail_id=404851 you added the "list of packages". As this is a rather short document I suggest to just include it in the existing rsnapshot config for backup.qa.suse.de, see https://gitlab.suse.de/qa-sle/backup-server-salt/-/blob/master/rsnapshot/rsnapshot.conf#L27 for how we call commands to back up SQL. I suggest you either split the part about "list of packages" out into another ticket or just do it, so that you get to the more demanding/interesting parts of screenshots+assets+results, for which we should use storage.qa.suse.de as target.

#23 Updated by cdywan 5 months ago

  • Assignee deleted (cdywan)

okurz wrote:

I wasn't aware about "yq", sounds great! But why do you care about this? rpm -qa should suffice and we don't want to recover systems from this list, just have it ready as a reference.

Like I already said on the call, "proper backups" includes being able to recover from said backups in my book. And the other ACs mention that as well. I don't know what "a reference" means in this context. I'll unassign since it seems we're still unclear on the goal of the ticket.

#24 Updated by mkittler 2 months ago

  • Assignee set to mkittler
  • Target version deleted (Ready)

One simple option would be a rsnapshot setup like on https://gitlab.suse.de/qa-sle/backup-server-salt/-/tree/master/rsnapshot. It would be simple in the sense that we wouldn't need to invent something new and we wouldn't rely on specific filesystem features (other than hard links).

Another option would be creating a small script which runs on storage.qa.suse.de and utilizes btrfs: It would make a new btrfs subvolume/snapshot to store the previous state and invoke rsync to get in sync with o3. We'd run this script via cron or a systemd timer, similar to rsnapshot. This shouldn't be hard to do, although I'd need to figure out the required btrfs commands first.

#25 Updated by mkittler 2 months ago

  • Target version set to Ready

#26 Updated by mkittler 2 months ago

  • Target version deleted (Ready)

I've played around with btrfs commands for the 2nd option. I suppose the following commands would do it.

First time setup:

mkdir -p /storage/o3-backup/snapshots
btrfs subvolume create /storage/o3-backup/fs-tree

The actual backup and snapshotting:

mkdir -p /storage/o3-backup/fs-tree/var/lib/openqa
rsync -aHP root@o3:/var/lib/openqa/ /storage/o3-backup/fs-tree/var/lib/openqa
btrfs subvolume snapshot -r "/storage/o3-backup/fs-tree" "/storage/o3-backup/snapshots/$(date --iso-8601=date)"

Cleanup of old snapshots:

cd /storage/o3-backup/snapshots
for snapshot in *; do test "$(( $(date '+%s') - $(date --date="$snapshot" '+%s') ))" -ge "$snapshot_retention_in_seconds" && btrfs subvolume delete "$snapshot"; done

Maybe I'm reinventing the wheel here but I suppose it doesn't need to be more complicated than this. Note that we cannot use btrfs send/receive because the fs on o3 isn't using btrfs.
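
The cleanup loop above relies on `$snapshot_retention_in_seconds` being set beforehand. As a minimal sketch of the age check on its own, assuming GNU date and snapshot directories named after the output of `date --iso-8601=date`: the function name `snapshot_expired` and the 30-day retention are hypothetical, and the loop only prints what it would delete instead of calling `btrfs subvolume delete`.

```shell
# Hypothetical helper, assuming GNU date and snapshot names like 2021-10-01.
# Usage: snapshot_expired NAME RETENTION_SECONDS [NOW_EPOCH]
# Returns 0 if the snapshot is at least RETENTION_SECONDS old.
snapshot_expired() {
    # A name that does not parse as a date is never reported as expired.
    se_taken=$(date -u --date="$1" '+%s') || return 2
    se_now=${3:-$(date -u '+%s')}
    [ $((se_now - se_taken)) -ge "$2" ]
}

# Dry run: print which snapshot names would be deleted.
snapshot_retention_in_seconds=$((30 * 24 * 60 * 60))   # assumed value: 30 days
for snapshot in 2000-01-01 "$(date -u --iso-8601=date)"; do
    if snapshot_expired "$snapshot" "$snapshot_retention_in_seconds"; then
        echo "would delete: $snapshot"
    fi
done
```

Keeping the check in a function makes it possible to test the retention logic with a fixed "now" before pointing it at real subvolumes.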

#27 Updated by mkittler 2 months ago

  • Target version set to Ready

#28 Updated by okurz 2 months ago

mkittler wrote:

Maybe I'm reinventing the wheel here

I am sure you are :) Your approach sounds nice but I am sure there is something we overlook and sooner or later we want a true backup application. I suggest to stay with rsnapshot but if you want something more fancy maybe https://github.com/borgbackup/borg sounds better to you? Regarding btrfs snapshots I think they are great for snapshotting system drives (and taking backups from these snapshots) but not for storing the backups in snapshots.

#29 Updated by mkittler 2 months ago

  • Status changed from Workable to In Progress

I've checked available backup tools and we've discussed them in the chat. For the sake of simplicity we decided to stick with rsnapshot (which we're already using on backup.qa.suse.de). We couldn't benefit much from the features of borg and restic anyway, and btrbk isn't an option as our source file system doesn't use btrfs.

Draft SR for configuration changes: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/612

I've also already started a backup manually and it is still running right now. Files will be stored under /storage/rsnapshot (which is a btrfs subvolume and has compression enabled), e.g. /storage/rsnapshot/alpha.0/openqa.opensuse.org/root/var/lib/openqa/….

#30 Updated by mkittler 2 months ago

Unfortunately the backup failed with rsync error code 12. It had copied 181 GiB before. Maybe the connection was interrupted in the middle. I'll try again.

I'm also currently checking the size of /var/lib/openqa on o3 to have some value to compare the backup against (to check for completeness). This shows that simply walking through this huge filesystem tree takes a while. I suppose we'll have to consider this when thinking about the frequency we'd like to run rsnapshot.

#31 Updated by mkittler 2 months ago

The backup I've started this morning is still running (in a tmux session of user martchus). Let's see how long it is going to take.

By the way, we're dealing with roughly 6 TiB of data:

martchus@ariel:~> df -h /var/lib/openqa /var/lib/openqa/share
Filesystem              Size  Used Avail Use% Mounted on
/dev/vdb1               5.0T  3.5T  1.6T  69% /var/lib/openqa
/dev/mapper/vg0-assets  4.0T  3.2T  869G  79% /var/lib/openqa/share

(Not everything on /dev/vdb1 is under /var/lib/openqa, though.)

#32 Updated by mkittler 2 months ago

The backup I started yesterday is still running. It is still filling up /storage; we're currently at:

martchus@storage:~> sudo btrfs filesystem df /storage
Data, RAID1: total=486.00GiB, used=485.83GiB
System, RAID1: total=8.00MiB, used=96.00KiB
Metadata, RAID1: total=21.00GiB, used=20.30GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

#33 Updated by mkittler 2 months ago

The backup is still ongoing:

martchus@storage:~> sudo btrfs filesystem df /storage
Data, RAID1: total=1.27TiB, used=1.27TiB
System, RAID1: total=8.00MiB, used=208.00KiB
Metadata, RAID1: total=52.00GiB, used=50.73GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
martchus@storage:~> sudo btrfs filesystem df /storage
Data, RAID1: total=1.28TiB, used=1.28TiB
System, RAID1: total=8.00MiB, used=208.00KiB
Metadata, RAID1: total=52.00GiB, used=51.26GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

#34 Updated by mkittler 2 months ago

The backup is still ongoing:

martchus@storage:~> sudo btrfs filesystem df /storage
Data, RAID1: total=7.82TiB, used=7.82TiB
System, RAID1: total=8.00MiB, used=1.09MiB
Metadata, RAID1: total=101.00GiB, used=99.36GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I doubt that we can practically fulfill all the acceptance criteria. This is just getting too big and slow.

#35 Updated by okurz 2 months ago

Agreed. Also the alert "storage: partitions usage (%) alert" triggered. I paused the alert now. Please make sure to re-enable it once the situation is resolved.

So let's reconsider the solutions. We can also achieve the acceptance criteria with the help of SUSE IT and the NetApp storage solution we already have. I suggest to ask EngInfra if and how we would be able to recover a) a complete filesystem content after a potential loss of data and b) selective data in case of partial loss, e.g. if by user error we lose some assets or results.

For the current backup approach I suggest you exclude all non-fixed assets from the backup. Also exclude /assets/tests as all this content should be in git or is considered not important.

#36 Updated by okurz 2 months ago

  • Parent task set to #102710

#37 Updated by okurz 2 months ago

  • Copied to action #102713: Implement proper backups for o3 - list of installed packages added

#38 Updated by okurz 2 months ago

  • Description updated (diff)

I split out the "save list of packages" as proposed and also as the last tasks conducted did not cover this -> #102713

#39 Updated by mkittler 2 months ago

I can exclude /var/lib/openqa/share/tests which would save us 5.4 GiB and /var/lib/openqa/share/factory/*/!(fixed) which would save us 6 TiB. (By the way, /var/lib/openqa/share/factory/*/fixed/* is only 3.17 GiB.)

I suppose I'd remove it already in the current backup.

#40 Updated by mkittler 2 months ago

I'm now running the backup with excludes. Because of a mistake in the config file I had to abort and retry the backup. Unfortunately now rsnapshot didn't include the --link-dest parameter on the second run:

storage:/storage # rsnapshot alpha
echo 22770 > /var/run/rsnapshot.pid 
mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/_delete.22770/ 
mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/ 
mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/ 
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --exclude=/var/lib/openqa/share/tests \
    +rsync_long_args=--exclude=/var/lib/openqa/share/factory/* \
    +rsync_long_args=--include=/var/lib/openqa/share/factory/*/fixed/** \
    --rsh=/usr/bin/ssh \
    --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root \
    root@openqa.opensuse.org:/var/lib/openqa \
    /storage/rsnapshot/alpha.0/openqa.opensuse.org/root 
----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/usr/bin/rsnapshot alpha 
----------------------------------------------------------------------------
ERROR: /usr/bin/rsync syntax or usage error. Does this version of rsync support --link-dest?
touch /storage/rsnapshot/alpha.0/ 
rm -f /var/run/rsnapshot.pid 
/usr/bin/rm -rf /storage/rsnapshot/_delete.22770 
^CERROR: Warning! /usr/bin/rm failed.
ERROR: Error! rm_rf("/storage/rsnapshot/_delete.22770")
storage:/storage # rsnapshot alpha
echo 22883 > /var/run/rsnapshot.pid 
mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/_delete.22883/ 
mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/ 
mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/ 
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --exclude=/var/lib/openqa/share/tests \
    --exclude=/var/lib/openqa/share/factory/* \
    --include=/var/lib/openqa/share/factory/*/fixed/** --rsh=/usr/bin/ssh \
    root@openqa.opensuse.org:/var/lib/openqa \
    /storage/rsnapshot/alpha.0/openqa.opensuse.org/root

#41 Updated by mkittler about 2 months ago

The storage server was rebooted so I'm not sure whether the backup has been fully concluded. I'll give it another manual try. This time it included --link-dest correctly.

I've also been creating an Infra ticket to clear up the questions: https://sd.suse.com/servicedesk/customer/portal/1/SD-67510

#42 Updated by mkittler about 2 months ago

Trying again after running into ERROR: /usr/bin/rsync returned 12 while processing root@openqa.opensuse.org:/var/lib/openqa. If it happens more often (this is now the 2nd time, see #44078#note-30) I could try to create a small rsync wrapper to implement retries.
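
A wrapper of the kind mentioned here could look roughly like the following. This is a sketch of the idea only, not what was deployed; the function name `retry` and the `RETRY_DELAY` variable are hypothetical. According to the rsync man page, exit code 12 means "Error in rsync protocol data stream", which fits an interrupted connection, so simply re-running the same command is a plausible first remedy.

```shell
# Hypothetical wrapper: run a command up to MAX times until it succeeds.
# Usage: retry MAX COMMAND [ARGS...]
# Returns 0 on the first success, otherwise the exit status of the last try.
retry() {
    retry_max=$1
    shift
    retry_attempt=1
    while true; do
        "$@" && return 0
        retry_status=$?
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            return "$retry_status"
        fi
        retry_attempt=$((retry_attempt + 1))
        sleep "${RETRY_DELAY:-30}"   # assumed pause between attempts, in seconds
    done
}

# Example invocation (commented out; host and paths as used earlier in this ticket):
# retry 3 rsync -aHP root@o3:/var/lib/openqa/ /storage/o3-backup/fs-tree/var/lib/openqa
```

Because rsync skips files that are already up to date on the destination, a retried run does not start from scratch.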

#43 Updated by mkittler about 2 months ago

I have now enabled retries via the rsnapshot config. The backup is still running after splitting it into multiple backup commands. It is still at the images directory (which is likely the directory which takes the longest).

#44 Updated by mkittler about 2 months ago

The storage server has been rebooted during the last backup. At least it got past the images. Trying it once more. I'd like to conduct one manual run successfully before configuring anything persistently.


In the meantime Infra replied. There's a backup provided by the storage itself which allows going back 3 days. There are still open questions, though.

#45 Updated by okurz about 2 months ago

I have unpaused the storage disk usage alert again.

#46 Updated by mkittler about 2 months ago

It is still running but now with the separate backup commands one can at least see that it progressed to the testresults:

storage:/home/martchus # rsnapshot alpha
echo 6870 > /var/run/rsnapshot.pid 
mv /storage/rsnapshot/alpha.3/ /storage/rsnapshot/_delete.6870/ 
mv /storage/rsnapshot/alpha.2/ /storage/rsnapshot/alpha.3/ 
mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/alpha.2/ 
mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/ 
mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/ 
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --rsh=/usr/bin/ssh \
    --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root \
    root@openqa.opensuse.org:/var/lib/openqa/images \
    /storage/rsnapshot/alpha.0/openqa.opensuse.org/root 
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --rsh=/usr/bin/ssh \
    --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root \
    root@openqa.opensuse.org:/var/lib/openqa/testresults \
    /storage/rsnapshot/alpha.0/openqa.opensuse.org/root 

#47 Updated by mkittler about 2 months ago

The backup has been completed today. I suppose we can do a backup every three days. So I updated https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/612 to do the backup twice per week ("alpha") and once per month ("beta"). I suppose we can keep three copies of each (but might need to adjust that later if the de-duplication between snapshots has less effect than expected).

#48 Updated by mkittler about 2 months ago

  • Status changed from In Progress to Feedback

So I'm waiting for feedback on the SR and for feedback from Infra.

By the way, that's the current disk usage with one full snapshot, one which has at least all images and one which is also incomplete:

martchus@storage:/srv> sudo btrfs filesystem df /storage 
Data, RAID1: total=3.76TiB, used=3.62TiB
System, RAID1: total=8.00MiB, used=624.00KiB
Metadata, RAID1: total=223.00GiB, used=171.97GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

#49 Updated by mkittler about 1 month ago

The SR has been merged and the rsnapshot/cron config is updated/in-place as expected. cron is also running so let's see whether it works as expected.

About the Infra ticket: I'm actually not quite sure whether I'm following the latest reply.

#50 Updated by okurz about 1 month ago

Sounds great. I added the information about the current backup to https://progress.opensuse.org/projects/openqav3/wiki/Wiki/diff?utf8=%E2%9C%93&version=134&version_from=133&commit=View+differences . This includes both the solution developed here as well as information from https://sd.suse.com/servicedesk/customer/portal/1/SD-67510 . I resolved https://sd.suse.com/servicedesk/customer/portal/1/SD-67510 . As storage.qa.suse.de is monitored as part of our infrastructure we would receive alerts in case either services fail or the storage fills up. The only gap is that we would not immediately receive an alert if rsnapshot itself fails. With systemd services and timers we would catch that as part of the "failed systemd services" alert. Anyway, if you like you can call this ticket "Resolved" and keep any followup tasks for the parent epic.
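
As long as rsnapshot runs from cron, one way to notice a failed run would be to look at the last status line in its log. The following is only a sketch under assumptions: the function name `rsnapshot_last_run_ok` is hypothetical, the log path is passed in because the configured `logfile` location is not stated in this ticket, and the line format is taken from the log excerpts quoted in this ticket.

```shell
# Hypothetical check: succeed only if the most recent status line for the
# given backup level ends in "completed successfully".
# Usage: rsnapshot_last_run_ok LOGFILE LEVEL
# A last line of "started" means the run is still in progress or died, and
# is reported as not OK.
rsnapshot_last_run_ok() {
    rl_last=$(grep -F "/usr/bin/rsnapshot $2: " "$1" | tail -n 1)
    case $rl_last in
        *"completed successfully") return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with two lines in the format seen in this ticket.
demo_log=$(mktemp)
printf '%s\n' \
    '[2021-12-13T00:00:01] /usr/bin/rsnapshot alpha: started' \
    '[2021-12-13T22:20:10] /usr/bin/rsnapshot alpha: completed successfully' \
    > "$demo_log"
if rsnapshot_last_run_ok "$demo_log" alpha; then
    echo "last alpha run OK"
fi
rm -f "$demo_log"
```

A check like this would have to run after the expected end of a backup, since an ongoing run is indistinguishable from one that was killed.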

#51 Updated by mkittler about 1 month ago

I'll have a look at it for a few more days. Currently it already looks good:

martchus@storage:~> systemctl status cron.service
● cron.service - Command Scheduler
     Loaded: loaded (/usr/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2021-12-05 03:32:35 CET; 4 days ago
   Main PID: 2697 (cron)
      Tasks: 7 (limit: 4915)
     CGroup: /system.slice/cron.service
             ├─ 2697 /usr/sbin/cron -n
             ├─ 5474 /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/images /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
             ├─ 5475 /usr/bin/ssh -l root openqa.opensuse.org rsync --server --sender -logDtprRe.iLsfxC --numeric-ids . /var/lib/openqa/images
…

#52 Updated by mkittler about 1 month ago

Looks like the backup was performed successfully and the next backup is already ongoing:

[2021-12-09T00:00:01] echo 22030 > /var/run/rsnapshot.pid
[2021-12-09T00:00:01] mv /storage/rsnapshot/alpha.2/ /storage/rsnapshot/_delete.22030/
[2021-12-09T00:00:01] mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/alpha.2/
[2021-12-09T00:00:01] mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/
[2021-12-09T00:00:01] mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/
[2021-12-09T00:00:01] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/images /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T04:25:06] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/testresults /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T11:33:37] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/share/factory/iso/fixed /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T11:33:37] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/share/factory/hdd/fixed /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T11:34:16] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/osc-plugin-factory /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T11:34:16] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/osc-plugin-factory-new /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-10T11:34:17] touch /storage/rsnapshot/alpha.0/
[2021-12-10T11:34:17] rm -f /var/run/rsnapshot.pid
[2021-12-10T11:34:17] /usr/bin/rm -rf /storage/rsnapshot/_delete.22030
[2021-12-10T11:58:59] /usr/bin/rsnapshot alpha: completed successfully
[2021-12-13T00:00:01] /usr/bin/rsnapshot alpha: started
[2021-12-13T00:00:01] echo 1146 > /var/run/rsnapshot.pid
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.2/ /storage/rsnapshot/_delete.1146/
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/alpha.2/
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/
[2021-12-13T00:00:01] mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/
[2021-12-13T00:00:01] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/images /storage/rsnapshot/alpha.0/openqa.opensuse.org/root

#53 Updated by okurz about 1 month ago

  • Status changed from Feedback to Resolved

We have all three ACs covered with our current configuration in https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/etc/backup/rsnapshot.conf . If we would run out of space or services would fail then our monitoring would alert us.

#54 Updated by mkittler about 1 month ago

Just for the record, remaining logs of the backup which finished yesterday:

[2021-12-13T00:00:01] /usr/bin/rsnapshot alpha: started
[2021-12-13T00:00:01] echo 1146 > /var/run/rsnapshot.pid
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.2/ /storage/rsnapshot/_delete.1146/
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.1/ /storage/rsnapshot/alpha.2/
[2021-12-13T00:00:01] mv /storage/rsnapshot/alpha.0/ /storage/rsnapshot/alpha.1/
[2021-12-13T00:00:01] mkdir -m 0755 -p /storage/rsnapshot/alpha.0/openqa.opensuse.org/
[2021-12-13T00:00:01] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/images /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T13:00:10] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/testresults /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T21:13:01] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/share/factory/iso/fixed /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T21:13:01] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/share/factory/hdd/fixed /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T21:13:02] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/osc-plugin-factory /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T21:13:02] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --hard-links --rsh=/usr/bin/ssh --link-dest=/storage/rsnapshot/alpha.1/openqa.opensuse.org/root root@openqa.opensuse.org:/var/lib/openqa/osc-plugin-factory-new /storage/rsnapshot/alpha.0/openqa.opensuse.org/root
[2021-12-13T21:13:03] touch /storage/rsnapshot/alpha.0/
[2021-12-13T21:13:03] rm -f /var/run/rsnapshot.pid
[2021-12-13T21:13:03] /usr/bin/rm -rf /storage/rsnapshot/_delete.1146
[2021-12-13T22:20:10] /usr/bin/rsnapshot alpha: completed successfully

So it took 22 hours and 20 minutes to run.
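
The duration can be recomputed from the first and last log timestamps. A small sketch, assuming GNU date; the helper name `elapsed_hm` is hypothetical and the timestamps are passed without the surrounding brackets.

```shell
# Hypothetical helper, assuming GNU date: print the time between two log
# timestamps (format 2021-12-13T00:00:01) as whole hours and minutes.
elapsed_hm() {
    # Replace the "T" separator with a space before handing it to date.
    eh_start=$(date -u --date="$(printf '%s' "$1" | tr 'T' ' ')" '+%s') || return 1
    eh_end=$(date -u --date="$(printf '%s' "$2" | tr 'T' ' ')" '+%s') || return 1
    eh_secs=$((eh_end - eh_start))
    echo "$((eh_secs / 3600))h $((eh_secs % 3600 / 60))m"
}

elapsed_hm 2021-12-13T00:00:01 2021-12-13T22:20:10   # prints: 22h 20m
```

Applied to the earlier run quoted in this ticket (2021-12-09T00:00:01 to 2021-12-10T11:58:59) the same helper gives 35h 58m.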
