action #112037
closed
[sle][migration][sle15sp5][HA] figure out the differences between "node01.qcow2" and "node02.qcow2"
100%
Description
Figure out the differences between "node01.qcow2" and "node02.qcow2":
All migration jobs in the HA job group run node1 and node2 in parallel mode, each with a different qcow file,
like sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2 Length: 3465412608 (3.2G)
$tools md5sum sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2
b3969a1c9cacee475b03d8c3378775ba sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2
and sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2 Length: 3474456576 (3.2G)
$tools md5sum sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2
be4af945f88e07485073774afba912b6 sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2
After investigation, we found that both qcow files appear to be the same.
[GW]: We need to check further which part is different. How can the difference be checked? @lili, can you check it?
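The two md5sum results above differ, so the images are not byte-identical, although that alone does not say what differs inside them. As a minimal sketch of the byte-level check (the helper names `md5_of_file` and `compare_files` are hypothetical, not part of openQA), the files can be compared by size and digest in Python:

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_files(path_a, path_b):
    """Compare two files by size and MD5 digest (hypothetical helper)."""
    size_a = Path(path_a).stat().st_size
    size_b = Path(path_b).stat().st_size
    md5_a = md5_of_file(path_a)
    md5_b = md5_of_file(path_b)
    return {
        "same_size": size_a == size_b,
        "same_md5": md5_a == md5_b,
        "sizes": (size_a, size_b),
        "md5": (md5_a, md5_b),
    }
```

A byte-level comparison also picks up differences in the qcow2 container itself, so it is only a first check. QEMU's `qemu-img compare` compares the guest-visible image contents instead, which may be more meaningful for disk images.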
Updated by llzhao almost 2 years ago
Here are the investigation summaries:
step-0. Take x86_64 for example (https://openqa.suse.de/tests/8751181#dependencies)
step-1. Compared migration_offline_dvd_sle15sp3_ha_alpha_node01 & migration_offline_dvd_sle15sp3_ha_alpha_node02 json files: no critical difference
step-2. Checked gitlab HA yaml files
https://gitlab.suse.de/qa-css/openqa_ha_sap/-/blob/master/openQA/yaml_job_groups/ha/sle15_sp4_ha.yml
- contains migration_offline_dvd_sle15sp3_ha_alpha_node01 and so on
https://gitlab.suse.de/qa-css/openqa_ha_sap/-/blob/master/openQA/yaml_job_groups/ha/development.yml
- contains ha-alpha-alpha-node01.qcow but not for x86_64
- contains:
sle-15-SP2-Online-x86_64:
- create_hdd_ha_textmode_publish
- ha_alpha_node01_publish
- ha_alpha_node02_publish
- ha_supportserver_publish
step-3. Take x86_64 for example (https://openqa.suse.de/tests/8751181#dependencies), here is the workflow:
create_hdd_ha_textmode_publish:
|- sle-15-SP3-x86_64-ha-alpha-alpha-node01.qcow2
|- migration_offline_dvd_sle15sp3_ha_alpha_node01 (settings are same)
|- migration_verify_node01 (whatever)
|- sle-15-SP3-x86_64-ha-alpha-alpha-node02.qcow2
|- migration_offline_dvd_sle15sp3_ha_alpha_node02 (settings are same)
|- migration_*verify*_node02 (whatever)
step-4. Checked in openQA webui test suites' configurations:
create_hdd_ha_textmode_publish:
ADDONS=
DESKTOP=textmode
HDDSIZEGB=15
INSTALLONLY=1
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
SCC_ADDONS=ha,geo
SCC_DEREGISTER=1
SCC_REGISTER=installation
VIDEOMODE=text
_HDDMODEL=scsi-hd
ha_alpha_node01_publish:
BOOT_HDD_IMAGE=1
CLUSTER_NAME=alpha
DESKTOP=textmode
HA_CLUSTER=1
HA_CLUSTER_DRBD=1
HA_CLUSTER_INIT=yes
HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
HOSTNAME=%CLUSTER_NAME%-node01
INSTALLONLY=1
NICTYPE=tap
PARALLEL_WITH=ha_supportserver_publish
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%-uefi-vars.qcow2
QEMU_DISABLE_SNAPSHOTS=1
UEFI_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
USE_LVMLOCKD=0
USE_SUPPORT_SERVER=1
WORKER_CLASS=tap
_HDDMODEL=scsi-hd
ha_alpha_node02_publish:
BOOT_HDD_IMAGE=1
CLUSTER_NAME=alpha
DESKTOP=textmode
HA_CLUSTER=1
HA_CLUSTER_DRBD=1
HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
HOSTNAME=%CLUSTER_NAME%-node02
INSTALLONLY=1
NICTYPE=tap
PARALLEL_WITH=ha_supportserver_publish
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%-uefi-vars.qcow2
QEMU_DISABLE_SNAPSHOTS=1
UEFI_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
USE_LVMLOCKD=0
USE_SUPPORT_SERVER=1
WORKER_CLASS=tap
_HDDMODEL=scsi-hd
(both have: clvmd/lvmlockd+cluster_md+ocfs2+drbd_passive+xfs+fencing)
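The PUBLISH_HDD_1 templates above explain the "alpha-alpha" in the image names: HOSTNAME itself contains %CLUSTER_NAME%, so the cluster name shows up twice after expansion. The following is only a sketch of such a placeholder expansion, not openQA's actual implementation. The `expand` function is hypothetical, and the DISTRI, VERSION and ARCH values are assumptions taken from the x86_64 image names in this ticket.

```python
import re

# Matches placeholders such as %ARCH% or %CLUSTER_NAME%.
PLACEHOLDER = re.compile(r"%(\w+)%")


def expand(value, settings, max_depth=10):
    """Replace %NAME% placeholders from settings until nothing changes."""
    for _ in range(max_depth):
        expanded = PLACEHOLDER.sub(
            # Unknown placeholders are left untouched.
            lambda match: settings.get(match.group(1), match.group(0)),
            value,
        )
        if expanded == value:
            return expanded
        value = expanded
    raise ValueError("placeholders nested too deeply or circular")


# Assumed values, chosen to match the image names quoted in this ticket.
settings = {
    "DISTRI": "sle",
    "VERSION": "15-SP3",
    "ARCH": "x86_64",
    "CLUSTER_NAME": "alpha",
    "HOSTNAME": "%CLUSTER_NAME%-node01",
}
template = "%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2"

print(expand(template, settings))
# prints sle-15-SP3-x86_64-ha-alpha-alpha-node01.qcow2
```

With HOSTNAME set to %CLUSTER_NAME%-node02 the same template yields the node02 image name, so the two jobs publish to different files even though their templates are identical.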
step-5. Checked the diff of the "Settings"
> diff ha_alpha_node01_publish ha_alpha_node02_publish
6c6
< HA_CLUSTER_INIT=yes
---
> HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
8c8
< HOSTNAME=%CLUSTER_NAME%-node01
---
> HOSTNAME=%CLUSTER_NAME%-node02
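The line-based diff above can also be expressed as a key-by-key comparison, which does not depend on the order of the settings. This is a small sketch using only a subset of the settings listed in step-4. `parse_settings` and `diff_settings` are hypothetical helpers, not openQA functions.

```python
def parse_settings(text):
    """Parse KEY=VALUE lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key] = value
    return settings


def diff_settings(left, right):
    """Return {key: (left_value, right_value)} for keys whose values differ."""
    changes = {}
    for key in sorted(set(left) | set(right)):
        if left.get(key) != right.get(key):
            changes[key] = (left.get(key), right.get(key))
    return changes


# Subset of the settings from step-4, enough to reproduce the diff.
node01 = parse_settings("""
HA_CLUSTER=1
HA_CLUSTER_INIT=yes
HOSTNAME=%CLUSTER_NAME%-node01
""")
node02 = parse_settings("""
HA_CLUSTER=1
HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
HOSTNAME=%CLUSTER_NAME%-node02
""")

for key, (left, right) in diff_settings(node01, node02).items():
    print(f"{key}: {left!r} -> {right!r}")
```

A value of None on one side means the key exists only in the other job, which is how HA_CLUSTER_INIT and HA_CLUSTER_JOIN show up here.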
step-6. From the diff we can see that the difference between "node01.qcow2" and "node02.qcow2" is as follows:
lib # find .| xargs grep -s HA_CLUSTER_INIT
...
./main_common.pm: check_var('HA_CLUSTER_INIT', 'yes') ? loadtest 'ha/ha_cluster_init' : loadtest 'ha/ha_cluster_join';...
That is:
Both are based on the same parent job "create_hdd_ha_textmode_publish":
node01.qcow2: Deploy a cluster with YaST
node02.qcow2: Join a cluster deployed by YaST
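The selection made by the main_common.pm line quoted above can be paraphrased as follows. This is a Python rendering for illustration only, since the real code is Perl. `check_var` here is a simplified stand-in for the openQA helper of the same name, and `select_cluster_module` is a hypothetical name.

```python
def check_var(settings, name, expected):
    """Simplified stand-in: true if the setting exists and equals expected."""
    return settings.get(name) == expected


def select_cluster_module(settings):
    """Mirror the ternary from main_common.pm: init if requested, else join."""
    if check_var(settings, "HA_CLUSTER_INIT", "yes"):
        return "ha/ha_cluster_init"
    return "ha/ha_cluster_join"


node01 = {"HA_CLUSTER_INIT": "yes", "HOSTNAME": "alpha-node01"}
node02 = {"HA_CLUSTER_JOIN": "alpha-node01", "HOSTNAME": "alpha-node02"}

print(select_cluster_module(node01))  # prints ha/ha_cluster_init
print(select_cluster_module(node02))  # prints ha/ha_cluster_join
```

Note that the join module is the fallback: any job without HA_CLUSTER_INIT=yes takes the join path, whether or not HA_CLUSTER_JOIN is set.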
Updated by llzhao almost 2 years ago
Summary:
They are different:
node01.qcow2: Deploy a cluster with YaST
node02.qcow2: Join a cluster deployed by YaST
Updated by llzhao almost 2 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Updated by llzhao almost 2 years ago
- Subject changed from [sle][migration][HA][backlog] figure out the differences between "node01.qcow2" and "node02.qcow2" to [sle][migration][sle15sp5][HA] figure out the differences between "node01.qcow2" and "node02.qcow2"