action #112037

closed

[sle][migration][sle15sp5][HA] figure out the differences between "node01.qcow2" and "node02.qcow2"

Added by llzhao almost 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Spike/Research
Target version:
-
Start date:
2022-06-06
Due date:
% Done:

100%

Estimated time:
40.00 h
Difficulty:

Description

Figure out the differences between "node01.qcow2" and "node02.qcow2":

For all migrations in the HA job group, node1 and node2 run in parallel mode with different qcow2 files,
such as sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2 Length: 3465412608 (3.2G)

$tools md5sum sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2
b3969a1c9cacee475b03d8c3378775ba  sle-15-SP3-aarch64-ha-alpha-alpha-node01.qcow2

and sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2 Length: 3474456576 (3.2G)

$tools md5sum sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2
be4af945f88e07485073774afba912b6  sle-15-SP3-aarch64-ha-alpha-alpha-node02.qcow2

After investigation, we found that both qcow2 files are the same.
[GW]: We need to check further which part is different and how to check the difference. @lili, could you check it?
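
One possible way to approach the "how to check" question is sketched below. It assumes the two images have been downloaded locally; the helper names `same_bytes` and `same_guest_content` are made up for this illustration. A differing md5sum only shows that the files differ byte for byte, whereas `qemu-img compare` compares the disk content a guest would see, so two qcow2 files can have different checksums and still hold the same data.

```shell
#!/bin/sh
# Byte-level check: succeeds only if both files are bit-for-bit identical
# (the same information a matching md5sum would give).
same_bytes() {
    cmp -s "$1" "$2"
}

# Guest-visible check: qemu-img compare exits with 0 when the images hold
# the same guest data and with 1 when they differ.
# Assumption: qemu-img is installed on the machine running the check.
same_guest_content() {
    command -v qemu-img >/dev/null 2>&1 || {
        echo 'qemu-img not found' >&2
        return 2
    }
    qemu-img compare "$1" "$2"
}

# Example usage (the files must exist locally):
#   same_bytes node01.qcow2 node02.qcow2 || echo 'files differ byte for byte'
#   same_guest_content node01.qcow2 node02.qcow2
```

To see which files inside the guests differ, the images would have to be mounted or inspected with a separate tool; that is outside this sketch.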

Actions #1

Updated by llzhao almost 2 years ago

  • Status changed from New to In Progress
Actions #2

Updated by llzhao almost 2 years ago

Here is the investigation summary:

step-0. Take x86_64 for example (https://openqa.suse.de/tests/8751181#dependencies)

step-1. Compared migration_offline_dvd_sle15sp3_ha_alpha_node01 & migration_offline_dvd_sle15sp3_ha_alpha_node02 json files: no critical difference

step-2. Checked gitlab HA yaml files

https://gitlab.suse.de/qa-css/openqa_ha_sap/-/blob/master/openQA/yaml_job_groups/ha/sle15_sp4_ha.yml

  • contains migration_offline_dvd_sle15sp3_ha_alpha_node01 and so on

https://gitlab.suse.de/qa-css/openqa_ha_sap/-/blob/master/openQA/yaml_job_groups/ha/development.yml

  • contains ha-alpha-alpha-node01.qcow but not for x86_64
  • contains: sle-15-SP2-Online-x86_64:
    • create_hdd_ha_textmode_publish
    • ha_alpha_node01_publish
    • ha_alpha_node02_publish
    • ha_supportserver_publish

step-3. Take x86_64 for example (https://openqa.suse.de/tests/8751181#dependencies), here is the workflow:
create_hdd_ha_textmode_publish:
|- sle-15-SP3-x86_64-ha-alpha-alpha-node01.qcow2
   |- migration_offline_dvd_sle15sp3_ha_alpha_node01 (settings are same)
      |- migration_verify_node01 (whatever)
|- sle-15-SP3-x86_64-ha-alpha-alpha-node02.qcow2
   |- migration_offline_dvd_sle15sp3_ha_alpha_node02 (settings are same)
      |- migration_verify_node02 (whatever)

step-4. Checked in openQA webui test suites' configurations:

create_hdd_ha_textmode_publish:

ADDONS=
DESKTOP=textmode
HDDSIZEGB=15
INSTALLONLY=1
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
SCC_ADDONS=ha,geo
SCC_DEREGISTER=1
SCC_REGISTER=installation
VIDEOMODE=text
_HDDMODEL=scsi-hd

ha_alpha_node01_publish:

BOOT_HDD_IMAGE=1
CLUSTER_NAME=alpha
DESKTOP=textmode
HA_CLUSTER=1
HA_CLUSTER_DRBD=1
HA_CLUSTER_INIT=yes
HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
HOSTNAME=%CLUSTER_NAME%-node01
INSTALLONLY=1
NICTYPE=tap
PARALLEL_WITH=ha_supportserver_publish
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%-uefi-vars.qcow2
QEMU_DISABLE_SNAPSHOTS=1
UEFI_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
USE_LVMLOCKD=0
USE_SUPPORT_SERVER=1
WORKER_CLASS=tap
_HDDMODEL=scsi-hd

ha_alpha_node02_publish:

BOOT_HDD_IMAGE=1
CLUSTER_NAME=alpha
DESKTOP=textmode
HA_CLUSTER=1
HA_CLUSTER_DRBD=1
HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
HDD_1=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV.qcow2
HOSTNAME=%CLUSTER_NAME%-node02
INSTALLONLY=1
NICTYPE=tap
PARALLEL_WITH=ha_supportserver_publish
PUBLISH_HDD_1=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2
PUBLISH_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%-uefi-vars.qcow2
QEMU_DISABLE_SNAPSHOTS=1
UEFI_PFLASH_VARS=%DISTRI%-%VERSION%-%ARCH%-Build%BUILD%-HA-BV-uefi-vars.qcow2
USE_LVMLOCKD=0
USE_SUPPORT_SERVER=1
WORKER_CLASS=tap
_HDDMODEL=scsi-hd

(both have: clvmd/lvmlockd+cluster_md+ocfs2+drbd_passive+xfs+fencing)
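
The PUBLISH_HDD_1 template above explains why the image names contain "alpha-alpha": HOSTNAME itself contains %CLUSTER_NAME%. The following is a minimal sketch of that substitution, not openQA's actual implementation. The helper `expand_placeholders` is hypothetical, and the values for DISTRI, VERSION and ARCH are assumptions inferred from the x86_64 file name in step-3.

```shell
#!/bin/sh
# CLUSTER_NAME and HOSTNAME are taken from ha_alpha_node01_publish.
# DISTRI, VERSION and ARCH are assumed values inferred from the file name.
SETTINGS='DISTRI=sle
VERSION=15-SP3
ARCH=x86_64
CLUSTER_NAME=alpha
HOSTNAME=%CLUSTER_NAME%-node01'

# Hypothetical helper: replace %KEY% placeholders with the values above.
expand_placeholders() (
    text=$1
    pass=0
    # Repeat a few times because a value (HOSTNAME) may contain a placeholder.
    while [ "$pass" -lt 5 ]; do
        before=$text
        while IFS='=' read -r key value; do
            text=$(printf '%s\n' "$text" | sed "s|%${key}%|${value}|g")
        done <<EOF
$SETTINGS
EOF
        # Stop as soon as a full pass changes nothing.
        [ "$text" = "$before" ] && break
        pass=$((pass + 1))
    done
    printf '%s\n' "$text"
)

expand_placeholders '%DISTRI%-%VERSION%-%ARCH%-ha-%CLUSTER_NAME%-%HOSTNAME%.qcow2'
# prints: sle-15-SP3-x86_64-ha-alpha-alpha-node01.qcow2
```

With HOSTNAME=%CLUSTER_NAME%-node02 the same template yields the node02 image name, so the two test suites publish to different files even though they boot from the same HDD_1.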

step-5. Checked the diff of the "Settings"

> diff ha_alpha_node01_publish ha_alpha_node02_publish
6c6
< HA_CLUSTER_INIT=yes
---
> HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
8c8
< HOSTNAME=%CLUSTER_NAME%-node01
---
> HOSTNAME=%CLUSTER_NAME%-node02
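
A comparison like the one above can be made independent of line order by sorting both files first. The sketch below assumes the settings were saved to two plain-text files. The helper `settings_diff` is hypothetical, and the demo files contain only a subset of the settings listed in step-4.

```shell
#!/bin/sh
# Hypothetical helper: print only the settings that differ between two
# files, ignoring the order in which the settings are listed.
settings_diff() (
    tmp=$(mktemp -d) || exit 1
    sort "$1" > "$tmp/a"
    sort "$2" > "$tmp/b"
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$tmp/a" "$tmp/b" | grep '^[<>]'
    rm -rf "$tmp"
)

# Demo input: a subset of the two test suites' settings.
demo=$(mktemp -d)
cat > "$demo/node01" <<'EOF'
CLUSTER_NAME=alpha
HA_CLUSTER=1
HA_CLUSTER_INIT=yes
HOSTNAME=%CLUSTER_NAME%-node01
EOF
cat > "$demo/node02" <<'EOF'
HOSTNAME=%CLUSTER_NAME%-node02
CLUSTER_NAME=alpha
HA_CLUSTER=1
HA_CLUSTER_JOIN=%CLUSTER_NAME%-node01
EOF

# Lines starting with "<" exist only in node01, ">" only in node02.
settings_diff "$demo/node01" "$demo/node02"
```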

step-6. From the diff we can see that the differences between "node01.qcow2" and "node02.qcow2" are as follows:

lib # find .| xargs grep -s HA_CLUSTER_INIT
...
./main_common.pm:        check_var('HA_CLUSTER_INIT', 'yes') ? loadtest 'ha/ha_cluster_init' : loadtest 'ha/ha_cluster_join';...

That is:

Both are based on the same parent job "create_hdd_ha_textmode_publish":
node01.qcow2: Deploy a cluster with YaST
node02.qcow2: Join a cluster deployed by YaST
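
The selection made by the ternary in main_common.pm can be restated as a small shell sketch. This is only an illustration of the decision, not the openQA code. The function name `select_cluster_test` is hypothetical, and it assumes that HA_CLUSTER_INIT is simply unset for node02, as in the settings shown in step-4.

```shell
#!/bin/sh
# Hypothetical shell rendering of the Perl ternary:
# HA_CLUSTER_INIT=yes loads the init test, anything else loads the join test.
select_cluster_test() {
    if [ "$1" = "yes" ]; then
        echo 'ha/ha_cluster_init'
    else
        echo 'ha/ha_cluster_join'
    fi
}

select_cluster_test yes   # node01 (HA_CLUSTER_INIT=yes): prints ha/ha_cluster_init
select_cluster_test ''    # node02 (variable not set):    prints ha/ha_cluster_join
```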
Actions #3

Updated by llzhao almost 2 years ago

Summary:
They are different:
node01.qcow2: Deploy a cluster with YaST
node02.qcow2: Join a cluster deployed by YaST

Actions #4

Updated by llzhao almost 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100
Actions #5

Updated by llzhao almost 2 years ago

  • Subject changed from [sle][migration][HA][backlog] figure out the differences between "node01.qcow2" and "node02.qcow2" to [sle][migration][sle15sp5][HA] figure out the differences between "node01.qcow2" and "node02.qcow2"
Actions #6

Updated by coolgw over 1 year ago

  • Estimated time set to 40.00 h