action #135923

closed

[qe-sap][tools]test fails in hana_install because NFS is too slow - Move NFS to OSD size:M

Added by apappas 5 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Low
Assignee:
Category:
Bugs in existing tests
Target version:
Start date:
2023-09-18
Due date:
% Done:

0%

Estimated time:
Difficulty:
Tags:

Description

Observation

openQA test in scenario sle-15-SP4-SAP-DVD-Updates-x86_64-qam-sles4sap_scc_gnome_hana_cli@64bit-sap-qam fails in
hana_install

Test suite description

Testsuite maintained at https://gitlab.suse.de/qa-maintenance/qam-openqa-yml. Maintainer: qa-css team, QA and QAM for HA/SAP.
Test SAP HANA installation on SLES4SAP.

Reproducible

Fails since (at least) Build 20230915-1

Expected result

Last good: 20230914-1 (or more recent)

Acceptance Criteria:

  • AC1: The commands accessing data on the NFS server do not time out.

Suggestions

  • We can't rely on the NFS share from OSD being mounted on workers, so consider using the NFS server provided by OSD within tests
  • Find out what this "HANA2 directory" contains. Does its content change for each SLE build?
  • Suggest that the ticket reporter ask Eng-Infra whether another VM next to OSD can provide better performance, or whether it shares all resources with OSD and hence would bring no benefit

Further details

Always latest result in this scenario: latest

HANA and NetWeaver tests need their install media either mounted, or mounted and copied over, for the test run to proceed. Until the migration to PRG2, the jobs assigned to the prg_office worker class were connected to qesap-nfs.qa.suse.cz, and the jobs in NUE1 were assigned to a now-retired NFS server in NUE1-SRV2.

Sadly, the connection between PRG_office and PRG2 is not adequate, and all tests that require an NFS server fail consistently due to timeouts.

To fix this, I propose we use the NFS server on OSD so that the data is not bottlenecked by the external connection.

The current size of the data is at a bit less than 500GB.


Related issues 1 (1 open, 0 closed)

Related to openQA Tests - action #135938: [qe-sap] test fails in hana_install copying from NFS server qesap-nfs.qa.suse.cz with timeout (In Progress, acarvajal, 2023-09-18)

Actions #1

Updated by apappas 5 months ago

  • Subject changed from [tools][sap]test fails in hana_install because NFS is too slow - Move NFS to OSD to [sap]test fails in hana_install because NFS is too slow - Move NFS to OSD

Actually the jobs are on the prg_office. Investigating further...

Actions #2

Updated by apappas 5 months ago

I am trying with a connection to nue2 https://openqa.suse.de/tests/12168011#

Actions #3

Updated by okurz 5 months ago

  • Related to action #135938: [qe-sap] test fails in hana_install copying from NFS server qesap-nfs.qa.suse.cz with timeout added

Actions #4

Updated by apappas 5 months ago

  • Subject changed from [sap]test fails in hana_install because NFS is too slow - Move NFS to OSD to [qe-sap][tools]test fails in hana_install because NFS is too slow - Move NFS to OSD

After a two-week investigation, we found that our NFS servers do not have the bandwidth to handle the load imposed by the increased number of concurrent jobs. To address this, I am requesting that at least the HANA2 directory be moved to the OSD NFS server.

The requirements are 351G of space and the ability to host 4 connections per SAP incident and SAP aggregate build.

Actions #5

Updated by apappas 5 months ago

To move on, I need to learn where to put the HANA2 directory. I ran findmnt and couldn't be 100% sure where /var/lib/openqa/share/factory/other/fixed/ is mounted. I refrained from simply copying everything over because I don't want to bring OSD down.
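
For reference, one way to see which mount point holds a given path is to walk up the directory tree until a mount point is reached. This is roughly what `findmnt --target <path>` reports. The sketch below is only an illustration: the helper name `find_mount_point` is hypothetical, and it assumes a POSIX host. It says nothing about how OSD is actually laid out.

```python
import os


def find_mount_point(path: str) -> str:
    """Return the mount point of the filesystem that contains `path`."""
    # Resolve symlinks first so the real location is inspected.
    current = os.path.realpath(path)
    # Walk up until a mount point is found; "/" is always one on POSIX.
    # Components that do not exist are skipped, because ismount() is False for them.
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


if __name__ == "__main__":
    # Path taken from the comment above. On a host where it does not exist,
    # the mount point of the nearest existing ancestor is printed instead.
    print(find_mount_point("/var/lib/openqa/share/factory/other/fixed/"))
```

The function only reads filesystem metadata, so running it does not put any load on the share itself.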

Actions #6

Updated by okurz 5 months ago

  • Assignee set to okurz
  • Target version set to Ready

As you asked in Slack as well, I stated there that we can't rely on the NFS share from OSD being mounted on workers; however, you can use the NFS server provided by OSD within tests. So this HANA2 directory, what does it contain? Does its content change for each SLE build? I suggest you also ask Eng-Infra whether another VM next to OSD can provide better performance, or whether it shares all resources with OSD and hence would bring no benefit.

Actions #7

Updated by okurz 5 months ago

  • Tags set to support
  • Status changed from New to Feedback

As you added [tools] to the subject I will treat this as a "support" ticket so far assuming that I can help you to conduct the actual work :)

Actions #8

Updated by okurz 5 months ago

  • Subject changed from [qe-sap][tools]test fails in hana_install because NFS is too slow - Move NFS to OSD to [qe-sap][tools]test fails in hana_install because NFS is too slow - Move NFS to OSD size:M
  • Description updated (diff)

Actions #9

Updated by okurz 4 months ago

  • Priority changed from Normal to Low
  • Target version changed from Ready to Tools - Next

No response yet, so I assume this has lower priority on the side of qe-sap. Moving to "Next".

Actions #10

Updated by okurz about 1 month ago

  • Status changed from Feedback to Resolved

Still no response. I assume the problem solved itself.
