action #120831

closed

coordination #121876: [epic] Handle openQA review failures in Yam squad - SLE 15 SP5

[Research: 16h] Research recent iscsi failures

Added by JERiveraMoya over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2022-11-22
Due date:
% Done: 0%
Estimated time:

Description

Motivation

We need to tackle the recent iscsi failures.
Apparently this is not a YaST bug, but we need to investigate the other components involved.
These are the failures seen so far:
https://openqa.suse.de/tests/9982052#step/iscsi_configuration/11
https://openqa.suse.de/tests/10021483#step/ibft/40

Acceptance criteria

AC1: Collect the failures in this ticket
AC2: Try to isolate the issue: is it in the product or in the qemu/worker configuration?
AC3: File the corresponding tickets/bugs according to the results


Files

iscsi-error.png (45.6 KB) coolgw, 2022-11-24 06:19

Related issues 2 (1 open, 1 closed)

Related to qe-yam - action #120714: [Research 24h] Investigate test fails in iscsi_client with wrong exit code (Resolved, coolgw, 2022-11-18)
Related to openQA Infrastructure - action #121507: Iscsi issue on OSD worker (New, 2022-12-06)

Actions #1

Updated by JERiveraMoya over 1 year ago

  • Description updated (diff)
Actions #2

Updated by JERiveraMoya over 1 year ago

  • Tags deleted (qe-yast-refinement)
Actions #3

Updated by JERiveraMoya over 1 year ago

  • Related to action #120714: [Research 24h] Investigate test fails in iscsi_client with wrong exit code added
Actions #4

Updated by coolgw over 1 year ago

Currently we see two kinds of error.
1) lsscsi cannot find the remote node. Example failure: https://openqa.suse.de/tests/9982052#step/iscsi_configuration/12

Trying to discover the remote target with "iscsiadm --mode discovery --op update --type sendtargets --portal 10.137.10.6" triggers the error (see the attached picture, iscsi-error.png).

scsi_error_handler is called, and scsi_try_target_reset also shows up in the call traces (see the log below).

I suspect something goes wrong during SCSI initialization. The suspected error can be found in this log excerpt:
2022-11-18T07:30:20.957540+00:00 install kernel: [ 151.153851][ C0] Call Trace:
2022-11-18T07:30:20.957561+00:00 install kernel: [ 151.153860][ C0]
2022-11-18T07:30:20.957587+00:00 install kernel: [ 151.153862][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.957625+00:00 install kernel: [ 151.153866][ C0] ? __switch_to_asm+0x42/0x80
2022-11-18T07:30:20.957647+00:00 install kernel: [ 151.153878][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957669+00:00 install kernel: [ 151.153907][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.957691+00:00 install kernel: [ 151.153912][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957713+00:00 install kernel: [ 151.153944][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957757+00:00 install kernel: [ 151.153961][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.957779+00:00 install kernel: [ 151.153965][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.957800+00:00 install kernel: [ 151.153969][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.957821+00:00 install kernel: [ 151.153975][ C0]
2022-11-18T07:30:20.957842+00:00 install kernel: [ 151.153977][ C0] task:scsi_tmf_0 state:I stack: 0 pid: 220 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.957863+00:00 install kernel: [ 151.153981][ C0] Call Trace:
2022-11-18T07:30:20.957889+00:00 install kernel: [ 151.153983][ C0]
2022-11-18T07:30:20.957927+00:00 install kernel: [ 151.153985][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.957947+00:00 install kernel: [ 151.153990][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.957967+00:00 install kernel: [ 151.153994][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.957988+00:00 install kernel: [ 151.153998][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958008+00:00 install kernel: [ 151.154003][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.958033+00:00 install kernel: [ 151.154007][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.958070+00:00 install kernel: [ 151.154012][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958089+00:00 install kernel: [ 151.154016][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.958109+00:00 install kernel: [ 151.154020][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.958129+00:00 install kernel: [ 151.154025][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.958148+00:00 install kernel: [ 151.154030][ C0]
2022-11-18T07:30:20.958168+00:00 install kernel: [ 151.154033][ C0] task:ata_sff state:I stack: 0 pid: 221 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.958192+00:00 install kernel: [ 151.154037][ C0] Call Trace:
2022-11-18T07:30:20.958212+00:00 install kernel: [ 151.154039][ C0]
2022-11-18T07:30:20.958245+00:00 install kernel: [ 151.154041][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.958265+00:00 install kernel: [ 151.154045][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.958285+00:00 install kernel: [ 151.154049][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.958304+00:00 install kernel: [ 151.154053][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958329+00:00 install kernel: [ 151.154057][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.958349+00:00 install kernel: [ 151.154061][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.958369+00:00 install kernel: [ 151.154066][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958388+00:00 install kernel: [ 151.154071][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.958407+00:00 install kernel: [ 151.154090][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.958427+00:00 install kernel: [ 151.154094][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.958451+00:00 install kernel: [ 151.154100][ C0]
2022-11-18T07:30:20.958472+00:00 install kernel: [ 151.154101][ C0] task:scsi_eh_1 state:S stack: 0 pid: 222 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.958491+00:00 install kernel: [ 151.154104][ C0] Call Trace:
2022-11-18T07:30:20.958511+00:00 install kernel: [ 151.154106][ C0]
2022-11-18T07:30:20.958530+00:00 install kernel: [ 151.154107][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.958566+00:00 install kernel: [ 151.154112][ C0] ? __wake_up_common_lock+0x87/0xc0
2022-11-18T07:30:20.958587+00:00 install kernel: [ 151.154116][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962133+00:00 install kernel: [ 151.154131][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.962165+00:00 install kernel: [ 151.154135][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962187+00:00 install kernel: [ 151.154152][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962208+00:00 install kernel: [ 151.154168][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.962228+00:00 install kernel: [ 151.154171][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.962248+00:00 install kernel: [ 151.154175][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.962278+00:00 install kernel: [ 151.154198][ C0]
2022-11-18T07:30:20.962300+00:00 install kernel: [ 151.154200][ C0] task:scsi_tmf_1 state:I stack: 0 pid: 223 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.962320+00:00 install kernel: [ 151.154203][ C0] Call Trace:
2022-11-18T07:30:20.962339+00:00 install kernel: [ 151.154205][ C0]
2022-11-18T07:30:20.962359+00:00 install kernel: [ 151.154207][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.962379+00:00 install kernel: [ 151.154211][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.962406+00:00 install kernel: [ 151.154215][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.962426+00:00 install kernel: [ 151.154219][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.962446+00:00 install kernel: [ 151.154224][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.962466+00:00 install kernel: [ 151.154228][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.962486+00:00 install kernel: [ 151.154232][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.962506+00:00 install kernel: [ 151.154237][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.962526+00:00 install kernel: [ 151.154240][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.962553+00:00 install kernel: [ 151.154245][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.962574+00:00 install kernel: [ 151.154251][ C0]
2022-11-18T07:30:20.962594+00:00 install kernel: [ 151.154253][ C0] task:scsi_eh_2 state:S stack: 0 pid: 224 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.965681+00:00 install kernel: [ 151.154256][ C0] Call Trace:
2022-11-18T07:30:20.965692+00:00 install kernel: [ 151.154258][ C0]
2022-11-18T07:30:20.965701+00:00 install kernel: [ 151.154260][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.965710+00:00 install kernel: [ 151.154264][ C0] ? __wake_up_common_lock+0x87/0xc0
2022-11-18T07:30:20.965720+00:00 install kernel: [ 151.154268][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965730+00:00 install kernel: [ 151.154284][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.965740+00:00 install kernel: [ 151.154288][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965750+00:00 install kernel: [ 151.154305][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965760+00:00 install kernel: [ 151.154322][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.965769+00:00 install kernel: [ 151.154325][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.965779+00:00 install kernel: [ 151.154329][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.965897+00:00 install kernel: [ 151.154335][ C0]
2022-11-18T07:30:20.965911+00:00 install kernel: [ 151.154337][ C0] task:scsi_tmf_2 state:I stack: 0 pid: 225 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.965921+00:00 install kernel: [ 151.154340][ C0] Call Trace:
2022-11-18T07:30:20.965930+00:00 install kernel: [ 151.154354][ C0]
2022-11-18T07:30:20.965940+00:00 install kernel: [ 151.154374][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.965950+00:00 install kernel: [ 151.154379][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.965960+00:00 install kernel: [ 151.154383][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.965970+00:00 install kernel: [ 151.154387][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.965979+00:00 install kernel: [ 151.154392][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.965989+00:00 install kernel: [ 151.154396][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.965998+00:00 install kernel: [ 151.154401][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.966089+00:00 install kernel: [ 151.154406][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.966153+00:00 install kernel: [ 151.154410][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.966164+00:00 install kernel: [ 151.154414][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.966173+00:00 install kernel: [ 151.154420][ C0]
2022-11-18T07:30:20.966183+00:00 install kernel: [ 151.154421][ C0] task:kworker/u2:3 state:I stack: 0 pid: 226 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.966209+00:00 install kernel: [ 151.154426][ C0] Workqueue: 0x0 (loop7)
2022-11-18T07:30:20.966218+00:00 install kernel: [ 151.154429][ C0] Call Trace:
2022-11-18T07:30:20.966227+00:00 install kernel: [ 151.154436][ C0]
2022-11-18T07:30:20.966237+00:00 install kernel: [ 151.154438][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.966246+00:00 install kernel: [ 151.154444][ C0] ? process_one_work+0x440/0x440
2022-11-18T07:30:20.966256+00:00 install kernel: [ 151.154449][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.966265+00:00 install kernel: [ 151.154453][ C0] worker_thread+0xab/0x3d0
2022-11-18T07:30:20.966355+00:00 install kernel: [ 151.154458][ C0] ? process_one_work+0x440/0x440
2022-11-18T07:30:20.966368+00:00 install kernel: [ 151.154463][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.966377+00:00 install kernel: [ 151.154467][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.966387+00:00 install kernel: [ 151.154471][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.966396+00:00 install kernel: [ 151.154477][ C0]
2022-11-18T07:30:20.966406+00:00 install kernel: [ 151.154479][ C0] task:kworker/0:3 state:I stack: 0 pid: 361 ppid: 2 flags:0x00004000

2) "smartctl -i /dev/sda" fails. Example failure: https://openqa.suse.de/tests/10021483#step/ibft/40
Bug submitted: https://bugzilla.suse.com/show_bug.cgi?id=1203566
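
The task dump quoted under 1) is long, so a small filter can help to see which kernel threads have a given symbol in their stack. The following is only a minimal sketch: it assumes the dump follows the "task:NAME state:..." layout shown above and is fed on stdin, and the function name tasks_with_symbol is hypothetical.

```shell
# Print the name of every task whose dumped stack mentions the given symbol.
# Frames marked with "?" (unreliable frames) are matched as well.
# Trace lines that appear before the first "task:" header are ignored.
tasks_with_symbol() {
    symbol="$1"
    awk -v sym="$symbol" '
        {
            # A "task:NAME" field starts a new task section.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^task:/) { task = substr($i, 6); seen = 0 }
            }
        }
        # Report each task at most once when "symbol+offset" shows up.
        index($0, sym "+") && task != "" && !seen { print task; seen = 1 }
    '
}
```

For example, "tasks_with_symbol scsi_error_handler < messages.log" (file name assumed) would print scsi_eh_1 and scsi_eh_2 for the excerpt above. The first trace in the excerpt is skipped because its "task:" header was cut off.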

Actions #5

Updated by coolgw over 1 year ago

After increasing memory and CPU the first issue seems to be gone, but the second issue can still be seen from time to time.
2 of 4 reruns with 4000M and 2 CPUs failed in the ibft module.

Actions #6

Updated by coolgw over 1 year ago

Command to run manually on the host:

qemu-system-x86_64 -nographic -serial mon:stdio \
-m 4000 \
-vnc :999 \
-enable-kvm \
-boot d \
-cdrom SLE-15-SP5-Online-x86_64-Build42.5-Media1.iso \
-netdev user,id=qanet0 \
-device virtio-net,netdev=qanet0,mac=5e:54:00:12:34:56 \
-kernel /usr/share/qemu/ipxe.lkrn \
-append 'dhcp && echo "abc" && sanhook iscsi:worker6.oqa.suse.de::3260:1:iqn.2016-02.openqa.de:for.openqa'

Actions #7

Updated by coolgw over 1 year ago

for i in {01..40} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10021454 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation _GROUP="wegao-test" ; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation
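
The clone loops in this ticket all share the same shape, so a dry-run helper may be handy for checking the generated commands before scheduling anything. This is a sketch under stated assumptions: the function name clone_job_cmds is hypothetical, only the openqa-clone-job options already used above appear, and the TEST prefix is assumed to be meant literally. (As written in the loop above, "$WEGAO_poo120831_$i" is parsed by the shell as the variable WEGAO_poo120831_ followed by $i, so TEST ends up as just the counter unless that variable is set.)

```shell
# Dry run: print one openqa-clone-job command per repetition.
# Arguments: job id to clone, BUILD name, number of repetitions.
# Pipe the output to "sh" to actually schedule the jobs.
clone_job_cmds() {
    job_id="$1"
    build="$2"
    count="$3"
    i=1
    while [ "$i" -le "$count" ]; do
        n=$(printf '%02d' "$i")   # zero-padded counter, like {01..40}
        printf '%s\n' "openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps $job_id TEST=WEGAO_poo120831_$n BUILD=$build _GROUP=wegao-test"
        i=$((i + 1))
    done
}
```

Calling "clone_job_cmds 10021454 wegao_iscis_investigation 40" prints the 40 commands corresponding to the loop above without running them.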

Actions #9

Updated by coolgw over 1 year ago

Investigate running on the same worker:
for i in {01..05} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10036687 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_same_worker _GROUP="wegao-test" \
WORKER_CLASS="qemu_x86_64,tap,qemu_x86_64_ibft,worker3"; done
Created job #10048535: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048535
Created job #10048536: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048536
Created job #10048537: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048537
Created job #10048538: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048538
Created job #10048539: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048539

Actions #10

Updated by JERiveraMoya over 1 year ago

  • Description updated (diff)
Actions #11

Updated by coolgw over 1 year ago

Tried connecting to a private server instead of the OSD server; an error happens. Bug submitted: https://bugzilla.suse.com/show_bug.cgi?id=1205853
https://openqa.suse.de/tests/10048456#step/perform_installation/2

Actions #12

Updated by coolgw over 1 year ago

Test against private server:
for i in {06..20} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps 10055258 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_private_server _GROUP="wegao-test" WORKER_CLASS="qemu_x86_64_staging,qemu_x86_64,qemu_x86_64_no_tmpfs,qemu_x86_64_ibft,worker2" PAUSE_AT=""; done

https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation_private_server

Actions #13

Updated by coolgw over 1 year ago

Verify the private server, which was built with the openQA salt command:
for i in {01..20} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps 10055258 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_private_server_openqa_salt_command _GROUP="wegao-test" WORKER_CLASS="qemu_x86_64_staging,qemu_x86_64,qemu_x86_64_no_tmpfs,qemu_x86_64_ibft,worker2" PAUSE_AT=""; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation_private_server_openqa_salt_command

Verify https://github.com/os-autoinst/os-autoinst/pull/2219:
for i in {01..30} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10070055 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_use_ip_url _GROUP="wegao-test" ; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation_use_ip_url

Actions #14

Updated by coolgw over 1 year ago

Recheck worker3 after restarting the tgtd service:
openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10070055 BUILD=wegao_iscis_investigation_use_ip_url _GROUP=0 WORKER_CLASS="qemu_x86_64,tap,qemu_x86_64_ibft,worker3"
http://openqa.suse.de/t10075461

The test case fails on worker3 100% of the time.

Actions #16

Updated by coolgw over 1 year ago

Actions #17

Updated by coolgw over 1 year ago

  • Status changed from In Progress to Resolved
Actions #18

Updated by JERiveraMoya over 1 year ago

  • Parent task set to #121876
Actions #19

Updated by openqa_review about 1 year ago

  • Status changed from Resolved to Feedback

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: iscsi_ibft
https://openqa.suse.de/tests/10352695#step/iscsi_configuration/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #20

Updated by JERiveraMoya about 1 year ago

  • Status changed from Feedback to Resolved