action #120831
coordination #121876 (closed): [epic] Handle openQA review failures in Yam squad - SLE 15 SP5
[Research: 16h] Research recent iscsi failures
Description
Motivation
We need to tackle this iscsi issue.
Apparently it is not a YaST bug, but we need to investigate the rest of the components involved.
These are the failures seen so far:
https://openqa.suse.de/tests/9982052#step/iscsi_configuration/11
https://openqa.suse.de/tests/10021483#step/ibft/40
Acceptance criteria
AC1: Collect the failures in this ticket.
AC2: Try to isolate the issue: determine whether it lies in the product or in the qemu/worker configuration.
AC3: File corresponding tickets/bugs according to result
Updated by JERiveraMoya about 2 years ago
- Related to action #120714: [Research 24h] Investigate test fails in iscsi_client with wrong exit code added
Updated by coolgw about 2 years ago
- File iscsi-error.png added
Currently we have two kinds of errors.
1) lsscsi cannot find the remote node. Example failure: https://openqa.suse.de/tests/9982052#step/iscsi_configuration/12
Running "iscsiadm --mode discovery --op update --type sendtargets --portal 10.137.10.6" to get the remote target triggers the error (see the attached picture).
scsi_error_handler is called, and scsi_try_target_reset also shows up in the trace (see the log below).
I suspect something goes wrong during SCSI initialization; the suspected error can be found in the log:
2022-11-18T07:30:20.957540+00:00 install kernel: [ 151.153851][ C0] Call Trace:
2022-11-18T07:30:20.957561+00:00 install kernel: [ 151.153860][ C0]
2022-11-18T07:30:20.957587+00:00 install kernel: [ 151.153862][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.957625+00:00 install kernel: [ 151.153866][ C0] ? __switch_to_asm+0x42/0x80
2022-11-18T07:30:20.957647+00:00 install kernel: [ 151.153878][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957669+00:00 install kernel: [ 151.153907][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.957691+00:00 install kernel: [ 151.153912][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957713+00:00 install kernel: [ 151.153944][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.957757+00:00 install kernel: [ 151.153961][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.957779+00:00 install kernel: [ 151.153965][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.957800+00:00 install kernel: [ 151.153969][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.957821+00:00 install kernel: [ 151.153975][ C0]
2022-11-18T07:30:20.957842+00:00 install kernel: [ 151.153977][ C0] task:scsi_tmf_0 state:I stack: 0 pid: 220 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.957863+00:00 install kernel: [ 151.153981][ C0] Call Trace:
2022-11-18T07:30:20.957889+00:00 install kernel: [ 151.153983][ C0]
2022-11-18T07:30:20.957927+00:00 install kernel: [ 151.153985][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.957947+00:00 install kernel: [ 151.153990][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.957967+00:00 install kernel: [ 151.153994][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.957988+00:00 install kernel: [ 151.153998][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958008+00:00 install kernel: [ 151.154003][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.958033+00:00 install kernel: [ 151.154007][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.958070+00:00 install kernel: [ 151.154012][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958089+00:00 install kernel: [ 151.154016][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.958109+00:00 install kernel: [ 151.154020][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.958129+00:00 install kernel: [ 151.154025][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.958148+00:00 install kernel: [ 151.154030][ C0]
2022-11-18T07:30:20.958168+00:00 install kernel: [ 151.154033][ C0] task:ata_sff state:I stack: 0 pid: 221 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.958192+00:00 install kernel: [ 151.154037][ C0] Call Trace:
2022-11-18T07:30:20.958212+00:00 install kernel: [ 151.154039][ C0]
2022-11-18T07:30:20.958245+00:00 install kernel: [ 151.154041][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.958265+00:00 install kernel: [ 151.154045][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.958285+00:00 install kernel: [ 151.154049][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.958304+00:00 install kernel: [ 151.154053][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958329+00:00 install kernel: [ 151.154057][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.958349+00:00 install kernel: [ 151.154061][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.958369+00:00 install kernel: [ 151.154066][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.958388+00:00 install kernel: [ 151.154071][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.958407+00:00 install kernel: [ 151.154090][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.958427+00:00 install kernel: [ 151.154094][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.958451+00:00 install kernel: [ 151.154100][ C0]
2022-11-18T07:30:20.958472+00:00 install kernel: [ 151.154101][ C0] task:scsi_eh_1 state:S stack: 0 pid: 222 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.958491+00:00 install kernel: [ 151.154104][ C0] Call Trace:
2022-11-18T07:30:20.958511+00:00 install kernel: [ 151.154106][ C0]
2022-11-18T07:30:20.958530+00:00 install kernel: [ 151.154107][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.958566+00:00 install kernel: [ 151.154112][ C0] ? __wake_up_common_lock+0x87/0xc0
2022-11-18T07:30:20.958587+00:00 install kernel: [ 151.154116][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962133+00:00 install kernel: [ 151.154131][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.962165+00:00 install kernel: [ 151.154135][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962187+00:00 install kernel: [ 151.154152][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.962208+00:00 install kernel: [ 151.154168][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.962228+00:00 install kernel: [ 151.154171][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.962248+00:00 install kernel: [ 151.154175][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.962278+00:00 install kernel: [ 151.154198][ C0]
2022-11-18T07:30:20.962300+00:00 install kernel: [ 151.154200][ C0] task:scsi_tmf_1 state:I stack: 0 pid: 223 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.962320+00:00 install kernel: [ 151.154203][ C0] Call Trace:
2022-11-18T07:30:20.962339+00:00 install kernel: [ 151.154205][ C0]
2022-11-18T07:30:20.962359+00:00 install kernel: [ 151.154207][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.962379+00:00 install kernel: [ 151.154211][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.962406+00:00 install kernel: [ 151.154215][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.962426+00:00 install kernel: [ 151.154219][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.962446+00:00 install kernel: [ 151.154224][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.962466+00:00 install kernel: [ 151.154228][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.962486+00:00 install kernel: [ 151.154232][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.962506+00:00 install kernel: [ 151.154237][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.962526+00:00 install kernel: [ 151.154240][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.962553+00:00 install kernel: [ 151.154245][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.962574+00:00 install kernel: [ 151.154251][ C0]
2022-11-18T07:30:20.962594+00:00 install kernel: [ 151.154253][ C0] task:scsi_eh_2 state:S stack: 0 pid: 224 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.965681+00:00 install kernel: [ 151.154256][ C0] Call Trace:
2022-11-18T07:30:20.965692+00:00 install kernel: [ 151.154258][ C0]
2022-11-18T07:30:20.965701+00:00 install kernel: [ 151.154260][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.965710+00:00 install kernel: [ 151.154264][ C0] ? __wake_up_common_lock+0x87/0xc0
2022-11-18T07:30:20.965720+00:00 install kernel: [ 151.154268][ C0] ? scsi_try_target_reset+0x90/0x90 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965730+00:00 install kernel: [ 151.154284][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.965740+00:00 install kernel: [ 151.154288][ C0] scsi_error_handler+0x1da/0x5e0 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965750+00:00 install kernel: [ 151.154305][ C0] ? scsi_eh_get_sense+0x220/0x220 [scsi_mod 60019c3a58cde6bf8f6eb77f4d5743ac62119f27]
2022-11-18T07:30:20.965760+00:00 install kernel: [ 151.154322][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.965769+00:00 install kernel: [ 151.154325][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.965779+00:00 install kernel: [ 151.154329][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.965897+00:00 install kernel: [ 151.154335][ C0]
2022-11-18T07:30:20.965911+00:00 install kernel: [ 151.154337][ C0] task:scsi_tmf_2 state:I stack: 0 pid: 225 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.965921+00:00 install kernel: [ 151.154340][ C0] Call Trace:
2022-11-18T07:30:20.965930+00:00 install kernel: [ 151.154354][ C0]
2022-11-18T07:30:20.965940+00:00 install kernel: [ 151.154374][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.965950+00:00 install kernel: [ 151.154379][ C0] ? set_next_entity+0xdf/0x180
2022-11-18T07:30:20.965960+00:00 install kernel: [ 151.154383][ C0] ? set_next_task_fair+0x6b/0xa0
2022-11-18T07:30:20.965970+00:00 install kernel: [ 151.154387][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.965979+00:00 install kernel: [ 151.154392][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.965989+00:00 install kernel: [ 151.154396][ C0] rescuer_thread+0x2de/0x360
2022-11-18T07:30:20.965998+00:00 install kernel: [ 151.154401][ C0] ? try_to_grab_pending+0x150/0x150
2022-11-18T07:30:20.966089+00:00 install kernel: [ 151.154406][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.966153+00:00 install kernel: [ 151.154410][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.966164+00:00 install kernel: [ 151.154414][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.966173+00:00 install kernel: [ 151.154420][ C0]
2022-11-18T07:30:20.966183+00:00 install kernel: [ 151.154421][ C0] task:kworker/u2:3 state:I stack: 0 pid: 226 ppid: 2 flags:0x00004000
2022-11-18T07:30:20.966209+00:00 install kernel: [ 151.154426][ C0] Workqueue: 0x0 (loop7)
2022-11-18T07:30:20.966218+00:00 install kernel: [ 151.154429][ C0] Call Trace:
2022-11-18T07:30:20.966227+00:00 install kernel: [ 151.154436][ C0]
2022-11-18T07:30:20.966237+00:00 install kernel: [ 151.154438][ C0] __schedule+0x2cd/0x1140
2022-11-18T07:30:20.966246+00:00 install kernel: [ 151.154444][ C0] ? process_one_work+0x440/0x440
2022-11-18T07:30:20.966256+00:00 install kernel: [ 151.154449][ C0] schedule+0x64/0xe0
2022-11-18T07:30:20.966265+00:00 install kernel: [ 151.154453][ C0] worker_thread+0xab/0x3d0
2022-11-18T07:30:20.966355+00:00 install kernel: [ 151.154458][ C0] ? process_one_work+0x440/0x440
2022-11-18T07:30:20.966368+00:00 install kernel: [ 151.154463][ C0] kthread+0x156/0x180
2022-11-18T07:30:20.966377+00:00 install kernel: [ 151.154467][ C0] ? set_kthread_struct+0x50/0x50
2022-11-18T07:30:20.966387+00:00 install kernel: [ 151.154471][ C0] ret_from_fork+0x22/0x30
2022-11-18T07:30:20.966396+00:00 install kernel: [ 151.154477][ C0]
2022-11-18T07:30:20.966406+00:00 install kernel: [ 151.154479][ C0] task:kworker/0:3 state:I stack: 0 pid: 361 ppid: 2 flags:0x00004000
2) smartctl -i /dev/sda fails. Example failure: https://openqa.suse.de/tests/10021483#step/ibft/40
Bug submitted: https://bugzilla.suse.com/show_bug.cgi?id=1203566
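To make the pasted task dump easier to scan, the filter below is a minimal sketch that lists every task whose call trace contains a scsi_error_handler frame, together with the task state. It assumes journal lines formatted like the excerpt above. The function name summarize_scsi_eh_tasks is made up for this example. Frames prefixed with "?" are skipped because the kernel marks those stack entries as unreliable.

```shell
# Sketch: print "<task> <state>" for each task whose trace has a reliable
# scsi_error_handler frame. Reads the files given as arguments, or stdin.
summarize_scsi_eh_tasks() {
    awk '
        # Traces that start before any "task:" line get a placeholder name.
        BEGIN { task = "(unknown)"; state = "?" }
        {
            for (i = 1; i <= NF; i++) {
                # Remember the most recent task name and state.
                if ($i ~ /^task:/)  task  = substr($i, 6)
                if ($i ~ /^state:/) state = substr($i, 7)
                # Report the frame unless it is a "?" (unreliable) entry.
                if ($i ~ /^scsi_error_handler\+/ && $(i - 1) != "?")
                    print task, state
            }
        }
    ' "$@"
}
```

For example, `summarize_scsi_eh_tasks journal.log` (journal.log being a hypothetical file holding the saved log) prints one line per matching task, which makes it quicker to compare a failing run with a passing one.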
Updated by coolgw about 2 years ago
After increasing memory and CPU, the first issue seems to be gone, but the second issue can still be seen from time to time.
2 of 4 reruns with 4000M and 2 CPUs failed in the ibft module.
Updated by coolgw about 2 years ago
Command to run manually on the host:
qemu-system-x86_64 -nographic -serial mon:stdio \
-m 4000 \
-vnc :999 \
-enable-kvm \
-boot d \
-cdrom SLE-15-SP5-Online-x86_64-Build42.5-Media1.iso \
-netdev user,id=qanet0 \
-device virtio-net,netdev=qanet0,mac=5e:54:00:12:34:56 \
-kernel /usr/share/qemu/ipxe.lkrn \
-append 'dhcp && echo "abc" && sanhook iscsi:worker6.oqa.suse.de::3260:1:iqn.2016-02.openqa.de:for.openqa'
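The sanhook argument above follows the iSCSI SAN URI layout that iPXE uses, iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>, with the protocol field left empty. The helper below is a small sketch for assembling such a URI when trying a different target server. The function name iscsi_san_uri and its argument order are made up for this example.

```shell
# Sketch: build an iPXE-style iSCSI SAN URI.
# Arguments: <server> <port> <lun> <target-iqn>
# The protocol field (between server and port) is left empty, as in the
# command above.
iscsi_san_uri() {
    printf 'iscsi:%s::%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Reproduces the URI used in the manual qemu run.
iscsi_san_uri worker6.oqa.suse.de 3260 1 iqn.2016-02.openqa.de:for.openqa
```

Swapping only the first argument is enough to point the same boot line at another iSCSI server while keeping the port, LUN and target name unchanged.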
Updated by coolgw about 2 years ago
for i in {01..40} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10021454 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation _GROUP="wegao-test" ; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation
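A note on the TEST= value in the loop: the shell reads $WEGAO_poo120831_ as a reference to a variable named WEGAO_poo120831_, because underscores and digits are valid in variable names. Assuming that variable was not set where the loop ran (the ticket does not say), TEST would have been just the loop index. The sketch below shows the difference; both function names are made up for this example.

```shell
# Sketch: what "$WEGAO_poo120831_$i" expands to when no variable named
# WEGAO_poo120831_ exists. Runs in a subshell so the caller is not affected.
expand_unbraced() (
    set +u                    # tolerate the unset variable on purpose
    unset WEGAO_poo120831_
    i=$1
    printf '%s\n' "$WEGAO_poo120831_$i"
)

# Same label written as literal text, so it survives expansion.
expand_literal() {
    printf 'WEGAO_poo120831_%s\n' "$1"
}

expand_unbraced 01   # prints: 01
expand_literal 01    # prints: WEGAO_poo120831_01
```

If a readable job name is wanted, passing the label as literal text (or as "${USER}_poo120831_${i}" with braces) keeps the prefix in TEST.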
Updated by JERiveraMoya almost 2 years ago
Updated by coolgw almost 2 years ago
Investigate by running on the same worker:
for i in {01..05} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10036687 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_same_worker _GROUP="wegao-test" \
WORKER_CLASS="qemu_x86_64,tap,qemu_x86_64_ibft,worker3"; done
Created job #10048535: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048535
Created job #10048536: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048536
Created job #10048537: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048537
Created job #10048538: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048538
Created job #10048539: sle-15-SP5-Online-x86_64-Build50.1-iscsi_ibft@64bit -> http://openqa.suse.de/t10048539
Updated by coolgw almost 2 years ago
Tried connecting to a private server instead of the OSD server; an error occurs. Submitted https://bugzilla.suse.com/show_bug.cgi?id=1205853
https://openqa.suse.de/tests/10048456#step/perform_installation/2
Updated by coolgw almost 2 years ago
Test against private server:
for i in {06..20} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps 10055258 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_private_server _GROUP="wegao-test" WORKER_CLASS="qemu_x86_64_staging,qemu_x86_64,qemu_x86_64_no_tmpfs,qemu_x86_64_ibft,worker2" PAUSE_AT=""; done
Updated by coolgw almost 2 years ago
Verify the private server that was built with the openQA salt command:
for i in {01..20} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps 10055258 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_private_server_openqa_salt_command _GROUP="wegao-test" WORKER_CLASS="qemu_x86_64_staging,qemu_x86_64,qemu_x86_64_no_tmpfs,qemu_x86_64_ibft,worker2" PAUSE_AT=""; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation_private_server_openqa_salt_command
Verify https://github.com/os-autoinst/os-autoinst/pull/2219:
for i in {01..30} ; do openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10070055 TEST=$WEGAO_poo120831_$i BUILD=wegao_iscis_investigation_use_ip_url _GROUP="wegao-test" ; done
https://openqa.suse.de/tests/overview?distri=sle&version=15-SP5&build=wegao_iscis_investigation_use_ip_url
Updated by coolgw almost 2 years ago
Recheck worker3 after restarting the tgtd service:
openqa-clone-job --within-instance http://openqa.suse.de --skip-chained-deps \
10070055 BUILD=wegao_iscis_investigation_use_ip_url _GROUP=0 WORKER_CLASS="qemu_x86_64,tap,qemu_x86_64_ibft,worker3"
http://openqa.suse.de/t10075461
The test case fails on worker3 100% of the time.
Updated by coolgw almost 2 years ago
Disable worker3 temporarily:
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/472/diffs
Updated by coolgw almost 2 years ago
- Related to action #121507: Iscsi issue on OSD worker added
Updated by coolgw almost 2 years ago
- Status changed from In Progress to Resolved
Updated by openqa_review almost 2 years ago
- Status changed from Resolved to Feedback
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: iscsi_ibft
https://openqa.suse.de/tests/10352695#step/iscsi_configuration/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by JERiveraMoya almost 2 years ago
- Status changed from Feedback to Resolved