action #46316

Updated by mloviska over 5 years ago

The iSCSI initiator test takes the following steps:

 * logon to an iSCSI target 
 * make a single partition 
 * make an EXT4 filesystem on that partition 
 * mount that partition 
 * write to a file there 
 * read from the file there 

 We are assuming that these steps are synchronous, but they are not. For example, logging in to a target creates a local disc device, but that takes a moment, since udev actually handles it. And making a single partition causes the kernel to re-read the start of the disc so that it recognizes there is now a partition table where before there was none. In both of these examples, the result arrives asynchronously, so we really need to wait for the desired result before going on. 
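Waiting for the desired result could be sketched as a small polling helper. This is only an illustration in Python, not openQA test API: the name `wait_for_path` and its timeout and interval defaults are made up for the example.

```python
import os
import time


def wait_for_path(path, timeout=10.0, interval=0.2):
    """Poll until `path` exists.

    Returns True as soon as the path shows up, or False if it has not
    appeared once `timeout` seconds have passed. (Hypothetical helper,
    not part of openQA.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The test would call something like `wait_for_path("/dev/sdb1")` after the partitioning step and only then make the filesystem; the device name here is a placeholder, since the real one depends on the test machine.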

 To test this, perhaps you can just add some "sleep" calls in your openQA test to see if timing might be your issue. I'd suggest a "sleep 2" (just to be overly safe) between the "make a partition" step and the "make an EXT4 filesystem" step, since making the EXT4 filesystem seems to be the step that sometimes fails. 

 Lastly, looking back to your initial error text, it looks like the test is trying to restart open-iscsi. 

 
 ``` 
 [    320.908142] systemd[1]: Started Hostname Service. 
 [    326.785627] systemd[1]: Started Open-iSCSI. 
 [    326.871707] systemd[1]: Stopping Open-iSCSI... 
 [    417.006083] systemd[1]: iscsid.service: State 'stop-sigterm' timed out. Killing. 
 [    417.027160] systemd[1]: iscsid.service: Killing process 3113 (iscsid) with signal SIGKILL. 
 [    417.030997] systemd[1]: iscsid.service: Main process exited, code=killed, status=9/KILL 
 [    417.033006] systemd[1]: Stopped Open-iSCSI. 
 [    417.039294] systemd[1]: iscsid.service: Unit entered failed state. 
 [    417.051231] systemd[1]: iscsid.service: Failed with result 'timeout'. 
 [    417.053334] systemd[1]: Started Open-iSCSI. 
 [    432.885258] iscsid[3140]: iscsid: Connection1:0 to [target: iqn.2016-02.de.openqa:132, portal: 10.0.2.1,3260] through [iface: default] is operational now 
 ```  

  
 The "stop" part fails: iscsid does not exit on SIGTERM, so systemd has to send SIGKILL to stop it. The "State 'stop-sigterm' timed out" message is not good. Restarting the service while it is in a failed state then leads to kernel initiator/target communications, in the form of 3 different reconnection attempts. It almost looks like it keeps logging into the target but the target keeps booting it off?  
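For what it's worth, the time systemd waited before killing iscsid can be read straight off the timestamps. Below is a small Python sketch that does the arithmetic on two lines copied from the log above; the helper name `timestamp_of` is made up for the example.

```python
import re

# Two lines copied from the log excerpt above.
LOG = """\
[    326.871707] systemd[1]: Stopping Open-iSCSI...
[    417.006083] systemd[1]: iscsid.service: State 'stop-sigterm' timed out. Killing.
"""

# "[  seconds.micros] message"
LINE_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s+(.*)$")


def timestamp_of(log, needle):
    """Return the timestamp (in seconds) of the first line containing `needle`."""
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m and needle in m.group(2):
            return float(m.group(1))
    raise ValueError(f"no log line contains {needle!r}")


stop_requested = timestamp_of(LOG, "Stopping Open-iSCSI")
killed = timestamp_of(LOG, "'stop-sigterm' timed out")
print(f"{killed - stop_requested:.1f} s between stop request and SIGKILL")
```

That comes out at about 90 seconds, which matches systemd's default stop timeout (assuming the unit does not override it). So iscsid most likely never reacted to the SIGTERM at all; it was not just slow to shut down.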

 As for the test, it would be good to tear it down correctly, so that repeated connections to the target do not cause problems. That means we need to log out of the target after unmounting the disc.
