https://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842019-06-03T09:37:29ZopenSUSE Project Management ToolopenQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2173672019-06-03T09:37:29Zokurzokurz@suse.com
<ul><li><strong>Category</strong> set to <i>Infrastructure</i></li><li><strong>Assignee</strong> set to <i>asmorodskyi</i></li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/23378">@asmorodskyi</a>, it seems you changed the test but the worker does not have "tap", does it?</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2174692019-06-03T12:09:44Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Feedback</i></li><li><strong>Assignee</strong> changed from <i>asmorodskyi</i> to <i>okurz</i></li></ul><pre><code>[03/06/2019 13:57:32] <okurz> riafarov: did *you* add `WORKER_CLASS=qemu_x86_64,tap` to wicked_basic_sut+wicked_basic_ref on o3? this is what the auditlog tells me however that does not sound like it would make any sense when we want to run it on aarch64 for example :)
[03/06/2019 13:58:41] <riafarov> okurz: yes I did that after talking to Anton
[03/06/2019 13:58:55] <riafarov> okurz: it's MM, so it's wrong to run it on aarch64
[03/06/2019 13:59:02] <okurz> riafarov: why?
[03/06/2019 13:59:08] <riafarov> okurz: and we have no idea how it managed to work there
[03/06/2019 13:59:37] <riafarov> okurz: do we have MM setup for arm?
[03/06/2019 14:01:28] <riafarov> okurz: for me it sounds like we should unschedule it for aarch64 then
[03/06/2019 14:01:33] <okurz> riafarov: Depends on what exactly qualifies as "MM setup" :) asmorodskyi and me also talked today and we agreed that probably "wicked_basic" relies on "basic multimachine" only – whatever that means but it works ;) So I will change it back and check that it properly works. Depending on when we add something like "wicked_advanced" we might see
[03/06/2019 14:01:33] <okurz> what's missing
[03/06/2019 14:02:40] <riafarov> okurz: it incompletes every time when executed on the wrong worker
[03/06/2019 14:02:58] <riafarov> okurz: RMs di great job to retrigger it 5 times before it happens
[03/06/2019 14:03:05] <okurz> riafarov: ok, I will check that. 6 days ago it was fine though: https://openqa.opensuse.org/tests/944262#
[03/06/2019 14:03:23] <riafarov> okurz: https://openqa.suse.de/tests/2943106/#step/boot_to_desktop/2
[03/06/2019 14:03:42] <riafarov> okurz: again, no idea how it manages to work
[03/06/2019 14:04:09] <riafarov> okurz: I've changed test suite setting to what is reasonable. If Anton is fine with your changes, feel free to revert
[03/06/2019 14:04:37] <okurz> riafarov: the example you mentioned was running covering openqaworker6+8, maybe an issue with the GRE tunnel which would allow "distributed multi-machine". When we are staying on the same host we should be fine
[03/06/2019 14:06:15] <riafarov> okurz: do not exclude 64bit runs from the equation
[03/06/2019 14:06:31] <riafarov> okurz: openqaworker4 doesn't support the scenario either
[03/06/2019 14:07:14] <riafarov> okurz: or openqaworker1 (do not rememeber which has tap device)
[03/06/2019 14:07:31] <riafarov> okurz: so in case you want to revert, also remove NICTYPE setting from the test suite
[03/06/2019 14:07:50] <riafarov> okurz: https://openqa.opensuse.org/tests/938212# here is failure on 64bit
[03/06/2019 14:08:38] <okurz> riafarov: yes, I get it know. ok, I will remove NICTYPE
</code></pre>
<p>So I set <code>WORKER_CLASS=tap</code> and removed <code>NICTYPE=tap</code> from both test suites "wicked_basic_sut" and "wicked_basic_ref".</p>
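<p>A minimal sketch of why a job can stay in "scheduled" state, assuming that a worker only picks up a job if it offers every class listed in the job's <code>WORKER_CLASS</code>. The function name and the worker class list <code>qemu_aarch64,tap</code> are hypothetical and only illustrate the matching rule:</p>

```shell
# Hypothetical helper, not part of openQA: succeeds only if every
# comma-separated class requested by the job is offered by the worker.
worker_matches() {
    job_classes=$1
    worker_classes=$2
    old_ifs=$IFS
    IFS=,
    # Split the job's class list on commas into positional parameters.
    set -- $job_classes
    IFS=$old_ifs
    for class in "$@"; do
        case ",$worker_classes," in
            *",$class,"*) ;;      # class offered, keep checking
            *) return 1 ;;        # one missing class is enough to reject
        esac
    done
    return 0
}

# Assumed worker classes of an aarch64 tap worker (illustrative only).
for wanted in "qemu_x86_64,tap" "tap"; do
    if worker_matches "$wanted" "qemu_aarch64,tap"; then
        echo "WORKER_CLASS=$wanted: job can be assigned"
    else
        echo "WORKER_CLASS=$wanted: job stays scheduled"
    fi
done
```

<p>Under that assumption <code>WORKER_CLASS=qemu_x86_64,tap</code> can never match an aarch64 worker, while plain <code>WORKER_CLASS=tap</code> can.</p>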
<p>Triggered for testing:</p>
<pre><code>build=20190601; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=$build DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed TEST=wicked_basic_ref,wicked_basic_sut
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [947700, 947701, 947702],
scheduled_product_id => 109314,
}
</code></pre>
<p>so waiting for <a href="https://openqa.opensuse.org/tests/947702">https://openqa.opensuse.org/tests/947702</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2178772019-06-04T09:22:45Zokurzokurz@suse.com
<ul><li><strong>Subject</strong> changed from <i>[aarch64] wicked tests always in schedule state - tap worker required</i> to <i>[aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)</i></li><li><strong>Assignee</strong> changed from <i>okurz</i> to <i>asmorodskyi</i></li></ul><p>SUT failed to reach the parallel node, probably schrödinbug. We (asmorodskyi, riafarov, me) do not know why it could have ever worked as the aarch64 host does not have openvswitch configured which we probably need. asmorodskyi does not plan to support wicked_* in the near future. I have removed the scenarios from the aarch64 Tumbleweed as well as aarch64 Leap 15 for now.</p>
<p><em>However</em> wicked_basic on x86_64 fails with what looks like the same problem: <a href="https://openqa.opensuse.org/tests/948247#step/t01_basic/420" class="external">https://openqa.opensuse.org/tests/948247#step/t01_basic/420</a> so can you please look into that?</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2178952019-06-04T09:28:40Zggardet_armguillaume.gardet@arm.com
<ul></ul><p>Why do you remove it from aarch64? Especially when x86_64 fails in the same way.<br>
Wicked is an important part of the testing of aarch64, so we should keep it. <br>
We may need to configure aarch64 worker properly to support it.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2179972019-06-04T13:35:56Zokurzokurz@suse.com
<ul></ul><p>ggardet_arm wrote:</p>
<blockquote>
<p>We may need to configure aarch64 worker properly to support it.</p>
</blockquote>
<p>Yes, this is what the subject line states.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2180692019-06-04T17:06:49Zokurzokurz@suse.com
<ul><li><strong>Blocked by</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/52559">action #52559</a>: [network] test fails in t01_basic to ping the other node</i> added</li></ul> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2181712019-06-05T07:12:44Zasmorodskyi
<ul></ul><p>While working on this issue keep in mind <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [network] test fails in t08_setup_second_card (Resolved)" href="https://progress.opensuse.org/issues/51635">#51635</a>: a "proper" setup for wicked tests (even the basic one) actually means not just tap0, tap1, tap2, tap3 but also tapX+64 (e.g. tap64, tap65, tap66, tap67), because we are using TWO interfaces in all extended tests and in one test case in wicked basic (t08).</p>
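<p>The numbering can be sketched as follows, assuming that worker instances are counted from 1 and that each additional NIC of a job shifts the tap device number by 64. The function name is hypothetical:</p>

```shell
# Hypothetical helper: tap_device INSTANCE NIC_INDEX
# INSTANCE is the 1-based worker instance (assumption), NIC_INDEX is 0 for
# the first interface of the job, 1 for the second, and so on.
tap_device() {
    echo "tap$(( $1 - 1 + 64 * $2 ))"
}

# Devices needed by worker instances 1 to 4 when a test uses two interfaces.
for instance in 1 2 3 4; do
    echo "instance $instance: $(tap_device "$instance" 0) $(tap_device "$instance" 1)"
done
```

<p>With these assumptions instance 1 uses tap0 and tap64, and instance 4 uses tap3 and tap67.</p>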
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2188912019-06-07T07:33:47Zasmorodskyi
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Workable</i></li></ul><p>At the moment this ticket is a pure OPS task: set up MM for openQA on the aarch64 worker. From the tests' perspective everything is working as expected, so I am signing off from this issue for now. It will wait for a volunteer who dares to set up MM on the o3 aarch64 workers.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2189242019-06-07T08:16:58Zasmorodskyi
<ul><li><strong>Assignee</strong> deleted (<del><i>asmorodskyi</i></del>)</li></ul> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2189272019-06-07T08:22:33Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>Blocked</i></li><li><strong>Assignee</strong> set to <i>okurz</i></li><li><strong>Priority</strong> changed from <i>Urgent</i> to <i>Low</i></li></ul><p>I guess we should live with the fact that the wicked tests were never workable on aarch64 as the setup is incomplete, so I am arguing that the issue should not be "Urgent". Taking it, reducing prio and waiting for the blocker.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2213902019-06-18T12:50:31Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>In Progress</i></li><li><strong>Priority</strong> changed from <i>Low</i> to <i>Normal</i></li></ul><p>ggardet asked me again if we can not do it sooner, fine :)</p>
<p>So I debugged with asmorodskyi what the problem is on openqaworker1 and we found a misconfiguration in the firewall. The default zone was set to "trusted", however masquerading and the bridge were not in "trusted". Fixed that and adjusted the documentation.</p>
<p>I followed <a href="http://open.qa/docs/#_tap_based_network" class="external">http://open.qa/docs/#_tap_based_network</a> to apply the necessary steps on aarch64.o.o<br>
One problem that was probably caused by this is that the live mode could not connect anymore. By temporarily disabling the firewall I could identify this as the culprit.<br>
What seems to have happened is that we explicitly added <code>br1</code> to the zone "external" whereas on openqaworker1 it is "default" which seems to be more permissive and allow the live handler connections.</p>
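<p>For illustration, a sketch of the kind of firewalld calls that put a bridge and masquerading into the same zone. This is not necessarily what was run on openqaworker1 or aarch64.o.o. The helper name is hypothetical and it only prints the commands so they can be reviewed first. "trusted" and "br1" are the values mentioned above:</p>

```shell
# Hypothetical helper: print_fw_plan ZONE BRIDGE
# Prints (does not execute) firewall-cmd calls that would move the bridge
# into the given zone and enable masquerading there.
print_fw_plan() {
    echo "firewall-cmd --permanent --zone=$1 --change-interface=$2"
    echo "firewall-cmd --permanent --zone=$1 --add-masquerade"
    echo "firewall-cmd --reload"
}

print_fw_plan trusted br1
```

<p>Before changing anything, <code>firewall-cmd --get-default-zone</code> and <code>firewall-cmd --get-zone-of-interface=br1</code> show the current state without modifying it.</p>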
<p>to support multi-nic tests:</p>
<pre><code>for i in {64..69}; do ln -s /etc/sysconfig/network/ifcfg-tap{0,$i} ; done
for i in {128..133}; do ln -s /etc/sysconfig/network/ifcfg-tap{0,$i} ; done
</code></pre>
<p>I wonder if it would need to be real files and not symlinks?</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2227492019-06-21T07:52:20Zokurzokurz@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/222749/diff?detail_id=219986">diff</a>)</li><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Blocked</i></li></ul><p>waiting for asmorodskyi to do the debugging in <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: [network] test fails in t01_basic to ping the other node (Resolved)" href="https://progress.opensuse.org/issues/52559">#52559</a> and <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [network] test fails in t08_setup_second_card (Resolved)" href="https://progress.opensuse.org/issues/51635">#51635</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2227582019-06-21T07:52:35Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/51635">action #51635</a>: [network] test fails in t08_setup_second_card</i> added</li></ul> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2272612019-07-15T13:03:18Zggardet_armguillaume.gardet@arm.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/54281">action #54281</a>: [aarch64] test fails in wicked before_test - DNS problem</i> added</li></ul> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2307052019-07-29T12:34:43Zokurzokurz@suse.com
<ul></ul><p>Trying out something for debugging if we can actually ping the nameserver we configured:</p>
<pre><code>openqa-clone-custom-git-refspec https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8044 https://openqa.opensuse.org/tests/991642
</code></pre>
<p>Created job #994813: opensuse-Tumbleweed-DVD-aarch64-Build20190724-wicked_basic_ref@aarch64 -> <a href="https://openqa.opensuse.org/t994813" class="external">https://openqa.opensuse.org/t994813</a></p>
<p>nsinger and me assume that something with NAT (including firewall) is off as the tap devices and the bridge looks fine but there is no traffic outside the virtual network.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2307112019-07-29T12:37:29Zggardet_armguillaume.gardet@arm.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>nsinger and me assume that something with NAT (including firewall) is off as the tap devices and the bridge looks fine but there is no traffic outside the virtual network.</p>
</blockquote>
<p>Is the physical interface connected to the same bridge?</p>
<p>Maybe <code>ip a</code> and <code>ip route</code> from the host would shed some light.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2307262019-07-29T12:50:41Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-12 priority-3 priority-lowest" href="/issues/54785">action #54785</a>: tap devices not in any zone, error reported by firewalld</i> added</li></ul> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2307892019-07-29T19:41:54Zokurzokurz@suse.com
<ul></ul><p>that I can easily provide :)</p>
<pre><code>openqa-aarch64:~ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:18:85:04:00:e0 brd ff:ff:ff:ff:ff:ff
inet 192.168.112.3/24 brd 192.168.112.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::218:85ff:fe04:e0/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:18:85:05:00:e0 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:18:85:00:00:e0 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 00:18:85:01:00:e0 brd ff:ff:ff:ff:ff:ff
6: tap1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether b6:96:22:24:0f:08 brd ff:ff:ff:ff:ff:ff
7: tap2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 22:9f:f1:7b:99:18 brd ff:ff:ff:ff:ff:ff
8: tap3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether e6:37:77:92:54:3b brd ff:ff:ff:ff:ff:ff
9: tap4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 06:5e:1a:63:77:56 brd ff:ff:ff:ff:ff:ff
10: tap5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 6a:cc:87:d9:04:4c brd ff:ff:ff:ff:ff:ff
11: ovs-system: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 7a:ae:cc:2c:17:6d brd ff:ff:ff:ff:ff:ff
12: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 46:0c:e3:a2:72:4b brd ff:ff:ff:ff:ff:ff
inet 10.0.2.2/15 brd 10.1.255.255 scope global br1
valid_lft forever preferred_lft forever
inet6 fe80::440c:e3ff:fea2:724b/64 scope link
valid_lft forever preferred_lft forever
13: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether a6:51:a0:51:cc:49 brd ff:ff:ff:ff:ff:ff
14: tap128: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether de:12:6a:b1:26:40 brd ff:ff:ff:ff:ff:ff
15: tap129: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether a6:64:5a:aa:45:48 brd ff:ff:ff:ff:ff:ff
16: tap130: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether c2:f9:56:fb:d8:ac brd ff:ff:ff:ff:ff:ff
17: tap131: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether c2:f9:bc:98:c7:4f brd ff:ff:ff:ff:ff:ff
18: tap132: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether b6:64:a2:35:30:84 brd ff:ff:ff:ff:ff:ff
19: tap133: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether e2:14:50:81:0b:c9 brd ff:ff:ff:ff:ff:ff
20: tap64: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 6e:8e:90:1a:e6:af brd ff:ff:ff:ff:ff:ff
21: tap65: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether ce:4a:7f:ba:e3:84 brd ff:ff:ff:ff:ff:ff
22: tap66: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 3a:29:ef:57:96:48 brd ff:ff:ff:ff:ff:ff
23: tap67: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 2a:7d:e1:7a:ef:64 brd ff:ff:ff:ff:ff:ff
24: tap68: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 06:24:ad:80:2a:4d brd ff:ff:ff:ff:ff:ff
25: tap69: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast master ovs-system state DOWN group default qlen 1000
link/ether 72:db:93:47:55:ee brd ff:ff:ff:ff:ff:ff
openqa-aarch64:~ # ip route
default via 192.168.112.254 dev eth0
10.0.0.0/15 dev br1 proto kernel scope link src 10.0.2.2
192.168.112.0/24 dev eth0 proto kernel scope link src 192.168.112.3
</code></pre> openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2307952019-07-29T21:29:42Zokurzokurz@suse.com
<ul></ul><p>the approach in <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests... (Resolved)" href="https://progress.opensuse.org/issues/52499#note-15">#52499#note-15</a> did – maybe obviously – not work because we triggered only a single job but we would need the parallel one so let's try again:</p>
<pre><code>build=20190728; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=okurz/os-autoinst-distri-opensuse#8044 DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/aarch64_mm NEEDLES_DIR=/var/lib/openqa/share/tests/opensuse/products/opensuse/needles PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse TEST=wicked_basic_ref,wicked_basic_sut
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [995247, 995248, 995249],
scheduled_product_id => 114676,
}
</code></pre>
<p><a href="https://openqa.opensuse.org/t995249" class="external">https://openqa.opensuse.org/t995249</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2308462019-07-30T05:52:16Zokurzokurz@suse.com
<ul></ul><p>failed because I used a "/" in BUILD, reported as <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: PUBLISH_HDD_1 fails with '/' in variable name (Resolved)" href="https://progress.opensuse.org/issues/54809">#54809</a> .</p>
<pre><code>build=20190728; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=okurz:os-autoinst-distri-opensuse#8044 DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/aarch64_mm NEEDLES_DIR=/var/lib/openqa/share/tests/opensuse/products/opensuse/needles PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse TEST=wicked_basic_ref,wicked_basic_sut
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [995839, 995840, 995841],
scheduled_product_id => 114706,
}
</code></pre>
<p><a href="https://openqa.opensuse.org/t995841" class="external">https://openqa.opensuse.org/t995841</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2309152019-07-30T07:28:41Zokurzokurz@suse.com
<ul></ul><p>So the creation job managed to create an HDD image but the downstream jobs fail to download it. Let's try to avoid the ':' as well:</p>
<pre><code>build=20190728; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=okurz-os-autoinst-distri-opensuse#8044 DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/aarch64_mm NEEDLES_DIR=/var/lib/openqa/share/tests/opensuse/products/opensuse/needles PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse TEST=wicked_basic_ref,wicked_basic_sut
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [995847, 995848, 995849],
scheduled_product_id => 114709,
}
</code></pre>
<p><a href="https://openqa.opensuse.org/t995849" class="external">https://openqa.opensuse.org/t995849</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2309662019-07-30T08:32:51Zokurzokurz@suse.com
<ul></ul><pre><code><guillaume_g> okurz: failed again: https://openqa.opensuse.org/tests/995849#dependencies :(
<okurz> guillaume_g: seems that also the "#" is a problem, but not for the publishing, only the download in caching, see https://openqa.opensuse.org/tests/995900/file/autoinst-log.txt . interesting
</code></pre>
<p>Trying again without the <code>#</code></p>
<pre><code>build=20190728; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=okurz-os-autoinst-distri-opensuse-8044 DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/aarch64_mm NEEDLES_DIR=/var/lib/openqa/share/tests/opensuse/products/opensuse/needles PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse TEST=wicked_basic_ref,wicked_basic_sut
</code></pre><pre><code>{
count => 3,
failed => [],
ids => [995901, 995902, 995903],
scheduled_product_id => 114718,
}
</code></pre>
<p><a href="https://openqa.opensuse.org/t995903" class="external">https://openqa.opensuse.org/t995903</a></p>
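<p>Since "/", ":" and "#" in <code>BUILD</code> each broke a different step, a small sketch of sanitizing the value before triggering. The allowed character set (letters, digits, ".", "_", "-") is a conservative assumption, not a documented openQA rule, and the function name is hypothetical:</p>

```shell
# Hypothetical helper: replaces every character outside the assumed safe
# set with "-" so the value can be used in asset file names and URLs.
sanitize_build() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '-'
    echo
}

sanitize_build 'okurz/os-autoinst-distri-opensuse#8044'
```

<p>For the value used in the first attempt this yields <code>okurz-os-autoinst-distri-opensuse-8044</code>, the same string as in the command above.</p>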
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2310982019-07-30T10:13:48Zokurzokurz@suse.com
<ul></ul><p>wrong syntax in <a href="https://openqa.opensuse.org/tests/995903#step/before_test/46" class="external">https://openqa.opensuse.org/tests/995903#step/before_test/46</a> , trying again with an explicit array casting:</p>
<pre><code>build=20190728; openqa-client --host https://openqa.opensuse.org isos post _NO_OBSOLETE_BUILD=1 ARCH=aarch64 BUILD=okurz-os-autoinst-distri-opensuse-8044 DISTRI=opensuse FLAVOR=DVD ISO=openSUSE-Tumbleweed-DVD-aarch64-Snapshot$build-Media.iso MIRROR_HTTP=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build MIRROR_PREFIX=http://openqa.opensuse.org/assets/repo REPO_0=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_0_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo REPO_OSS=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build REPO_OSS_DEBUGINFO=openSUSE-Tumbleweed-oss-aarch64-Snapshot$build-debuginfo SUSEMIRROR=http://openqa.opensuse.org/assets/repo/openSUSE-Tumbleweed-oss-aarch64-Snapshot$build VERSION=Tumbleweed CASEDIR=https://github.com/okurz/os-autoinst-distri-opensuse.git#fix/aarch64_mm NEEDLES_DIR=/var/lib/openqa/share/tests/opensuse/products/opensuse/needles PRODUCTDIR=os-autoinst-distri-opensuse/products/opensuse TEST=wicked_basic_ref,wicked_basic_sut
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [995936, 995937, 995938],
scheduled_product_id => 114728,
}
</code></pre>
<p><a href="https://openqa.opensuse.org/t995938" class="external">https://openqa.opensuse.org/t995938</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2311432019-07-30T11:51:31Zokurzokurz@suse.com
<ul></ul><p>So the latest run shows the ping failing on aarch64, which is expected; let's still test on x86_64 though.</p>
<p>Created new script for easier triggering: <a href="https://raw.githubusercontent.com/okurz/scripts/master/openqa-trigger-mm" class="external">https://raw.githubusercontent.com/okurz/scripts/master/openqa-trigger-mm</a></p>
<pre><code>env build=20190728 ~/bin/openqa-trigger-mm
</code></pre>
<p>-></p>
<pre><code>{
count => 3,
failed => [],
ids => [995963, 995964, 995965],
scheduled_product_id => 114736,
}
</code></pre>
<p>-> <a href="https://openqa.opensuse.org/t995965" class="external">https://openqa.opensuse.org/t995965</a></p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2311492019-07-30T11:54:30Zggardet_armguillaume.gardet@arm.com
<ul></ul><p>Contrary to the current test <a href="https://openqa.opensuse.org/tests/994576" class="external">https://openqa.opensuse.org/tests/994576</a> where enp0s3 is up, in your aarch64 test, enp0s3 is not ready.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2311552019-07-30T12:07:18Zokurzokurz@suse.com
<ul></ul><p>Yes, I saw that as well. Interesting. But currently I am giving up because of the immature tooling. I tried to spawn an x86_64 set of tests and that again needs a properly initialized NEEDLES_DIR. Having to skip the creating job is also annoying. <a href="https://github.com/os-autoinst/openQA/pull/2224" class="external">https://github.com/os-autoinst/openQA/pull/2224</a> might be of help there. For now I give up going further.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2312032019-07-30T14:21:48Zggardet_armguillaume.gardet@arm.com
<ul></ul><p>With <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8067" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/8067</a> we have some results in <a href="https://openqa.opensuse.org/tests/995987/file/serial_terminal.txt" class="external">https://openqa.opensuse.org/tests/995987/file/serial_terminal.txt</a></p>
<p><code>ping -c 1 8.8.8.8</code> fails, and so does downloading from Google with <code>curl -L 216.58.204.99</code>.</p>
<p>Routing inside the guest seems to be ok. So, the problem is the routing on the host.<br>
Maybe we should add <code>eth0</code> to the bridge, or configure <code>br1</code> to route to <code>eth0</code> when needed.</p>
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2325472019-08-05T11:48:08Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Feedback</i></li></ul><p>see <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [aarch64] test fails in wicked before_test - DNS problem (Resolved)" href="https://progress.opensuse.org/issues/54281#note-3">#54281#note-3</a></p>
<p><a href="https://openqa.opensuse.org/tests/999020#">https://openqa.opensuse.org/tests/999020#</a> is soft-failed so I can move the scenarios to the product validation job group, which I did. Let's monitor for next jobs – and after worker restart.</p>
<p>The shell history of <a href="mailto:root@aarch64.o.o">root@aarch64.o.o</a> suggests that we never ran <code>firewall-cmd --permanent --zone=trusted --add-masquerade</code>. I guess the documentation is a bit misleading when it states "To enable masquerading one can use the following command:<br>
firewall-cmd --permanent --zone=external --add-masquerade", because masquerading needs to be enabled on the zone which has the external interface, which is "br1" in "trusted" in our case.</p>
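<p>A minimal sketch of the steps this implies, assuming the zone is looked up first with <code>firewall-cmd --get-zone-of-interface=br1</code>. The helper name <code>masquerade_commands</code> is hypothetical, and it only prints the commands so they can be reviewed before being run as root.</p>

```shell
# Hypothetical helper: print (do not run) the firewalld commands that enable
# masquerading for the given zone, reload the rules and verify the result.
masquerade_commands() {
    zone="$1"
    printf 'firewall-cmd --permanent --zone=%s --add-masquerade\n' "$zone"
    printf 'firewall-cmd --reload\n'
    printf 'firewall-cmd --zone=%s --query-masquerade\n' "$zone"
}

# In this ticket br1 sits in the "trusted" zone.
masquerade_commands trusted
```

<p>Printing instead of executing keeps the sketch safe to try on any machine; on the worker the three lines would be run in that order.</p>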
<p>Whether the entries for <code>OVS_BRIDGE_PORT_DEVICE_X</code> start with "0" or "1" should not matter, according to <a href="https://www.suse.com/documentation/sles-15/book_sle_admin/data/sec_network_openvswitch.html#sec_network_openvswitch_bridge">https://www.suse.com/documentation/sles-15/book_sle_admin/data/sec_network_openvswitch.html#sec_network_openvswitch_bridge</a>, as long as they are unique.</p>
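<p>Since uniqueness is the only requirement, a quick check for accidentally repeated keys can help. The following is a sketch under assumptions: the function name <code>duplicate_ovs_port_keys</code> and the sample content are made up for illustration, and on a worker the input would be the bridge config file, e.g. /etc/sysconfig/network/ifcfg-br1.</p>

```shell
# Hypothetical helper: list OVS_BRIDGE_PORT_DEVICE_* keys that occur more
# than once. Reads the given file, or standard input if no file is passed.
duplicate_ovs_port_keys() {
    grep -o '^OVS_BRIDGE_PORT_DEVICE[^=]*' "$@" | sort | uniq -d
}

# Made-up sample in which the suffix "_2" is used twice.
duplicate_ovs_port_keys <<'EOF'
OVS_BRIDGE='yes'
OVS_BRIDGE_PORT_DEVICE_1='tap0'
OVS_BRIDGE_PORT_DEVICE_2='tap1'
OVS_BRIDGE_PORT_DEVICE_2='tap2'
EOF
```

<p>Empty output means every key is unique, regardless of whether numbering starts at 0 or 1.</p>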
<p>There are still warnings about tap devices not in any zone though.</p>
<p>To get rid of the warning about eth0 not being in any zone I called <code>firewall-cmd --zone=external --add-interface=eth0</code>. This wasn't necessary to fix the test though. <a href="https://openqa.opensuse.org/tests/999046#">https://openqa.opensuse.org/tests/999046#</a> showed the test still working, but the livehandler could not connect. I fixed that by moving eth0 to the "trusted" zone: <code>firewall-cmd --zone=trusted --change-interface=eth0</code>, so now we know that we need that for both liveview and MM :)</p>
<p>On w1 I am now confused: yast2 firewall states that eth0 is in "trusted", <code>firewall-cmd --get-zone-of-interface eth0</code> states "no zone" (which the warning confirms), and /etc/sysconfig/network/ifcfg-eth0 states "ZONE=public". I wonder if the zone in /etc/sysconfig/network/ifcfg-eth0 actually has any effect, or if it is only used by SuSEfirewall2 but not by firewalld. <br>
<a href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s1-networkscripts-interfaces">https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s1-networkscripts-interfaces</a> doesn't mention the ZONE config. Maybe I can just delete it from the file if it is a leftover, e.g. from SuSEfirewall2, and not necessary anymore? I deleted the line <code>ZONE=trusted</code> from /etc/sysconfig/network/ifcfg-eth0 and rebooted the machine aarch64.o.o. Let's see. The machine came up, <code>ZONE=trusted</code> is still absent from the file, <code>firewall-cmd</code> reports eth0 to be part of "trusted", and there is no warning in /var/log/firewalld. Retriggered <a href="https://openqa.opensuse.org/tests/999063">https://openqa.opensuse.org/tests/999063</a>, still fine: live mode and test are working. Deleted <code>ZONE=…</code> from all files with <code>sed -i -e '/ZONE=/d' /etc/sysconfig/network/*</code> on aarch64.o.o, rebooted and tried again. <a href="https://openqa.opensuse.org/tests/999067">https://openqa.opensuse.org/tests/999067</a> is also fine. Also added "ovs-system" to the trusted zone to fix a warning in /var/log/firewalld.</p>
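<p>The <code>sed</code> cleanup can be tried on a scratch copy first. This is a sketch, not the exact command used above: the function name <code>strip_zone_lines</code> is hypothetical, and as a deliberate deviation it only touches <code>ifcfg-*</code> files and only lines that start with <code>ZONE=</code>, so comments or other variables containing that string are left alone. It assumes GNU <code>sed</code> for <code>-i</code>.</p>

```shell
# Hypothetical helper: delete ZONE= assignments from all ifcfg-* files in
# the given directory, editing the files in place (GNU sed).
strip_zone_lines() {
    dir="$1"
    sed -i -e '/^ZONE=/d' "$dir"/ifcfg-*
}

# Try it on a throwaway directory with a made-up interface config.
workdir=$(mktemp -d)
printf "BOOTPROTO='dhcp'\nSTARTMODE='auto'\nZONE=public\n" > "$workdir/ifcfg-eth0"
strip_zone_lines "$workdir"
cat "$workdir/ifcfg-eth0"
```

<p>On the real machine the directory would be /etc/sysconfig/network, followed by a reboot and a look at <code>firewall-cmd --get-zone-of-interface=eth0</code> as described above.</p>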
openQA Tests - action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required)https://progress.opensuse.org/issues/52499?journal_id=2334232019-08-08T10:01:54Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p><a href="https://openqa.opensuse.org/tests?match=wicked_basic" class="external">https://openqa.opensuse.org/tests?match=wicked_basic</a> are very stable now, and for both x86_64 and aarch64 the corresponding scenarios are scheduled in the respective product validation job group.</p>