openSUSE Project Management Tool https://progress.opensuse.org/ 2019-12-18T12:52:52Z
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detected https://progress.opensuse.org/issues/60833?journal_id=265620 2019-12-18T12:52:52Z zluo
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/46190">action #46190</a>: [functional][u] test fails in user_settings - mistyping in Username (lowercase instead of uppercase) </i> added</li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2672912019-12-25T07:04:25Zokurzokurz@suse.com
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: extra_tests_on_gnome<br>
<a href="https://openqa.suse.de/tests/3731488" class="external">https://openqa.suse.de/tests/3731488</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released"</li>
<li>The label in the openQA scenario is removed</li>
</ol>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2689642020-01-08T07:07:20Zokurzokurz@suse.com
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: gnome+proxy_SCC+allmodules<br>
<a href="https://openqa.suse.de/tests/3758637" class="external">https://openqa.suse.de/tests/3758637</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released"</li>
<li>The label in the openQA scenario is removed</li>
</ol>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2691052020-01-08T12:17:00Zokurzokurz@suse.com
<ul><li><strong>Category</strong> set to <i>Bugs in existing tests</i></li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2701162020-01-10T14:49:11ZSLindoMansillaslindomansilla@suse.com
<ul></ul><p>The workers are again running the old number of worker instances, which is known to produce typing issues: <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/blob/master/openqa/workerconf.sls#L489" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/blob/master/openqa/workerconf.sls#L489</a></p>
<p>This commit <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/commit/cef2ca2755860394d0ace4178ef51cc800dc34fe" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/commit/cef2ca2755860394d0ace4178ef51cc800dc34fe</a> suggests masking services, and the tools team agreed that masking should be used while investigating.<br>
Once the right number of workers is known, it should be changed in the salt state <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/blob/master/openqa/workerconf.sls#L489" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/blob/master/openqa/workerconf.sls#L489</a></p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2701192020-01-10T14:50:36ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Assignee</strong> set to <i>SLindoMansilla</i></li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul><p>Performing binary search for openqaworker-arm-1.<br>
Starting workers: 20<br>
Trying with: 10 (workers from 11 to 20 stopped and masked)</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2702422020-01-10T16:52:09Zokurzokurz@suse.com
<ul></ul><p>I learned that masked services make the salt state apply fail, so I reverted my masking and updated the salt pillars accordingly. If you want to experiment with "less workers", I suggest pinning test jobs to openqaworker-arm-3, which is reduced to 4 parallel worker instances for now. We can run the experiment, but we will need to unmask worker instances again as soon as we have problems with salt recipe application.</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2705722020-01-13T12:48:44ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Assignee</strong> deleted (<del><i>SLindoMansilla</i></del>)</li></ul><p>The approach is not accepted by the tools team.<br>
To be decided in the next refinement meeting.</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2707822020-01-13T18:05:59Zokurzokurz@suse.com
<ul></ul><p>The approach <em>is</em> accepted as long as you use salt pillar changes rather than simply masking systemd services, so that salt does not break.</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2752462020-01-30T08:45:55Zzluo
<ul></ul><p><a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [tools][functional][u] stall detected in openqaworker-arm-1 through 3 sometimes - &quot;worker perform... (Resolved)" href="https://progress.opensuse.org/issues/25864">#25864</a> is actually an old ticket which was worked on by okurz</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2800032020-02-24T10:52:11ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Assignee</strong> set to <i>mgriessmeier</i></li></ul><ol>
<li>Increase QEMURAM for openqaworker-arm-1 and openqaworker-arm-3.</li>
<li>Ask Santi about requirements for an ARM server for test environment (openqa.suse.de).</li>
<li>Show requirements to Ralf and see if it is possible to acquire such hardware.</li>
</ol>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2800452020-02-24T12:59:28Zokurzokurz@suse.com
<ul></ul><p>SLindoMansilla wrote:</p>
<blockquote>
<ol>
<li>Increase QEMURAM for openqaworker-arm-1 and openqaworker-arm-3.</li>
</ol>
</blockquote>
<p>Yes. This would also go in line with <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: [functional][u] test fails in user_settings - mistyping in Username (lowercase instead of upperca... (Resolved)" href="https://progress.opensuse.org/issues/46190#note-88">#46190#note-88</a></p>
<blockquote>
<ol>
<li>Ask Santi about requirements for an ARM server for test environment (openqa.suse.de).</li>
<li>Show requirements to Ralf and see if it is possible to acquire such hardware.</li>
</ol>
</blockquote>
<p>New ARM hardware was already requested, see <a href="https://trello.com/c/JQtnALhz/6-openqa-hw-budget-planning#comment-5e185a3e9a5c3786c32fd089" class="external">https://trello.com/c/JQtnALhz/6-openqa-hw-budget-planning#comment-5e185a3e9a5c3786c32fd089</a></p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2800512020-02-24T12:59:43Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/25864">action #25864</a>: [tools][functional][u] stall detected in openqaworker-arm-1 through 3 sometimes - "worker performance issues"</i> added</li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2801112020-02-24T17:43:26ZSLindoMansillaslindomansilla@suse.com
<ul></ul><p>MR: <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/merge_requests/224" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/merge_requests/224</a></p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2987622020-05-07T14:25:35ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Parent task</strong> deleted (<del><i>#56087</i></del>)</li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2987652020-05-07T14:26:04ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Blocked</i></li><li><strong>Assignee</strong> changed from <i>mgriessmeier</i> to <i>szarate</i></li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2987712020-05-07T14:26:20ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Blocked by</strong> <i><a class="issue tracker-4 status-3 priority-3 priority-lowest closed" href="/issues/41882">action #41882</a>: all arm worker die after some time</i> added</li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=2990502020-05-11T07:29:07Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Workable</i></li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/24132">@SLindoMansilla</a> <a class="issue tracker-4 status-3 priority-3 priority-lowest closed" title="action: all arm worker die after some time (Resolved)" href="https://progress.opensuse.org/issues/41882">#41882</a> is about machines crashing completely, not about performance issues per se. Please do not use that as a blocker. If there is something specific I can help you with, I am happy to help.</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=3476952020-11-06T09:41:39Ztjyrinki_susetjyrinki+redmine@suse.de
<ul><li><strong>Subject</strong> changed from <i>[sle][functional][u] performance issue of aarch64 worker: Stall detected</i> to <i>[qe-core][sle][functional] performance issue of aarch64 worker: Stall detected</i></li></ul> openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=3828882021-02-16T08:22:27Zszarate
<ul><li><strong>Category</strong> changed from <i>Bugs in existing tests</i> to <i>Infrastructure</i></li><li><strong>Assignee</strong> deleted (<del><i>szarate</i></del>)</li></ul><p>I will not be looking at stalls for now...</p>
openQA Tests - action #60833: [qe-core][sle][functional] performance issue of aarch64 worker: Stall detectedhttps://progress.opensuse.org/issues/60833?journal_id=3843702021-02-22T08:26:33Zszarate
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>Rejected</i></li><li><strong>Assignee</strong> set to <i>szarate</i></li></ul><p>I don't see it referenced anymore, and stalls + aarch64 is usually a bad combination on Caviums</p>