openSUSE Project Management Tool https://progress.opensuse.org/ (updated 2021-11-26T13:24:27Z)
openQA Infrastructure - action #103128: gitlab CI pipelines sporadically fail with "Could not resolve host: gitlab.suse.de", e.g. Recovery pipelines for ARM workers might fail during the maintenance window size:M
https://progress.opensuse.org/issues/103128?journal_id=468408 2021-11-26T13:24:27Z okurz (okurz@suse.com)
<ul><li><strong>Target version</strong> set to <i>Ready</i></li></ul><p>Did you mean "fail" in the subject line?</p>
<p>I assume the code that failed is within gitlab-runner. Not running any gitlab CI jobs within the suse.de domain during the maintenance window would of course avoid these network problems :) That is something we can suggest to SUSE IT.</p>
https://progress.opensuse.org/issues/103128?journal_id=468420 2021-11-26T15:28:59Z mkittler (marius.kittler@suse.com)
<ul><li><strong>Subject</strong> changed from <i>Recovery pipelines for ARM workers might file during the maintenance window</i> to <i>Recovery pipelines for ARM workers might fail during the maintenance window</i></li></ul>
https://progress.opensuse.org/issues/103128?journal_id=469989 2021-12-02T10:54:52Z livdywan (liv.dywan@suse.com)
<ul><li><strong>Subject</strong> changed from <i>Recovery pipelines for ARM workers might fail during the maintenance window</i> to <i>Recovery pipelines for ARM workers might fail during the maintenance window size:M</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/469989/diff?detail_id=444570">diff</a>)</li><li><strong>Status</strong> changed from <i>New</i> to <i>Workable</i></li></ul>
https://progress.opensuse.org/issues/103128?journal_id=474427 2021-12-17T15:42:14Z okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>okurz</i></li></ul><p>Apparently there have been two upstream feature requests to retry the initial git clone, <a href="https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2296" class="external">https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2296</a> and <a href="https://gitlab.com/gitlab-org/gitlab-docs/-/issues/32" class="external">https://gitlab.com/gitlab-org/gitlab-docs/-/issues/32</a>, and both were rejected. It seems that gitlab CI currently does not support retrying the initial git clone out of the box. One option would be to configure gitlab CI to not clone the git repository at all and to do the clone ourselves in the script section with retries and backoff, because the retry in <a href="https://docs.gitlab.com/ee/ci/yaml/#retry" class="external">https://docs.gitlab.com/ee/ci/yaml/#retry</a> only applies to the job <em>after</em> the initial git clone. Before we do that we should clarify with EngInfra whether they can find a better solution. I think there is a ticket somewhere regarding the deprecation of the CAASP cluster that is used to run the gitlab CI runners.</p>
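<p>A minimal sketch of the "clone it ourselves with retrying and backoff" idea, assuming the job disables the automatic checkout (for example with the GitLab CI variable <code>GIT_STRATEGY: none</code>) and calls a helper like this from its script section. The function name <code>clone_with_retry</code>, its parameters and the defaults are hypothetical and not taken from the ticket; this is an illustration, not the solution that was eventually applied.</p>

```python
import subprocess
import time


def clone_with_retry(repo_url, dest, attempts=5, base_delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run `git clone` up to `attempts` times with exponential backoff.

    Hypothetical helper: `run` and `sleep` are injectable so the retry
    logic can be exercised without network access.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        # A DNS failure such as "Could not resolve host" makes git exit non-zero.
        result = run(["git", "clone", "--depth", "1", repo_url, dest])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait 2 s, 4 s, 8 s, ... before the next try (with base_delay=2.0).
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"git clone of {repo_url} failed after {attempts} attempts"
    )
```

<p>In a job this could be invoked with the repository URL and a target directory; a short DNS outage would then cost a few seconds of waiting instead of failing the whole pipeline.</p>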
https://progress.opensuse.org/issues/103128?journal_id=474429 2021-12-17T15:46:57Z okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Blocked</i></li></ul><p><a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-70900" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-70900</a></p>
https://progress.opensuse.org/issues/103128?journal_id=474537 2021-12-19T20:13:32Z okurz (okurz@suse.com)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/474537/diff?detail_id=448977">diff</a>)</li></ul>
https://progress.opensuse.org/issues/103128?journal_id=474921 2021-12-20T21:58:37Z okurz (okurz@suse.com)
<ul><li><strong>Subject</strong> changed from <i>Recovery pipelines for ARM workers might fail during the maintenance window size:M</i> to <i>gitlab CI pipelines sporadically fail with "Could not resolve host: gitlab.suse.de", e.g. Recovery pipelines for ARM workers might fail during the maintenance window size:M</i></li><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Feedback</i></li></ul><p>@team if you see the issue again, please comment with a reference to the corresponding gitlab CI jobs and provide additional information here and/or in <a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-70900" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-70900</a></p>
https://progress.opensuse.org/issues/103128?journal_id=479719 2022-01-12T11:19:30Z okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Blocked</i></li></ul><p><a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-70900" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-70900</a> has an update. Jiri Novak found a problem with DNS config on kubernetes hosts and is working on that.</p>
https://progress.opensuse.org/issues/103128?journal_id=479881 2022-01-13T08:31:32Z okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Resolved</i></li></ul><p>The SD ticket was resolved with some fixes to the DNS infrastructure. Since it seems that Jiri Novak did a proper test themselves, I consider this resolved. If any of you see this again, please raise it again.</p>