action #170380

closed

[spike][timeboxed:10h] Prevent unauthorized openQA asset download size:S

Added by okurz about 2 months ago. Updated 28 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Feature requests
Target version:
Start date: 2024-11-27
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

Right now, to be CC-compliant, we need to block all network access to openQA from non-CC locations because openQA might expose potentially sensitive data, e.g. openQA assets linked on openQA test results. For example, https://openqa.opensuse.org/tests/4668726#downloads (from https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD&machine=64bit&test=create_hdd_textmode&version=Tumbleweed) has the link https://openqa.opensuse.org/tests/4668726/asset/hdd/opensuse-Tumbleweed-x86_64-20241126-textmode@64bit.qcow2 pointing to https://openqa.opensuse.org/assets/hdd/opensuse-Tumbleweed-x86_64-20241126-textmode@64bit.qcow2, which is reachable without authentication. In this example that is of course not a problem, but the same applies to openqa.suse.de, which can potentially provide qcow files containing sensitive data. For those we want a solution that prevents unauthenticated access based on a configuration feature switch. Any openQA job that needs such an asset should still be able to access it, at least when it is part of the same dependency tree or similar.

Goals

  • G1: Based on configuration feature switch prevent unauthorized access to any or selected openQA assets
  • G2: By default openQA assets are still accessible unauthenticated
  • G3: Asset downloads are still handled in an efficient way by NGINX (and not in an inefficient way via Mojolicious).
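As a rough illustration of G1 and G2, the access decision could look like the following sketch. The setting names (require_auth, protected_patterns) are hypothetical and not existing openQA configuration keys; the point is only that the switch is off by default and can cover all or selected assets.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class AssetAccessConfig:
    # Hypothetical settings, not actual openQA configuration keys.
    require_auth: bool = False  # G2: assets stay public unless switched on
    protected_patterns: list = field(default_factory=lambda: ["*"])


def may_download(config, asset_path, authenticated):
    """Return True if a request may fetch the asset at asset_path."""
    if not config.require_auth:
        # Feature switch off: keep today's behaviour (G2).
        return True
    # Feature switch on: only assets matching a pattern are protected (G1).
    protected = any(fnmatch(asset_path, p) for p in config.protected_patterns)
    return authenticated or not protected
```

With the default configuration every request passes; with require_auth=True and protected_patterns=["hdd/*"], only hdd assets would need an authenticated request.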

Suggestions

  • Look into https://github.com/os-autoinst/openQA/blob/master/lib/OpenQA/WebAPI.pm and move relevant routes below the "authenticated_only" root based on configuration variable
  • Ensure that openQA tests can still work properly with that
  • Implement this in openqa-cli to make it easy to download assets
  • Consider tests that currently re-use job assets directly
  • Consider how a (temporary) password (e.g. the job token) might be revealed in logs or needles, and how to address that
  • Only allow downloading assets from whitelisted IP addresses (of the worker/SUT)
  • Research how we can integrate/combine the authentication done via the Mojolicious app in NGINX.
    • The Mojolicious app already provides basic HTTP auth. Maybe we can configure NGINX in line with that and use it in relevant places instead of the custom authentication.
  • File follow-up tickets as necessary
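The IP whitelist suggestion above could be checked along these lines. This is a minimal sketch using only the Python standard library; where the list of worker/SUT networks would come from is left open, and the addresses in the usage note are documentation-only example ranges.

```python
import ipaddress


def is_allowed(client_ip, allowed_networks):
    """Return True if client_ip lies within any of the allowed networks.

    allowed_networks is a list of strings such as "192.0.2.0/24" or a
    single address; how openQA would obtain it is an assumption here.
    """
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)
```

For example, is_allowed("192.0.2.15", ["192.0.2.0/24"]) accepts the client, while an address outside that range is rejected. A real setup would also have to decide which address to trust when requests pass through a reverse proxy.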

Out of scope

  • Restricting access to test details or needles

Related issues 2 (2 open, 0 closed)

Copied to openQA Project (public) - action #170383: [spike][timeboxed:10h] Concept for full RBAC implementation size:S (Workable, 2024-11-27)
Copied to openQA Project (public) - action #174154: Prevent unauthorized openQA asset download (Feedback, mkittler, 2025-01-31)

Actions #1

Updated by okurz about 2 months ago

  • Copied to action #170383: [spike][timeboxed:10h] Concept for full RBAC implementation size:S added
Actions #2

Updated by mkittler about 2 months ago

  • Subject changed from [spike][timeboxed:10h] Prevent unauthorized openQA asset download to [spike][timeboxed:10h] Prevent unauthorized openQA asset download size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #3

Updated by szarate about 2 months ago · Edited

  1. Disable directory listing on nginx or echo "" >> /var/lib/openqa/share/factory/*/index.html
  2. Unauthenticated requests shall not reveal more information than necessary.
  3. Authenticated requests that fail shall reveal the least amount of information required.

E.g., consider implementing error pages for the different HTTP status codes.

For this task, there is no need to go over the implementation details; only 1 and 2 from the list above have to be implemented.

# not authenticated
foursixnine@pakhet gitlab-suse-de-workspace-test % curl -I https://openqa.suse.de/asset/hdd/SLES-15-SP7-aarch64-Build43.1@aarch64-unregistered_functional.qcow2
HTTP/2 403
server: nginx/1.21.5
date: Thu, 28 Nov 2024 10:38:57 GMT
content-type: text/html;charset=UTF-8
content-length: 87
strict-transport-security: max-age=31536000; includeSubDomains

# authenticated and authorized, but no job (or asset is marked as on a need to use basis)
foursixnine@pakhet gitlab-suse-de-workspace-test % curl -I https://openqa.suse.de/hdd/SLES-15-SP7-aarch64-Build43.1@aarch64-unregistered_functional.qcow2
HTTP/2 418
server: nginx/1.21.5
date: Thu, 28 Nov 2024 10:44:47 GMT
content-type: text/html;charset=UTF-8
content-length: 72030
strict-transport-security: max-age=31536000; includeSubDomains
vary: Accept-Encoding

# authenticated and authorized, permitted to access asset
foursixnine@pakhet gitlab-suse-de-workspace-test % curl -I https://openqa.suse.de/tests/16031623/asset/hdd/SLES-15-SP7-aarch64-Build43.1@aarch64-unregistered_functional.qcow2
HTTP/2 302
server: nginx/1.21.5
date: Thu, 28 Nov 2024 10:45:22 GMT
content-length: 0
location: /assets/hdd/SLES-15-SP7-aarch64-Build43.1@aarch64-unregistered_functional.qcow2
strict-transport-security: max-age=31536000; includeSubDomains

# authenticated and authorized, permitted to download
foursixnine@pakhet gitlab-suse-de-workspace-test % curl -I https://openqa.suse.de/hdd/SLES-15-SP7-aarch64-Build43.1@aarch64-unregistered_functional.qcow2
HTTP/2 200
server: nginx/1.21.5
date: Thu, 28 Nov 2024 10:44:47 GMT
content-type: text/html;charset=UTF-8
content-length: 72030
strict-transport-security: max-age=31536000; includeSubDomains
vary: Accept-Encoding

Actions #4

Updated by mkittler about 2 months ago

  • Description updated (diff)

Assets are actually served by NGINX. We didn't consider that during the estimation so I added relevant goals/suggestions.

Actions #5

Updated by okurz about 2 months ago

@mkittler we can keep the goal as you stated but I don't think we should make nginx a strict requirement. Maybe native mojo is not that bad, maybe we can redirect to nginx, maybe we use traefik, whatever, all possible :)

Actions #6

Updated by mkittler about 2 months ago

Maybe native mojo is not that bad,

I remember we accidentally had asset downloads going via Mojo for a while on o3 when switching to NGINX there, and it was not good enough.

maybe we can redirect to nginx

I think that's already happening but then we still need authentication in NGINX.

maybe we use traefik

I'll have to look into it.

whatever, all possible

One thing that came to mind when talking with @szarate was to create an NGINX module.

However, maybe there's already something we can use. Or we can simply configure HTTP auth in line with what we also do in Mojolicious and only rely on that¹. I'm not sure whether that would be flexible enough to also restrict workers so they can only access assets of their current job (which would be part of the use case of #170383).


¹ Note that basic HTTP auth can be configured via auth_basic_user_file in NGINX and we could probably generate/update those files from the Mojolicious app when new API keys are created or API keys are removed.
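To make the footnote more concrete, here is a minimal sketch of how such a user file could be rendered from API keys. It assumes that the API key serves as the user name and the API secret as the password, and it uses the salted SHA-1 "{SSHA}" scheme, which to my knowledge is among the formats NGINX accepts for auth_basic_user_file. The function names are hypothetical; nothing here is existing openQA code.

```python
import base64
import hashlib
import os


def ssha_entry(user, secret, salt=None):
    """Build one "user:{SSHA}..." line (without the trailing newline)."""
    salt = os.urandom(8) if salt is None else salt
    # {SSHA} stores base64(sha1(password + salt) + salt).
    digest = hashlib.sha1(secret.encode() + salt).digest()
    return f"{user}:{{SSHA}}{base64.b64encode(digest + salt).decode()}"


def render_user_file(credentials):
    """Render the whole file from a {api_key: api_secret} mapping (assumed)."""
    return "".join(
        ssha_entry(user, secret) + "\n"
        for user, secret in sorted(credentials.items())
    )
```

The app would have to rewrite the file whenever API keys are created or removed, ideally by writing a temporary file and renaming it so NGINX never reads a half-written file. Salted SHA-1 is weak by current standards, so a stronger scheme supported by the system's crypt() may be preferable.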

Actions #7

Updated by okurz about 2 months ago

  • Priority changed from Normal to High
Actions #8

Updated by mkittler about 2 months ago

  • Status changed from Workable to In Progress
  • Assignee set to mkittler
Actions #9

Updated by szarate about 2 months ago

  • Priority changed from High to Immediate
Actions #10

Updated by okurz about 2 months ago · Edited

@szarate mkittler already picked up the issue. What needs immediate handling today? And please be aware that this is only a spike solution so we don't expect to have a full solution at the time of resolving this.

Actions #11

Updated by szarate about 2 months ago

okurz wrote in #note-10:

@szarate mkittler already picked up the issue. What needs immediate handling today? And please be aware that this is only a spike solution so we don't expect to have a full solution at the time of resolving this.

@okurz It's exactly the solution we discussed one week ago: 1,2 and 3 of this message, https://progress.opensuse.org/issues/170380#note-3

The testing and validation plan provided to the CAB is here. I only need to know when the change is in place so I can inform the CAB before their next meeting, the day after Monday. I hope it's feasible to have this implemented by the end of the week.

Needs to be deployed on all openQA instances that QE LSG Department uses (a ticket that is linked to poo#17366 is sufficient).

https://confluence.suse.com/pages/viewpage.action?pageId=668631174

Actions #12

Updated by okurz about 2 months ago

  • Priority changed from Immediate to High

szarate wrote in #note-11:

okurz wrote in #note-10:

@szarate mkittler already picked up the issue. What needs immediate handling today?

https://progress.opensuse.org/projects/qa/wiki/Tools#SLAs-service-level-agreements

And please be aware that this is only a spike solution so we don't expect to have a full solution at the time of resolving this.

@okurz It's exactly the solution we discussed one week ago: 1,2 and 3 of this message, https://progress.opensuse.org/issues/170380#note-3

Yes, that is planned to be implemented, but it will likely have detrimental effects on the existing infrastructure. To not disrupt the existing business flows, we need this spike solution first to identify which areas to work on before deploying a robust solution on OSD.

The testing and validation plan provided to the CAB is here, I only need to know when the change is in place, so I can inform the CAB, before their next meeting, day after Monday. I hope its feasible to have this implemented by the end of the week.

That is at least challenging if not impossible and is far from the goal of this ticket. I don't consider that likely to happen considering the impact on testing that we need to sustain from the current zone-CC. My understanding was that while firewall rules will temporarily be lifted, we will promise to work on this in the next weeks, and that's how we planned the work accordingly.

Needs to be deployed on all openQA instances that QE LSG Department uses

I see that as neither necessary nor feasible. Certainly openqa.opensuse.org should not be impacted. Another instance would be http://openqa.qam.suse.cz/ which shouldn't be impacted as it runs fully within zone-CC. http://openqa.qa2.suse.asia/ on the other hand runs fully outside zone-CC and hence should also be safe and not need any change. All personal development instances are not under our control but we can suggest that everyone apply the same solution as applicable.

(a ticket that is linked to poo#17366 is sufficient.

I don't understand what you mean by that and the ticket reference is invalid.

https://confluence.suse.com/pages/viewpage.action?pageId=668631174

Regarding the prioritization, I think there is a misunderstanding in expectations according to https://progress.opensuse.org/projects/qa/wiki/Tools#SLAs-service-level-agreements so I'm lowering back to High as no immediate urgency mitigation is needed today.

Actions #14

Updated by openqa_review about 2 months ago

  • Due date set to 2024-12-19

Setting due date based on mean cycle time of SUSE QE Tools

Actions #15

Updated by okurz about 1 month ago

  • Status changed from In Progress to Workable
Actions #16

Updated by okurz about 1 month ago

We discussed this in the unblock meeting. mkittler will look into avoiding redirects. As an alternative I proposed to symlink assets on the side of the openQA instance to a temporary location, point the web reverse proxy to that, and then remove the link to the temporary location again after the job finishes.
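The symlink idea could be prototyped roughly as follows. This is only a sketch under assumptions: serve_root stands for a directory the reverse proxy would serve, the per-job directory name is random so it is hard to guess, and the context manager represents the lifetime of the job. None of these names exist in openQA.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exposed_asset(asset_path, serve_root):
    """Expose asset_path via a temporary symlink below serve_root."""
    # mkdtemp picks a random, hard-to-guess directory name.
    job_dir = tempfile.mkdtemp(prefix="job-", dir=serve_root)
    # mkdtemp creates the directory with mode 0700; loosening it so that a
    # proxy running as another user can traverse it is an assumption.
    os.chmod(job_dir, 0o755)
    link = os.path.join(job_dir, os.path.basename(asset_path))
    os.symlink(os.path.abspath(asset_path), link)
    try:
        yield link
    finally:
        # Remove the link (never the asset itself) once the job is done.
        os.unlink(link)
        os.rmdir(job_dir)
```

A job runner would wrap the job execution in "with exposed_asset(path, serve_root) as link:" and hand the resulting location to the worker. The reverse proxy would additionally need to be allowed to follow symlinks out of serve_root.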

Actions #17

Updated by mkittler about 1 month ago

  • Copied to action #174154: Prevent unauthorized openQA asset download added
Actions #18

Updated by mkittler about 1 month ago

  • Status changed from Workable to Feedback

I created #174154 because now the 10 hours are probably almost over and there are way too many open points to resolve this quickly. We might even need to split the newly created ticket.

Actions #19

Updated by okurz about 1 month ago

I think, also with what we discussed today, you can create at least one more ticket which can be worked on independently of #174154. How about a ticket about an experiment to try out the "symlink" approach and/or another ticket about doing the implementation w/o any reverse proxy?

Actions #20

Updated by mkittler about 1 month ago

I've now created https://github.com/os-autoinst/openQA/pull/6082 which will reduce the size of #174154 when merged. Then we only need to think about the hard parts.

I will figure out the first and most promising point in #174154 to look into and create a separate ticket for that.

Actions #21

Updated by mkittler about 1 month ago

I created another ticket, see #174154#note-7.

Actions #22

Updated by mkittler about 1 month ago

  • Status changed from Feedback to Resolved

PR was merged.

Actions #23

Updated by okurz 28 days ago

  • Due date deleted (2024-12-19)