action #162125

closed

coordination #58184: [saga][epic][use case] full version control awareness within openQA

coordination #152847: [epic] version control awareness within openQA for test distributions

[timeboxed:10h][spike] Let openQA keep test distribution checkouts up to date without needing fetchneedles size:S

Added by okurz 4 months ago. Updated 17 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Feature requests
Target version:
Start date: 2024-06-12
Due date:
% Done: 0%
Estimated time:

Description

Motivation

fetchneedles is a script provided within the openQA repo. We call it on o3+osd in a cron job every minute to keep test distribution checkouts updated, but it is not well documented, can interfere with openQA's internal git handling and (probably) still needs an initial checkout of test distributions. Let's see what else would be necessary to use the new openQA internal support for checking out git test distributions if they don't exist yet.

Goals

  • G1: A migration plan for existing test distributions on o3 exists so that /var/lib/openqa/share/tests/* no longer needs to be updated by fetchneedles, e.g. on o3 where checkouts already exist
  • G2: tests would still pass consistently
  • G3: test details and source code views would still show content as expected

Suggestions

  • Apply the approach from #156922 to other test distributions, ideally in a local reproduction environment, but if you are careful or a daredevil you could do it in production :)
  • Try out where to update checkouts, e.g. in an openQA minion job
  • If you don't know the movie "Despicable me" then watch that first but don't count it as part of the 10h timebox :)

Out of scope

  • Doing any kind of initial checkout if git working copies do not exist yet

Related issues 6 (2 open, 4 closed)

Related to openQA Infrastructure - action #164895: o3 had corrupted needles git repo, lost uncommitted needles between 2024-07-31 and 2024-08-02 (Resolved, tinita, 2024-08-02)

Copied from openQA Project - action #156922: Run os-autoinst-distri-openQA directly from git without anything related in o3:/var/lib/openqa/share/tests (Blocked, okurz)

Copied to openQA Project - action #164883: Use same minion guard for save_needle, delete_needles and git_clone size:S (Resolved, tinita)

Copied to openQA Project - action #164886: Use OpenQA::Git for all our git wrappers size:S (Resolved, robert.richardson)

Copied to openQA Project - action #164889: Ensure git repos cloned by minions are cleaned up regularly size:S (Resolved)

Copied to openQA Project - action #164898: Replace fetchneedles with a minion job size:M (Feedback, tinita, 2024-10-11)

Actions #1

Updated by okurz 4 months ago

  • Copied from action #156922: Run os-autoinst-distri-openQA directly from git without anything related in o3:/var/lib/openqa/share/tests added
Actions #2

Updated by okurz 4 months ago

  • Status changed from New to Blocked
  • Assignee set to okurz
Actions #3

Updated by okurz 4 months ago

  • Status changed from Blocked to New
  • Assignee deleted (okurz)
Actions #4

Updated by okurz 4 months ago

  • Priority changed from Normal to High
Actions #5

Updated by livdywan 3 months ago

I feel like it's fair to say we didn't get to this at the workshop. Let's estimate it tomorrow.

Actions #6

Updated by okurz 3 months ago

  • Subject changed from [timeboxed:10h][spike] Let openQA keep test distribution checkouts up to date without needing fetchneedles to [timeboxed:10h][spike] Let openQA keep test distribution checkouts up to date without needing fetchneedles size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #7

Updated by mkittler 3 months ago

  • Assignee set to mkittler
Actions #8

Updated by mkittler 3 months ago

  • Assignee deleted (mkittler)
Actions #9

Updated by livdywan 3 months ago

I feel like this might not be picked over other High tickets atm because it's a new feature? Maybe it should have lower priority? I'm asking in the team chat once more.

Actions #10

Updated by ybonatakis 3 months ago

  • Assignee set to ybonatakis
Actions #11

Updated by ybonatakis 2 months ago

  • Status changed from Workable to In Progress
Actions #12

Updated by livdywan 2 months ago

https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/187 was just merged. Not sure what this is about.

Actions #13

Updated by ybonatakis 2 months ago

I did the following experiment:

  • I cloned an openqa-to-openqa job with git URLs in CASEDIR and NEEDLES_DIR. The tests pulled the repos into /var/lib/openqa/share/tests/openqa and ran successfully.
  • I created a PR https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/187 and merged it.
  • Then I restarted the cloned job in my local setup and it ran against the new changes. However, the change was not in /var/lib/openqa/share/tests/openqa.
Actions #14

Updated by ybonatakis 2 months ago

livdywan wrote in #note-12:

https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/187 was just merged. Not sure what this is about.

I have to revert it.

Actions #15

Updated by ybonatakis 2 months ago

ybonatakis wrote in #note-14:

livdywan wrote in #note-12:

https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/187 was just merged. Not sure what this is about.

I have to revert it.

Actually someone already did it https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/188

Actions #16

Updated by openqa_review 2 months ago

  • Due date set to 2024-08-13

Setting due date based on mean cycle time of SUSE QE Tools

Actions #17

Updated by ybonatakis 2 months ago

  • Status changed from In Progress to Feedback

I can't say I have made any progress.
I started looking at fetchneedles, then realised that this is not the objective of the ticket.

I used the registry.opensuse.org/devel/openqa/containers/openqa-single-instance container to run experiments.
Following the suggestion, I tried to understand how minion[0] works.
Then I reviewed https://github.com/os-autoinst/openQA/commit/a38165bd7576d7ebba5497478636e176fb83daa1 but I don't understand how it works and can't figure out what to expect inside the container. In the container, git_auto_clone is enabled.
When I cloned an openqa2openqa job, /var/lib/openqa/share/tests/openqa got updated:

[info] Ignoring invalid group {"id":"24"} when creating new job 1
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote --symref https://github.com/os-autoinst/os-autoinst-distri-openQA HEAD
[info] cmd returned 0
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git clone https://github.com/os-autoinst/os-autoinst-distri-openQA /var/lib/openqa/share/tests/openqa
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa remote get-url origin
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa branch --show-current
[info] cmd returned 0
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git -C /var/lib/openqa/share/tests/openqa fetch origin master
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa reset --hard origin/master
[info] cmd returned 0
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git ls-remote --symref https://github.com/os-autoinst/os-autoinst-needles-openQA HEAD
[info] cmd returned 0
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git clone https://github.com/os-autoinst/os-autoinst-needles-openQA /var/lib/openqa/share/tests/openqa/needles
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa/needles remote get-url origin
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa/needles branch --show-current
[info] cmd returned 0
[info] Running cmd: env GIT_SSH_COMMAND="ssh -oBatchMode=yes" git -C /var/lib/openqa/share/tests/openqa/needles fetch origin master
[info] cmd returned 0
[info] Running cmd: git -C /var/lib/openqa/share/tests/openqa/needles reset --hard origin/master
[info] cmd returned 0
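
For readers who want the sequence from the log above in one place, here is a minimal sketch that only builds the same git invocations as argument lists. It is not openQA's implementation: the function name and parameters are hypothetical, the branch is passed in rather than detected, and the GIT_SSH_COMMAND environment seen in the log is left out.

```python
from typing import List


def git_update_commands(path: str, url: str, branch: str,
                        clone_exists: bool) -> List[List[str]]:
    """Build the git invocations seen in the log, in order.

    Hypothetical helper for illustration only: nothing is executed.
    """
    # The log starts by asking the remote for its HEAD.
    cmds = [["git", "ls-remote", "--symref", url, "HEAD"]]
    # A clone is only needed when the target path has no checkout yet.
    if not clone_exists:
        cmds.append(["git", "clone", url, path])
    cmds += [
        ["git", "-C", path, "remote", "get-url", "origin"],
        ["git", "-C", path, "branch", "--show-current"],
        ["git", "-C", path, "fetch", "origin", branch],
        # This is the step that discards local changes in the checkout.
        ["git", "-C", path, "reset", "--hard", f"origin/{branch}"],
    ]
    return cmds
```

Calling it with clone_exists=True leaves out the clone step, which matches the case where a checkout is already present and only needs to be brought up to date.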

Take a look at https://github.com/os-autoinst/openQA/commit/a38165bd7576d7ebba5497478636e176fb83daa1 too.

[0] https://metacpan.org/dist/Minion/view/lib/Minion/Guide.pod

Actions #18

Updated by tinita 2 months ago

  • Status changed from Feedback to In Progress
  • Assignee changed from ybonatakis to tinita
Actions #19

Updated by tinita 2 months ago

It looks like we can just use the OpenQA::Task::Git::Clone minion job for this as well, with some changes.
https://github.com/os-autoinst/openQA/commit/a38165bd7576d7ebba5497478636e176fb83daa1
I made some minor adjustments, so that instead of passing

/path/to/share/tests/foo: git-url
/path/to/share/tests/foo/needles: git-url

you can leave the values empty (when no CASEDIR / NEEDLES_DIR is set):

/path/to/share/tests/foo: null
/path/to/share/tests/foo/needles: null

and then it will only do a git fetch etc. if the clone already exists in the given path.
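
The rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the minion job: plan_actions and the action names are hypothetical, and the set of existing checkouts is passed in instead of being detected on disk.

```python
from typing import Dict, Optional, Set


def plan_actions(clones: Dict[str, Optional[str]],
                 existing: Set[str]) -> Dict[str, str]:
    """Decide per path what a job would do (hypothetical sketch).

    clones maps a checkout path to a git URL, or to None when no
    CASEDIR / NEEDLES_DIR was given. existing holds the paths that
    already contain a clone.
    """
    plan = {}
    for path, url in clones.items():
        if path in existing:
            # Clone is already there: only fetch and update it.
            plan[path] = "update"
        elif url is not None:
            # No clone yet, but a URL is known: clone it.
            plan[path] = "clone"
        else:
            # No clone and no URL: nothing to do.
            plan[path] = "skip"
    return plan
```

With both values left empty and only /path/to/share/tests/foo present on disk, the sketch would update that checkout and skip the missing needles directory.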

One thing to consider: If people are working with a git repo directly in /path/to/share/tests/foo (or a symlink to their clone), then the minion job should probably not do anything in there. Most commands are harmless, except the git reset --hard.
Question is, how to find out if the git clone is really just a clone or the actual developer git. Checking for a dirty status might not be enough.
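
To make the dirty-status check concrete, here is a minimal sketch based on `git status --porcelain`, which prints one line per changed or untracked file and nothing for a clean tree. The function names are hypothetical and this is not necessarily how openQA performs the check.

```python
import subprocess


def porcelain_is_dirty(porcelain_output: str) -> bool:
    """Return True if `git status --porcelain` output lists any entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def checkout_is_dirty(path: str) -> bool:
    """Run `git status --porcelain` in path and report whether it is dirty."""
    result = subprocess.run(
        ["git", "-C", path, "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    return porcelain_is_dirty(result.stdout)
```

This also shows why the check might not be enough: a developer clone with a clean working tree but local commits that were never pushed produces empty output here, so a following git reset --hard would still move the branch away from those commits.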

Actions #20

Updated by tinita 2 months ago

https://github.com/os-autoinst/openQA/pull/5808 Proof of Concept: Support git_auto_clone for empty CASDIR/NEEDLES_DIR

To test:

openqa.ini:

[scm git]
git_auto_clone = yes

Clone https://github.com/os-autoinst/os-autoinst-distri-openQA into /path/to/openqa/share/tests/openqa.
Start openqa daemons including gru.
Then run

openqa-clone-job --host http://localhost:9526 https://openqa.opensuse.org/tests/4360561 _TRIGGER_JOB_DONE_HOOK=0 SCHEDULE=tests/install/boot CASEDIR=

In the gru log you should see all executed git commands.

  • Editing a file in the checkout results in the minion job not updating the repo because of a dirty status
  • An empty CASEDIR/NEEDLES_DIR will auto clone, but explicitly specifying a local path will not

Todos

% git status
fatal: .git/index: index file smaller than expected
  • save_needle code should be reviewed. Maybe code can be reused. OpenQA::Task::Needle::Save uses OpenQA::Git, OpenQA::Task::Git::Clone doesn't
  • How to prevent updating a user's working git clone?

Since git_auto_clone is currently not enabled on o3/osd, we can roll out the feature, enable it and monitor the result.
Important: fetchneedles needs to be disabled at that point.

Actions #21

Updated by tinita 2 months ago

We could already add the git gc and git diff-index to the current code.

Actions #22

Updated by tinita 2 months ago

  • Copied to action #164883: Use same minion guard for save_needle, delete_needles and git_clone size:S added
Actions #23

Updated by tinita 2 months ago

  • Copied to action #164886: Use OpenQA::Git for all our git wrappers size:S added
Actions #24

Updated by tinita 2 months ago

  • Copied to action #164889: Ensure git repos cloned by minions are cleaned up regularly size:S added
Actions #25

Updated by tinita 2 months ago

  • Related to action #164895: o3 had corrupted needles git repo, lost uncommitted needles between 2024-07-31 and 2024-08-02 added
Actions #26

Updated by tinita 2 months ago

  • Copied to action #164898: Replace fetchneedles with a minion job size:M added
Actions #27

Updated by tinita 2 months ago

  • Status changed from In Progress to Feedback

I created #164898 as the main followup and 3 others as smaller preparation tickets that should be done before.
For the problem with the git index in the needle checkout see #164895

Actions #28

Updated by tinita 2 months ago

  • Status changed from Feedback to Resolved

No comments, so I guess this is resolved

Actions #29

Updated by okurz 17 days ago

  • Due date deleted (2024-08-13)