coordination #19720
Updated by okurz over 4 years ago
## motivation

Make job failure investigation easier to save time and ensure we do not miss failures

## ideas

* DONE: <del>See https://github.com/okurz/openqa_review/tree/feature/investigate especially https://github.com/okurz/openqa_review/blob/feature/investigate/openqa_review/investigate.py -> https://github.com/os-autoinst/scripts/pull/23</del> -> [gh#os-autoinst/scripts#24](https://github.com/os-autoinst/scripts/pull/24)
* all package changes, e.g. save the output of `rpm -qa` in a file and provide a diff and/or changelog for the worker *and* the SUT. For openSUSE, e.g. read the changelog diff from https://openqa.opensuse.org/snapshot-changes/opensuse/Tumbleweed/ , for SLE from http://xcdchk.suse.de/
* changes on worker:
  * see #19720#note-14 and #19720#note-15 for preliminary evaluations
  * 1st step: Collect the data, e.g. if a hook is defined in the worker config, call the command on the worker at the end of the job, save the data in a text file, upload the text file (same as other log files)
  * 2nd step: Show the diff in the "investigate" route if the file(s) exist, same as we do for vars.json
* diff of test schedule
* if the best needle candidate matches 0% it is most likely not a trivial needle issue
* Make "last good" a link to a job instead of a plain job ID
* collapse the content of the initial rows in the investigation tab when the content becomes too big, e.g. more than 10 lines
* In the settings table mark the origin of settings and changed settings, e.g. for setting "foo" instead of the table row "foo | 1" one could have
  * "foo | 1 (testsuites table)" when the setting comes from the test suites database table, e.g. as opposed to job templates, machines, etc. This would also help when we allow even more sources for settings, e.g. load job templates from test distributions in parallel to database tables
  * update the settings table from vars.json after the job run to include changes, but then show which settings changed since the job was initially created
  * "foo | 1 (+)" when the setting is new in the scenario, with the table row and/or "(+)" in green (as in common colored diffs), and on hover it shows the explanation that this was added, linked to the commit, showing which job it compares against
  * "foo | 1 (<->)" or similar when the setting changed against "last good" where it was e.g. 0, with "(<->)" being a link to the "last good" job, with the table row in a different color
* DONE: <del>provide more data in the job logs itself for https://progress.opensuse.org/projects/openqav3/wiki#Further-decision-steps-working-on-test-issues</del>
  * <del>e.g. [gh#os-autoinst/os-autoinst#805](https://github.com/os-autoinst/os-autoinst/pull/805) -> providing also git hash for needles repo so we could also compare the differences in needles</del> -> [gh#os-autoinst/openQA#2625](https://github.com/os-autoinst/openQA/pull/2625)
* DONE: <del>provide diff of failed job vs. "last good"</del>
  * DONE: <del>git log or diff for test+needle changes</del> -> [gh#os-autoinst/openQA#2566](https://github.com/os-autoinst/openQA/pull/2566) and [gh#os-autoinst/openQA#2609](https://github.com/os-autoinst/openQA/pull/2609) for test diff, [gh#os-autoinst/openQA#2625](https://github.com/os-autoinst/openQA/pull/2625) for needles
  * <del>list of changed files</del>
  * DONE: <del>exclude context in vars.json diff, distinguish change and add/remove</del> -> [gh#os-autoinst/openQA#2625](https://github.com/os-autoinst/openQA/pull/2625)
  * DONE: <del>exclude merges from test git log</del> -> [gh#os-autoinst/openQA#2625](https://github.com/os-autoinst/openQA/pull/2625)
* DONE: <del>The Investigation tab should use CodeMirror to render diffs like we do for test sources or the YAML editor (from #61103)</del> -> Using a nicer table instead should be good enough [gh#os-autoinst/openQA#2644](https://github.com/os-autoinst/openQA/pull/2644)