action #57143
closed
[YAML] Editor does not check if same combination of test suite/arch/flavor/version already used in different job group
Description
In the legacy Job Group Editor you were not able to add a second combination of the same test suite + arch + flavor + version. This was also not possible if the job group was different; in that case you got an SQL error (see the related ticket).
In the YAML Job Group editor you can have the same 'test suite + arch + flavor + version' combination twice in different job groups. E.g. on OSD we currently have:
defaults:
  x86_64:
    machine: 64bit
    priority: 50
products:
  sle-15-SP2-Installer-DVD-x86_64:
    distri: sle
    flavor: Installer-DVD
    version: 15-SP2
scenarios:
  x86_64:
    sle-15-SP2-Installer-DVD-x86_64:
    - wicked_basic_sut
    - wicked_advanced_ref
    - wicked_advanced_sut
    - wicked_startandstop_sut
    - wicked_startandstop_ref
    - wicked_basic_ref
and
defaults:
  aarch64:
    machine: aarch64
    priority: 50
  x86_64:
    machine: 64bit
    priority: 50
products:
  sle-15-SP2-Installer-DVD-aarch64: &products
    distri: sle
    flavor: Installer-DVD
    version: 15-SP2
  sle-15-SP2-Installer-DVD-x86_64:
    *products
scenarios:
  aarch64:
    sle-15-SP2-Installer-DVD-aarch64: &tests
    - wicked_basic_sut: &general_settings
        settings:
          DESKTOP: textmode
          EXTRATEST: wicked
          KEEP_GRUB_TIMEOUT: '1'
          VIDEOMODE: text
          WICKED_TCPDUMP: '1'
          VIRTIO_CONSOLE_NUM: '2'
    - wicked_advanced_ref:
        *general_settings
    - wicked_advanced_sut:
        *general_settings
    - wicked_startandstop_sut:
        *general_settings
    - wicked_startandstop_ref:
        *general_settings
    - wicked_basic_ref:
        *general_settings
    - wicked_aggregate_sut:
        *general_settings
    - wicked_aggregate_ref:
        *general_settings
    - create_hdd_autoyast_wicked:
        settings:
          AUTOYAST: autoyast_sle15/autoyast_wicked_%ARCH%.xml
        priority: 45
  x86_64:
    sle-15-SP2-Installer-DVD-x86_64:
      *tests
in the job groups 117 (SLE 15 Development -> Network) and 262 (SLE 15 -> Network).
This leads to the job being scheduled twice when posting a new ISO. Considering that the job templates actually exist twice this is expected behavior. The question is whether we want to allow the same 'test suite + arch + flavor + version' combination in different job groups.
(Note that the ticket from @asmorodskyi implies that the new editor should check this as the old editor did. I personally would leave this open for discussion.)
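To make the overlap described above concrete, here is a minimal sketch of how the same 'test suite + arch + flavor + version' combination could be detected across job groups. It is not the check openQA implements. The helper names `scenario_keys` and `find_cross_group_duplicates`, the group labels `first` and `second`, and the trimmed-down templates are all assumptions made for illustration; the templates are written as already-parsed Python dicts so that no YAML library is needed.

```python
from collections import defaultdict


def scenario_keys(template):
    """Yield (test_suite, arch, flavor, version) for every scenario in one
    parsed job group template (hypothetical helper, not part of openQA)."""
    products = template["products"]
    for arch, by_product in template["scenarios"].items():
        for product_name, tests in by_product.items():
            product = products[product_name]
            for entry in tests:
                # An entry is either a plain test suite name or a
                # one-key mapping of the form {name: {settings...}}.
                name = entry if isinstance(entry, str) else next(iter(entry))
                yield (name, arch, product["flavor"], product["version"])


def find_cross_group_duplicates(groups):
    """Return the combinations that occur in more than one job group."""
    seen = defaultdict(set)
    for group_label, template in groups.items():
        for key in scenario_keys(template):
            seen[key].add(group_label)
    return {key: sorted(labels) for key, labels in seen.items() if len(labels) > 1}


# Trimmed versions of the two templates from the description.
product = {"distri": "sle", "flavor": "Installer-DVD", "version": "15-SP2"}
tests = [
    {"wicked_basic_sut": {"settings": {"EXTRATEST": "wicked"}}},
    {"wicked_aggregate_sut": {"settings": {"EXTRATEST": "wicked"}}},
]
groups = {
    "first": {
        "products": {"sle-15-SP2-Installer-DVD-x86_64": product},
        "scenarios": {
            "x86_64": {
                "sle-15-SP2-Installer-DVD-x86_64": ["wicked_basic_sut", "wicked_basic_ref"]
            }
        },
    },
    "second": {
        "products": {
            "sle-15-SP2-Installer-DVD-aarch64": product,
            "sle-15-SP2-Installer-DVD-x86_64": product,
        },
        "scenarios": {
            "aarch64": {"sle-15-SP2-Installer-DVD-aarch64": tests},
            "x86_64": {"sle-15-SP2-Installer-DVD-x86_64": tests},
        },
    },
}

duplicates = find_cross_group_duplicates(groups)
for key, labels in duplicates.items():
    print(key, "is defined in groups", labels)
```

With this trimmed data only `wicked_basic_sut` on x86_64 is reported, because it is the only combination present in both templates; the aarch64 scenarios exist in one group only.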
Updated by mkittler about 5 years ago
- Related to action #15192: [tools]DB exception popup while trying to add Test Suite with same name added
Updated by mkittler about 5 years ago
- Subject changed from [YAML] Editor does not check if such combination of test suite/arch/flavor/version already in use to [YAML] Editor does not check if same combination of test suite/arch/flavor/version already used in different job group
- Description updated (diff)
Updated by livdywan about 5 years ago
- So we have
  add_unique_constraint([qw(product_id machine_id name test_suite_id)])
  in the code right now.
- When updating job templates we do
  $schema->resultset('JobTemplates')->find_or_create()
  which fails on non-unique combinations of the above within the same group.
- There's no error from the database for different job groups.
Updated by livdywan about 5 years ago
- Status changed from New to In Progress
- Assignee set to livdywan
I'm investigating this now. If the above doesn't make sense, please ignore. These were just quick notes to keep a record of the investigation.
Updated by livdywan about 5 years ago
Updated by livdywan about 5 years ago
- Status changed from In Progress to Resolved
Updated by livdywan about 5 years ago
The fix I just merged enforces the correct checks for unique combinations across groups.
Note that no immediate changes will result from that when it's deployed, but the editor will require affected groups to be updated the next time they're modified.
Updated by okurz about 5 years ago
- Category set to Regressions/Crashes
- Status changed from Resolved to Feedback
It seems that on o3 the migration did not show any problems, at least I am not aware of any and have not seen any, but on OSD we had:
Sep 27 07:34:32 openqa systemd[1]: Started The openQA web UI.
Sep 27 07:34:35 openqa openqa[27056]: failed to run SQL in /usr/share/openqa/script/../dbicdh/PostgreSQL/upgrade/81-82/001-update.sql: DBIx::Class::DeploymentHandler::DeployMethod::SQL::Translator::try {...} (): DBI Exceptio>
Sep 27 07:34:35 openqa openqa[27056]: DETAIL: Key (product_id, machine_id, name, test_suite_id)=(339, 60, , 1652) already exists. at inline delegation in DBIx::Class::DeploymentHandler for deploy_method->upgrade_single_step>
Sep 27 07:34:35 openqa openqa[27056]: (running line 'update job_templates set name='' where name is null') at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/DeployMethod/SQL/Translator.pm line 248.
Sep 27 07:34:35 openqa openqa[27056]: DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa>
Sep 27 07:34:35 openqa systemd[1]: openqa-webui.service: Main process exited, code=exited, status=255/n/a
Sep 27 07:34:35 openqa systemd[1]: openqa-webui.service: Unit entered failed state.
Sep 27 07:34:35 openqa systemd[1]: openqa-webui.service: Failed with result 'exit-code'.
Sep 27 07:35:32 openqa systemd[1]: Started The openQA web UI.
Sep 27 07:35:33 openqa systemd[1]: Stopping The openQA web UI...
Sep 27 07:35:33 openqa systemd[1]: Stopped The openQA web UI.
Sep 27 07:35:33 openqa systemd[1]: Started The openQA web UI.
Sep 27 07:35:35 openqa openqa[27391]: failed to run SQL in /usr/share/openqa/script/../dbicdh/PostgreSQL/upgrade/81-82/002-auto.sql: DBIx::Class::DeploymentHandler::DeployMethod::SQL::Translator::try {...} (): DBI Exception:>
Sep 27 07:35:35 openqa openqa[27391]: (running line 'ALTER TABLE job_templates ALTER COLUMN name SET NOT NULL') at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/DeployMethod/SQL/Translator.pm line 248.
Sep 27 07:35:35 openqa openqa[27391]: DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa>
Sep 27 07:35:35 openqa openqa[27391]: DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa>
Sep 27 07:35:35 openqa systemd[1]: openqa-webui.service: Main process exited, code=exited, status=255/n/a
I manually disabled the two files to bring the web UI up again. Please provide a fix also for the running instance on osd.
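The log is consistent with how SQL unique constraints treat NULL: rows whose `name` is NULL do not collide with each other, so duplicates can coexist until the migration rewrites NULL to an empty string. The following sketch reproduces that effect in isolation. It uses SQLite as a stand-in for PostgreSQL and a heavily simplified table; the `group_id` column and its values are assumptions for illustration, while the key values are taken from the log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table; only the constrained columns matter.
conn.execute(
    """
    CREATE TABLE job_templates (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        machine_id INTEGER NOT NULL,
        name TEXT,
        test_suite_id INTEGER NOT NULL,
        group_id INTEGER NOT NULL,  -- assumed column, not part of the constraint
        UNIQUE (product_id, machine_id, name, test_suite_id)
    )
    """
)

insert = (
    "INSERT INTO job_templates "
    "(product_id, machine_id, name, test_suite_id, group_id) VALUES (?, ?, ?, ?, ?)"
)
conn.execute(insert, (339, 60, None, 1652, 117))
# Accepted: two NULL names count as distinct for the unique constraint.
conn.execute(insert, (339, 60, None, 1652, 262))
count = conn.execute("SELECT COUNT(*) FROM job_templates").fetchone()[0]

try:
    # Same statement as in the failing migration step.
    conn.execute("UPDATE job_templates SET name = '' WHERE name IS NULL")
    migration_error = None
except sqlite3.IntegrityError as exc:
    # Both rows would now share the key (339, 60, '', 1652).
    migration_error = str(exc)

print("rows before migration:", count)
print("migration error:", migration_error)
```

Once the duplicates are removed from one of the groups, the same UPDATE goes through, which matches the manual cleanup approach discussed further down in this ticket.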
Updated by coolo about 5 years ago
- Assignee changed from livdywan to okurz
there is no code fix to be provided - the admin of the site will have to sort out the duplicates and decide which one wins. And as you decided to deploy on a friday morning, that would be you
Updated by coolo about 5 years ago
On a friday morning you're reportedly on vacation - I forgot to add
Updated by okurz about 5 years ago
half-day vacation, same as for the whole week as well as next week. Thank you for encouraging the team to take more responsibility ;)
Updated by okurz about 5 years ago
- Status changed from Feedback to Resolved
coolo wrote:
On a friday morning you're reportedly on vacation - I forgot to add
half-day vacation, same as for the whole week as well as next week. Thank you for encouraging the team to take more responsibility ;)
Note that neither the ticket nor the PR state that the final version did not include an automatic remedy. Also I would have expected that the openQA instance would have been updated accordingly upfront.
I checked the database manually and resolved the duplicate job templates across job groups. These were mainly some lvm and cryptlvm scenarios which were defined both in the production YaST job groups and in test development, so I simply deleted them from the test development job groups. Then I ran the content of the migration scripts again to apply the changes.
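A manual check of this kind could look roughly like the sketch below. These are not the queries that were actually run on OSD. The sketch again uses SQLite with a simplified `job_templates` table; the `group_id` column, the group numbers 1 and 2, and the second test suite id are made up for illustration. On PostgreSQL, `GROUP_CONCAT` would need to be replaced by an equivalent aggregate such as `string_agg`.

```python
import sqlite3

# Group by the constrained columns, treating NULL names like '' the way the
# migration does, and report every key that occurs more than once.
FIND_DUPLICATES = """
    SELECT product_id, machine_id, COALESCE(name, '') AS template_name,
           test_suite_id, COUNT(*) AS copies, GROUP_CONCAT(group_id) AS group_ids
    FROM job_templates
    GROUP BY product_id, machine_id, COALESCE(name, ''), test_suite_id
    HAVING COUNT(*) > 1
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_templates ("
    "product_id INTEGER, machine_id INTEGER, name TEXT, "
    "test_suite_id INTEGER, group_id INTEGER)"
)
conn.executemany(
    "INSERT INTO job_templates VALUES (?, ?, ?, ?, ?)",
    [
        (339, 60, None, 1652, 1),  # hypothetical production group
        (339, 60, None, 1652, 2),  # hypothetical test development group
        (339, 60, None, 1700, 1),  # unique, must not be reported
    ],
)

duplicates = conn.execute(FIND_DUPLICATES).fetchall()
for row in duplicates:
    print("duplicate key:", row[:4], "copies:", row[4], "groups:", row[5])

# Decide which group wins and delete the template from the other one.
conn.execute(
    "DELETE FROM job_templates WHERE product_id = ? AND machine_id = ? "
    "AND COALESCE(name, '') = ? AND test_suite_id = ? AND group_id = ?",
    (339, 60, "", 1652, 2),
)
remaining = conn.execute(FIND_DUPLICATES).fetchall()
print("duplicates left:", len(remaining))
```

Which group keeps the template is a decision for the instance admin; the sketch only shows how the conflicting rows can be listed before re-running the migration.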
Updated by mkittler about 5 years ago
- Category deleted (Regressions/Crashes)
- Status changed from Resolved to In Progress
- Assignee changed from okurz to livdywan
Note that neither the ticket nor the PR state that the final version did not include an automatic remedy.
@okurz I mentioned the problem: "I'm wondering whether it is worth/required to add a migration for detecting job templates which are wrongly "shared" between job groups." (https://github.com/os-autoinst/openQA/pull/2345#pullrequestreview-292266649-body-html)
I checked the database manually and resolved the duplicate job templates across job groups. These were mainly some lvm and cryptlvm scenarios which were defined both in the production YaST job groups and in test development, so I simply deleted them from the test development job groups. Then I ran the content of the migration scripts again to apply the changes.
Thanks. I guess then we can save the work of implementing an automatic database migration for this.
This migration should have been tested with recent OSD data (like almost every migration).
Updated by okurz about 5 years ago
- Category set to Regressions/Crashes
- Status changed from In Progress to Resolved
- Target version set to Current Sprint
@mkittler I think you have reopened the ticket by mistake because you even removed the category.