action #10966

closed

action #10960: current performance problems on o.s.d

SLES Build1204 overview page times out with "502 Bad Gateway"

Added by okurz about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version: -
Start date: 2016-02-28
Due date:
% Done: 0%
Estimated time:
Description

observation

Accessing https://openqa.suse.de/tests/overview?distri=sle&version=12-SP2&build=1204&groupid=25 returns a "502 Bad Gateway" error after some time.

From /var/logs/openqa on o.s.d:

[Sun Feb 28 09:37:36 2016][26145][debug] [DBIx debug] Took 8.41389394 seconds executed: SELECT me.id, me.slug, me.result_dir, me.state, me.priority, me.result, me.worker_id, me.test, me.clone_id, me.retry_avbl, me.backend, me.backend_info, me.group_id, me.t_started, me.t_finished, me.t_created, me.t_updated, settings.id, settings.key, settings.value, settings.job_id, settings.t_created, settings.t_updated, parents.child_job_id, parents.parent_job_id, parents.dependency, children.child_job_id, children.parent_job_id, children.dependency FROM jobs me LEFT JOIN job_settings settings ON settings.job_id = me.id LEFT JOIN job_dependencies parents ON parents.child_job_id = me.id LEFT JOIN job_dependencies children ON children.parent_job_id = me.id WHERE ( ( me.clone_id IS NULL AND me.group_id = ? AND me.id IN ( SELECT me.job_id FROM job_settings me LEFT JOIN job_settings siblings ON siblings.job_id = me.job_id LEFT JOIN job_settings siblings_2 ON siblings_2.job_id = me.job_id WHERE ( ( ( me.key = ? AND me.value = ? ) AND ( siblings.key = ? AND siblings.value = ? ) AND ( siblings_2.key = ? AND siblings_2.value = ? ) ) ) ) ) ) ORDER BY me.id DESC: '25', 'VERSION', '12-SP2', 'DISTRI', 'sle', 'BUILD', '1204'.
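For illustration, the inner subquery in the log line above selects the jobs that carry all three settings (VERSION, DISTRI, BUILD) by joining job_settings to itself twice. The following is a minimal sketch, not openQA code: it uses Python's sqlite3 with a hypothetical four-job toy table (only the job_id, key and value columns from the logged query) to show that query shape next to a GROUP BY/HAVING variant that returns the same job ids. The helper names, the sample rows and the added DISTINCT/ORDER BY are assumptions made for the comparison; whether the grouped form would actually be faster on the o.s.d database is untested here and would need measuring.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory toy job_settings table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_settings ("
        "id INTEGER PRIMARY KEY, job_id INTEGER, key TEXT, value TEXT)"
    )
    rows = [
        (1, "VERSION", "12-SP2"), (1, "DISTRI", "sle"), (1, "BUILD", "1204"),
        (2, "VERSION", "12-SP2"), (2, "DISTRI", "sle"), (2, "BUILD", "1203"),
        (3, "VERSION", "12-SP2"), (3, "DISTRI", "sle"), (3, "BUILD", "1204"),
        (4, "VERSION", "12-SP1"), (4, "DISTRI", "sle"), (4, "BUILD", "1204"),
    ]
    conn.executemany(
        "INSERT INTO job_settings (job_id, key, value) VALUES (?, ?, ?)", rows
    )
    return conn

# Same shape as the logged subquery: two self-joins on job_settings.
# DISTINCT and ORDER BY are added here only to make results comparable.
SELF_JOIN = """
SELECT DISTINCT me.job_id FROM job_settings me
LEFT JOIN job_settings siblings ON siblings.job_id = me.job_id
LEFT JOIN job_settings siblings_2 ON siblings_2.job_id = me.job_id
WHERE me.key = ? AND me.value = ?
  AND siblings.key = ? AND siblings.value = ?
  AND siblings_2.key = ? AND siblings_2.value = ?
ORDER BY me.job_id
"""

# Alternative sketch: a single scan, keeping jobs that match all three keys.
# Assumes the three keys are distinct from each other.
GROUPED = """
SELECT job_id FROM job_settings
WHERE (key = ? AND value = ?)
   OR (key = ? AND value = ?)
   OR (key = ? AND value = ?)
GROUP BY job_id
HAVING COUNT(DISTINCT key) = 3
ORDER BY job_id
"""

# Same bind values as in the log line (without the group id).
PARAMS = ("VERSION", "12-SP2", "DISTRI", "sle", "BUILD", "1204")

def matching_jobs(conn, sql, params=PARAMS):
    """Run one of the queries above and return the matching job ids."""
    return [row[0] for row in conn.execute(sql, params)]

if __name__ == "__main__":
    conn = build_demo_db()
    print(matching_jobs(conn, SELF_JOIN))
    print(matching_jobs(conn, GROUPED))
```

On the toy data both queries select the same jobs; this says nothing yet about the real table sizes or indexes on o.s.d.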

Steps to reproduce

Problem

The logs show the very long query time (see observation) but no other obvious error message. I am trying to reproduce this locally, and it looks like this is possible: the very long query from above also appears there (twice), followed by a subsequent query:

Took 12.04192710 seconds executed: SELECT me.job_id, me.result, me.soft_failure, COUNT( id ) FROM job_modules me WHERE <super_long_job_list>

Finally, when run locally, the request finishes without a timeout after 111.849923s. See also parent task #10960. The slow rendering is not caused by any change since 97b8d92; rather, either the database changed its layout or content since then, or something in the data of new builds is very different from before, e.g. many more jobs for external reasons.
