action #42359
closed
[functional][y][sporadic] test fails in gnuhealth_install to write to /dev/ttyS0 "Input/output error" while serial getty seems to restart at the same time?
Added by okurz over 5 years ago.
Updated over 5 years ago.
Category:
Bugs in existing tests
Target version:
SUSE QA - Milestone 21
Description
Observation
openQA test in scenario opensuse-15.1-DVD-x86_64-gnuhealth@64bit fails in gnuhealth_install trying to write to /dev/ttyS0 with "Input/output error" while serial getty seems to restart at the same time?
https://openqa.opensuse.org/tests/771001/file/gnuhealth_install-journal.log shows the following lines at the time of the screenshot (20:20 - 20:22):
Oct 10 20:21:00 susetest su[1851]: pam_unix(su:session): session opened for user root by bernhard(uid=1000)
Oct 10 20:21:01 susetest systemd[1]: serial-getty@ttyS0.service: Service hold-off time over, scheduling restart.
Oct 10 20:21:01 susetest systemd[1]: Stopped Serial Getty on ttyS0.
Oct 10 20:21:01 susetest systemd[1]: Started Serial Getty on ttyS0.
Shouldn't the getty process have been masked before, e.g. by https://openqa.opensuse.org/tests/770994#step/system_prepare/1 ? Neither the parent job creating the image nor the downstream job calls "consoletest_setup", which looks ok judging by the name alone, because we do not want to run any pure "consoletests" in gnuhealth.
Good reading regarding how it works: http://0pointer.de/blog/projects/serial-console.html
Reproducible
Fails since (at least) Build 315.1 (current job)
Expected result
Last good: 314.2 (or more recent)
Further details
Always latest result in this scenario: latest
- Subject changed from [functional][u][sporadic] test fails in gnuhealth_install to write to /dev/ttyS0 "Input/output error" while serial getty seems to restart at the same time? to [functional][y][sporadic] test fails in gnuhealth_install to write to /dev/ttyS0 "Input/output error" while serial getty seems to restart at the same time?
- Status changed from New to Workable
- Assignee set to riafarov
- Target version set to Milestone 20
@riafarov could you take a look please because you were involved with masking the serial getty services?
- Assignee changed from riafarov to okurz
@okurz, stopping serial-getty is done in consoletest_setup, which is scheduled neither in the given test suite nor in createhdd, so it's not related. So I guess we need to investigate further whether it's a product bug, and potentially make this change in system_prepare, which doesn't sound 100% right to me.
- Due date set to 2018-11-20
- Assignee deleted (okurz)
Thank you for your explanation. I still suspect that some change altered the test schedule, so let's take a look soon.
- Due date changed from 2018-11-20 to 2018-11-06
- Priority changed from Normal to Urgent
So, I found an easy way to reproduce the error.
Start writing to ttyS0 in an infinite loop: while true; do echo bb > /dev/ttyS0 ; done
Run systemctl stop serial-getty@ttyS0
"Input/output error" is shown for every attempt to write to the tty in the loop. Once the command is stopped and started again, it works fine again.
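The reproducer above can be wrapped in a bounded loop that counts failed writes instead of running forever. This is only a sketch: the function name `write_loop` and its two parameters are made up for illustration and are not part of the test code.

```shell
#!/bin/sh
# write_loop DEVICE COUNT
# Writes the marker "bb" to DEVICE COUNT times, reopening the device on
# every iteration (like the one-liner above), and prints how many of the
# writes failed. Name and parameters are hypothetical.
write_loop() {
    device=$1
    count=$2
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # The braces let us silence both open and write errors.
        if ! { echo bb > "$device"; } 2>/dev/null; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed"
}
```

Running `write_loop /dev/ttyS0 100000` as root in one terminal and `systemctl stop serial-getty@ttyS0` in another should report a non-zero failure count if the behaviour described above is present.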
- Priority changed from Urgent to High
So I've updated the bug with my findings. It's a regression; hopefully we'll get some feedback there. As of now, the most popular solution on the web is to disable the getty service. The problem with this approach is that we rely on the welcome messages to detect that the system has booted up. Also, for SLE 15 we failed before reaching the step where we stop the serial-getty service.
After discussion with @okurz, we came up with 2 potential scalable solutions:
- Static: Always disable serial-getty before writing to serial, and use ssh to detect that the system has booted up
- Dynamic: Use the same approach as for activating ttys, but for serial devices. Before writing to a serial device, trigger some activation method. In our case it should set permissions and handle the serial-getty service. Potentially, more steps.
Other options are hacks, which we will try to avoid as the issue is not new and is not that critical yet.
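A minimal sketch of what such an activation step could look like in shell, assuming that stopping and masking the unit plus opening up the device permissions is enough. The function name `activate_serial` and the `SYSTEMCTL` and `DEV_DIR` overrides are hypothetical and exist only so the step can be dry-run; this is not the actual openQA implementation, and `chmod a+rw` is an assumption about what "set permissions" means here.

```shell
#!/bin/sh
# Overridable for dry runs; the defaults point at the real system.
SYSTEMCTL=${SYSTEMCTL:-systemctl}
DEV_DIR=${DEV_DIR:-/dev}

# activate_serial TTY   (e.g. activate_serial ttyS0)
# Stops and masks the serial getty for TTY so systemd cannot restart it,
# then makes the device writable for all users.
activate_serial() {
    tty=$1
    unit="serial-getty@${tty}.service"
    dev="${DEV_DIR}/${tty}"
    [ -e "$dev" ] || { echo "no such device: $dev" >&2; return 1; }
    "$SYSTEMCTL" stop "$unit" || return 1
    "$SYSTEMCTL" mask "$unit" || return 1
    chmod a+rw "$dev" || return 1
}
```

Masking after stopping is meant to prevent the "scheduling restart" seen in the journal. Note that this gives up the login prompt on the serial line, so boot detection would have to rely on something else, as in the static option.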
- Description updated (diff)
- Status changed from Workable to Feedback
@zluo has a setup where he can reproduce this issue all the time. On Monday I will reuse it to figure out a better way to work around the bug.
- Due date changed from 2018-11-06 to 2018-11-20
- Related to action #39575: [functional][u] Split consoletest_setup to smaller parts which serve single purpose per module added
- Status changed from Feedback to In Progress
- Estimated time set to 3.00 h
- Status changed from In Progress to Feedback
- Due date changed from 2018-11-20 to 2018-12-04
The change was reverted; we need to reduce the scope of when the workaround is applied.
- Status changed from Feedback to In Progress
- Status changed from In Progress to Feedback
- Target version changed from Milestone 20 to Milestone 21
Yes, the PR is still not merged. Also, I don't think that we should label the job with the progress ticket, because it's a workaround for a bug.
- Due date changed from 2018-12-04 to 2018-12-18
- Blocks action #44186: [functional][u] test fails in kdump_and_crash - Login prompt appeared in serial console output unexpectedly added
- Status changed from Feedback to Resolved