action #123193
closed
02-test_ocr.t fails in OBS size:M
Description
Observation
The build on Tumbleweed started to fail on January 14:
https://build.opensuse.org/package/live_build_log/devel:openQA/os-autoinst/openSUSE_Factory/x86_64
[ 113s] 3: [12:46:28] ./t/02-test_ocr.t ..........................
[ 113s] 3: ok 1 - log output for needle init
[ 113s] 3: not ok 2 - log output for OCR
[ 113s] 3:
[ 113s] 3: # Failed test 'log output for OCR'
[ 113s] 3: # at ./t/02-test_ocr.t line 36.
[ 113s] 3: # STDERR:
[ 113s] 3: # Error opening data file /usr/share/tesseract-ocr/tessdata/eng.traineddata
[ 113s] 3: # Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
[ 113s] 3: # Failed loading language 'eng'
[ 113s] 3: # Tesseract couldn't load any languages!
[ 113s] 3: # Could not initialize tesseract.
[ 113s] 3: # readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # Error opening data file /usr/share/tesseract-ocr/tessdata/eng.traineddata
[ 113s] 3: # Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
[ 113s] 3: # Failed loading language 'eng'
[ 113s] 3: # Tesseract couldn't load any languages!
[ 113s] 3: # Could not initialize tesseract.
[ 113s] 3: # readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: #
[ 113s] 3: # doesn't match:
[ 113s] 3: # (?^u:Tesseract.*OCR)
[ 113s] 3: # as expected
[ 113s] 3: ok 3 - ocr match 1
[ 113s] 3: not ok 4 - log output for tesseract call
[ 113s] 3:
[ 113s] 3: # Failed test 'log output for tesseract call'
[ 113s] 3: # at ./t/02-test_ocr.t line 42.
[ 113s] 3: # STDERR:
[ 113s] 3: # Error opening data file /usr/share/tesseract-ocr/tessdata/eng.traineddata
[ 113s] 3: # Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
[ 113s] 3: # Failed loading language 'eng'
[ 113s] 3: # Tesseract couldn't load any languages!
[ 113s] 3: # Could not initialize tesseract.
[ 113s] 3: # readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # Use of uninitialized value in concatenation (.) or string at ./t/02-test_ocr.t line 42.
[ 113s] 3: #
[ 113s] 3: # doesn't match:
[ 113s] 3: # (?^u:Tesseract.*OCR)
[ 113s] 3: # as expected
[ 113s] 3: not ok 5 - log output for tesseract call
[ 113s] 3:
[ 113s] 3: # Failed test 'log output for tesseract call'
[ 113s] 3: # at ./t/02-test_ocr.t line 42.
[ 113s] 3: # STDERR:
[ 113s] 3: # Error opening data file /usr/share/tesseract-ocr/tessdata/eng.traineddata
[ 113s] 3: # Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
[ 113s] 3: # Failed loading language 'eng'
[ 113s] 3: # Tesseract couldn't load any languages!
[ 113s] 3: # Could not initialize tesseract.
[ 113s] 3: # readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # Use of uninitialized value in concatenation (.) or string at ./t/02-test_ocr.t line 42.
[ 113s] 3: #
[ 113s] 3: # doesn't match:
[ 113s] 3: # (?^u:Tesseract.*OCR)
[ 113s] 3: # as expected
[ 113s] 3: ok 6 - OCR area found
[ 113s] 3: not ok 7 - multiple OCR regions
[ 113s] 3:
[ 113s] 3: # Failed test 'multiple OCR regions'
[ 113s] 3: # at ./t/02-test_ocr.t line 45.
[ 113s] 3: not ok 8 - no (unexpected) warnings (via done_testing)
[ 113s] 3:
[ 113s] 3: # Failed test 'no (unexpected) warnings (via done_testing)'
[ 113s] 3: # at ./t/02-test_ocr.t line 48.
[ 113s] 3: # Got the following unexpected warnings:
[ 113s] 3: # 1: readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # 2: readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # 3: readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # 4: Use of uninitialized value in concatenation (.) or string at ./t/02-test_ocr.t line 42.
[ 113s] 3: # 5: readline() on closed filehandle $fh at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d/ocr.pm line 23.
[ 113s] 3: # 6: Use of uninitialized value in concatenation (.) or string at ./t/02-test_ocr.t line 42.
[ 113s] 3: 1..8
[ 113s] 3: # Looks like you failed 5 tests of 8.
[ 114s] 3: Dubious, test returned 5 (wstat 1280, 0x500)
[ 114s] 3: Failed 5/8 subtests
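The `wstat` value in the harness summary is the raw wait status of the test process. A minimal sketch of how it decodes, assuming the conventional POSIX layout (exit code in the high byte, terminating signal in the low 7 bits); the helper name `decode_wstat` is made up for illustration:

```python
def decode_wstat(wstat):
    """Split a raw wait status into (exit_code, signal_number).

    Assumes the usual POSIX encoding: exit code in bits 8-15,
    terminating signal in bits 0-6.
    """
    exit_code = (wstat >> 8) & 0xFF
    signal_number = wstat & 0x7F
    return exit_code, signal_number


# Value taken from the log line "Dubious, test returned 5 (wstat 1280, 0x500)"
print(decode_wstat(1280))
```

Test::More exits with the number of failed tests, so the decoded exit code lines up with "Failed 5/8 subtests" and no signal was involved.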
Acceptance criteria
- AC1: Test no longer fails
Suggestions
- Wait and see if an update fixes the problem
- Otherwise debug the OCR library locally
Updated by osukup over 1 year ago
Did the location of the tesseract trained data change in x86-64 Tumbleweed? Or the expected location ...
abuild@quasar:~/rpmbuild/BUILD/os-autoinst-4.6.1673533640.573778d> rpm -ql tesseract-ocr-traineddata-english
/usr/share/tessdata
/usr/share/tessdata/eng.cube.bigrams
/usr/share/tessdata/eng.cube.fold
/usr/share/tessdata/eng.cube.lm
/usr/share/tessdata/eng.cube.nn
/usr/share/tessdata/eng.cube.params
/usr/share/tessdata/eng.cube.size
/usr/share/tessdata/eng.cube.word-freq
/usr/share/tessdata/eng.tesseract_cube.nn
/usr/share/tessdata/eng.traineddata
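The mismatch is between the directory named in the error message and the directory the package actually installs into. A rough sketch for checking which candidate directory holds `eng.traineddata`; the two paths are copied from this ticket, while the function name `find_tessdata` and the idea of probing a list of candidates are only illustrative assumptions, not how os-autoinst or Tesseract resolve the path:

```python
from pathlib import Path
from typing import Iterable, Optional

# First entry: path from the "Error opening data file" message.
# Second entry: path reported by `rpm -ql tesseract-ocr-traineddata-english`.
CANDIDATES = ("/usr/share/tesseract-ocr/tessdata", "/usr/share/tessdata")


def find_tessdata(candidates: Iterable[str] = CANDIDATES,
                  lang: str = "eng") -> Optional[Path]:
    """Return the first candidate directory containing <lang>.traineddata."""
    for directory in candidates:
        if (Path(directory) / f"{lang}.traineddata").is_file():
            return Path(directory)
    return None


found = find_tessdata()
print(found if found else "no eng.traineddata found in the candidate directories")
```

On a system matching the `rpm -ql` listing above, this would be expected to report the second candidate, which is the value TESSDATA_PREFIX would need to point to.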
Updated by osukup over 1 year ago
- Status changed from New to In Progress
- Assignee set to osukup
Updated by osukup over 1 year ago
After passing TESSDATA_PREFIX to the test:
[ 93s] 3: not ok 2 - log output for OCR
[ 93s] 3:
[ 93s] 3: # Failed test 'log output for OCR'
[ 93s] 3: # at ./t/02-test_ocr.t line 36.
[ 93s] 3: # STDERR:
[ 93s] 3: # Warning: Parameter not found: enable_new_segsearch
[ 93s] 3: # Estimating resolution as 132
[ 93s] 3: # Warning: Parameter not found: enable_new_segsearch
[ 93s] 3: # Estimating resolution as 138
[ 93s] 3: #
[ 93s] 3: # doesn't match:
[ 93s] 3: # (?^u:Tesseract.*OCR)
[ 93s] 3: # as expected
[ 93s] 3: ok 3 - ocr match 1
[ 93s] 3: not ok 4 - log output for tesseract call
[ 93s] 3:
[ 93s] 3: # Failed test 'log output for tesseract call'
[ 93s] 3: # at ./t/02-test_ocr.t line 42.
[ 93s] 3: # STDERR:
[ 93s] 3: # Warning: Parameter not found: enable_new_segsearch
[ 93s] 3: # Estimating resolution as 132
[ 93s] 3: #
[ 93s] 3: # doesn't match:
[ 93s] 3: # (?^u:Tesseract.*OCR)
[ 93s] 3: # as expected
[ 94s] 3: not ok 5 - log output for tesseract call
[ 94s] 3:
[ 94s] 3: # Failed test 'log output for tesseract call'
[ 94s] 3: # at ./t/02-test_ocr.t line 42.
[ 94s] 3: # STDERR:
[ 94s] 3: # Warning: Parameter not found: enable_new_segsearch
[ 94s] 3: # Estimating resolution as 138
[ 94s] 3: #
[ 94s] 3: # doesn't match:
[ 94s] 3: # (?^u:Tesseract.*OCR)
[ 94s] 3: # as expected
[ 94s] 3: ok 6 - OCR area found
[ 94s] 3: ok 7 - multiple OCR regions
[ 94s] 3: ok 8 - no (unexpected) warnings (via done_testing)
[ 94s] 3: 1..8
[ 94s] 3: # Looks like you failed 3 tests of 8.
[ 94s] 3: Dubious, test returned 3 (wstat 768, 0x300)
[ 94s] 3: Failed 3/8 subtests
Updated by osukup over 1 year ago
According to Google, these warnings appear if the traineddata is damaged. When I swapped the data for another set from upstream, the test passed without problems ...
Updated by osukup over 1 year ago
--> So the package in Publishing contains updated data; unfortunately it never got into Factory.
New SR; we will see what needs fixing for it to get into Factory -> https://build.opensuse.org/request/show/1058890
Updated by mkittler over 1 year ago
I can reproduce it on my local TW system and also saw it in our normal CI jobs. Let's see whether the SR fixes it.
Updated by openqa_review over 1 year ago
- Due date set to 2023-02-01
Setting due date based on mean cycle time of SUSE QE Tools
Updated by osukup over 1 year ago
SR https://build.opensuse.org/request/show/1059187 was accepted to Factory, so the next snapshot will have the new training data. After the rebuild we will see if we need to pass TESSDATA_PREFIX
in the spec.
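Passing TESSDATA_PREFIX to a test run amounts to setting one environment variable for the child process. A minimal sketch, assuming the directory from the `rpm -ql` output earlier in this ticket; the helper name `run_with_tessdata_prefix` is hypothetical, and the command below is only a stand-in that echoes the variable (a real run would invoke the test itself, which is not shown here):

```python
import os
import subprocess
import sys


def run_with_tessdata_prefix(cmd, prefix):
    """Run cmd with TESSDATA_PREFIX added to a copy of the current environment."""
    env = dict(os.environ, TESSDATA_PREFIX=prefix)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in command: prints the variable as seen by the child process.
result = run_with_tessdata_prefix(
    [sys.executable, "-c", "import os; print(os.environ['TESSDATA_PREFIX'])"],
    "/usr/share/tessdata",
)
print(result.stdout.strip())
```

Copying the environment rather than replacing it keeps PATH and the other build variables intact, so only the Tesseract data lookup changes.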
Updated by livdywan over 1 year ago
- Status changed from In Progress to Feedback
Brought it up briefly. Let's set it to Feedback for now since the SR is there, and in a few days we'll see if this works fine
Updated by mkittler over 1 year ago
- Subject changed from 02-test_ocr.t fails in OBS to 02-test_ocr.t fails in OBS size:M
- Description updated (diff)
Updated by osukup over 1 year ago
- Status changed from Feedback to In Progress
The new training data is in the distro, so the spec needs to define TESSDATA_PREFIX,
but with the prefix defined it still fails with:
[ 92s] 3: not ok 2 - log output for OCR
[ 92s] 3:
[ 92s] 3: # Failed test 'log output for OCR'
[ 92s] 3: # at ./t/02-test_ocr.t line 36.
[ 92s] 3: # STDERR:
[ 92s] 3: # Estimating resolution as 132
[ 92s] 3: # Estimating resolution as 138
[ 92s] 3: #
[ 92s] 3: # doesn't match:
[ 92s] 3: # (?^u:Tesseract.*OCR)
[ 92s] 3: # as expected
[ 92s] 3: ok 3 - ocr match 1
[ 92s] 3: not ok 4 - log output for tesseract call
[ 92s] 3:
[ 92s] 3: # Failed test 'log output for tesseract call'
[ 92s] 3: # at ./t/02-test_ocr.t line 42.
[ 92s] 3: # STDERR:
[ 92s] 3: # Estimating resolution as 132
[ 92s] 3: #
[ 92s] 3: # doesn't match:
[ 92s] 3: # (?^u:Tesseract.*OCR)
[ 92s] 3: # as expected
[ 93s] 3: not ok 5 - log output for tesseract call
[ 93s] 3:
[ 93s] 3: # Failed test 'log output for tesseract call'
[ 93s] 3: # at ./t/02-test_ocr.t line 42.
[ 93s] 3: # STDERR:
[ 93s] 3: # Estimating resolution as 138
[ 93s] 3: #
[ 93s] 3: # doesn't match:
[ 93s] 3: # (?^u:Tesseract.*OCR)
[ 93s] 3: # as expected
which means there is also a change in behavior in tesseract-5.3.x
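To see why the remaining three subtests fail, the expected pattern can be checked against the captured STDERR from the logs above. This is only a sketch: the pattern and the STDERR lines are copied from this ticket, the labels and the helper name `matches_expected` are illustrative, and the message an earlier Tesseract version printed is not shown here, so it is not reproduced:

```python
import re

# Pattern from the log: (?^u:Tesseract.*OCR), matched unanchored.
EXPECTED = re.compile(r"Tesseract.*OCR")

# STDERR captured in the two failing runs quoted in this ticket.
STDERR_SAMPLES = {
    "old traineddata, prefix set": (
        "Warning: Parameter not found: enable_new_segsearch\n"
        "Estimating resolution as 132\n"
    ),
    "new traineddata, prefix set": (
        "Estimating resolution as 132\n"
        "Estimating resolution as 138\n"
    ),
}


def matches_expected(stderr: str) -> bool:
    """Return True if the captured STDERR contains the expected log text."""
    return EXPECTED.search(stderr) is not None


for label, text in STDERR_SAMPLES.items():
    print(f"{label}: {matches_expected(text)}")
```

Neither sample contains the word "Tesseract" at all, so the data fix alone cannot make these subtests pass; either the output or the test expectation has to change.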
Updated by osukup over 1 year ago
- Status changed from In Progress to Resolved