action #121354

Implement purely text based needles

Added by clanig over 1 year ago. Updated 7 months ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version: -
Start date: 2022-12-02
Due date: -
% Done: 0%
Estimated time: -

Description

Currently, OCR (via Tesseract) can be used when matching areas of screenshots, but it is not possible to create OCR-only needles.

The goal of this ticket is to provide a mechanism to match the content of the current screenshot directly against a plain character string.

The vision: A test developer could select the OCR needle area on a screenshot in the openQA web UI. The OCR software would immediately display the character string it recognizes in that area, and the developer could adjust the selected area based on that output. Finally, the developer could store the reference as a pure text file (JSON), without any accompanying .png file.
The backend would go through the corresponding tagged JSON files as usual and compare the reference character string against the character string the OCR software finds in the reference area of the screenshot of the actual test run. If the OCR software finds a different character string, the Levenshtein distance should be calculated to determine a percentage accuracy below 100%, comparable to the existing accuracy displayed in the openQA UI.

Some research revealed that this might be achievable with the OCR tool GOCR and the Perl module Text::Levenshtein (package perl-Text-Levenshtein).
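
As a rough illustration of the matching step described above, the following sketch shows how a reference string from a text-only needle could be compared against the OCR output using the Text::Levenshtein module mentioned above. This is a hypothetical example, not the code from the draft PR; the needle field name "refstr" and the data layout are made up for illustration.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use List::Util qw(max);
    use Text::Levenshtein qw(distance);

    # Hypothetical text-only needle as it might look after decoding its JSON
    # file. The "refstr" field and the overall layout are illustrative only.
    my $needle = {
        tags => ['login_prompt'],
        area => [
            {
                xpos => 100, ypos => 200, width => 300, height => 40,
                type => 'ocr', refstr => 'Welcome to openQA',
            },
        ],
    };

    # String the OCR tool (GOCR, Tesseract, ...) returned for that area of the
    # screenshot taken during the actual test run.
    my $ocr_output = 'Welcorne to openQA';    # typical OCR confusion: "rn" vs "m"

    my $refstr = $needle->{area}[0]{refstr};
    my ($dist) = distance($refstr, $ocr_output);
    my $len    = max(length($refstr), length($ocr_output), 1);

    # Turn the edit distance into a percentage accuracy below 100% when the
    # strings differ, comparable to the match level shown in the openQA UI.
    my $accuracy = 100 * (1 - $dist / $len);
    printf "Levenshtein distance: %d, accuracy: %.1f%%\n", $dist, $accuracy;

Normalizing the distance by the longer string's length is only one possible way to derive a percentage; the actual implementation may define the accuracy differently.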

Actions #1

Updated by okurz over 1 year ago

  • Project changed from openQA Infrastructure to openQA Project
  • Category set to Feature requests
  • Target version set to future

sounds cool. Moving to the right project for feature development. I just added "future" as the target version to show that the SUSE QE Tools team does not plan to work on it, as you plan to work on it yourself.

Actions #2

Updated by clanig about 1 year ago

Please find a draft along with a POC here: https://github.com/os-autoinst/os-autoinst/pull/2276

Actions #3

Updated by clanig about 1 year ago

I have updated the draft PR with the full OCR integration for the backend. I am currently working on integrating it into the frontend.

To make the creation of needles more convenient, providing functionality for live-checking the OCR result in the frontend might be considered. However, that could be a separate task.

Actions #4

Updated by clanig 10 months ago

I have noticed I made a wrong assumption about the existing implementation of the OCR functionality. I am going to check how it really works and change the PRs accordingly.
I am sure, though, that the current implementation is not optimal; in particular, I am not convinced that Tesseract is the right tool for the requirements of openQA.

Actions #5

Updated by clanig 7 months ago

The backend PR is final and ready for review from my perspective: https://github.com/os-autoinst/os-autoinst/pull/2276
