action #121354

open

Implement purely text based needles

Added by clanig over 1 year ago. Updated 7 months ago.

Status: New
Priority: Low
Assignee:
Category: Feature requests
Target version:
Start date: 2022-12-02
Due date:
% Done: 0%
Estimated time:

Description

Currently, OCR (using Tesseract) is used for matching areas of screenshots, but it is not possible to create OCR-only needles.

The goal of this ticket is to provide a mechanism to match the content of the current screenshot directly against a plain character string.

The Vision: A test developer could select the OCR needle area on a screenshot in the openQA web UI. The OCR software would immediately display the character string it recognizes in that area. The developer could adjust the selected area based on that output. Finally, the developer could create a pure text file (JSON), without any .png file, to store the reference.
The backend would go through the corresponding tagged JSON files as usual and match the reference character string against the character string the OCR software finds in the reference area of the screenshot from the actual test run. If the OCR software finds a different character string, the Levenshtein distance should be calculated to determine a percentage accuracy below 100%, comparable with the existing accuracy displayed in the openQA UI.
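To make the matching step concrete, here is a minimal sketch in Python of how a reference string could be compared with OCR output and turned into a percentage. It is only an illustration under several assumptions: the needle field name `refstr`, the area type value `ocr`, the example tag and the function names are hypothetical, and normalizing the distance by the length of the longer string is just one possible way to derive a percentage; the ticket does not prescribe any of these.

```python
import json


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def ocr_similarity(reference: str, recognized: str) -> float:
    """Percentage similarity; 100.0 means the strings are identical.

    Assumption: the distance is normalized by the longer string's length.
    """
    longest = max(len(reference), len(recognized))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(reference, recognized) / longest)


# Hypothetical text-only needle: no .png, only an area and a reference string.
# The "refstr" key and the "ocr" type value are assumptions for illustration.
needle = json.loads("""
{
  "tags": ["welcome-screen"],
  "area": [
    {"xpos": 100, "ypos": 50, "width": 300, "height": 40,
     "type": "ocr", "refstr": "Welcome"}
  ]
}
""")

reference = needle["area"][0]["refstr"]
recognized = "We1come"  # stand-in for what the OCR tool might return
print(f"{ocr_similarity(reference, recognized):.1f}%")
```

With such a score, a text-only needle could be accepted or rejected against a threshold in the same way image-based needles are, although how the two scales should relate is left open here.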

Some research revealed that this might be achievable with the OCR tool GOCR and the Perl module package perl-Text-Levenshtein.
