Calculate row bounding box in single-word mode per #4304 #4305

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

Balearica
Contributor

See #4304 for full context.

In short, ROW objects are expected to have valid, accurate bounding boxes, which are accessed using row->bounding_box(). However, bounding box values are not calculated automatically: the method row_obj->recalc_bounding_box() must be called after the row is created and populated. This happens correctly in 5 of the 6 places where new ROW is called, but not in the remaining one (within the make_single_word function, which runs when psm is SINGLE_WORD or SINGLE_CHAR).

This PR fixes this by adding the required call to recalc_bounding_box within make_single_word. This brings that function in line with all the other functions that create ROW objects.
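
To illustrate the pattern, here is a minimal, self-contained sketch. It is not Tesseract source: `Box`, `Word`, `Row`, and `make_single_word_row` (including its `recalc` flag) are simplified, hypothetical stand-ins, assumed only to show why a cached bounding box stays empty until it is explicitly recalculated.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only.
// These are NOT Tesseract's TBOX / WERD / ROW classes.
struct Box {
    int left, bottom, right, top;
    // A box with no positive area is treated as "not yet computed".
    bool is_null() const { return left >= right || bottom >= top; }
};

struct Word {
    Box box;
};

class Row {
public:
    void add_word(const Word& w) { words_.push_back(w); }

    // Returns the cached box. It is only meaningful after
    // recalc_bounding_box() has been called.
    const Box& bounding_box() const { return bound_box_; }

    // Recomputes the cached box as the union of all word boxes.
    void recalc_bounding_box() {
        if (words_.empty()) {
            bound_box_ = Box{};
            return;
        }
        Box b = words_.front().box;
        for (const Word& w : words_) {
            b.left = std::min(b.left, w.box.left);
            b.bottom = std::min(b.bottom, w.box.bottom);
            b.right = std::max(b.right, w.box.right);
            b.top = std::max(b.top, w.box.top);
        }
        bound_box_ = b;
    }

private:
    std::vector<Word> words_;
    Box bound_box_{};  // zero-initialised, so is_null() is true at first
};

// Builds a one-word row. The `recalc` flag is an assumption of this sketch:
// false models the behaviour before the fix, true models it after.
Row make_single_word_row(const Word& w, bool recalc) {
    Row row;
    row.add_word(w);
    if (recalc) {
        row.recalc_bounding_box();  // the call that was missing
    }
    return row;
}
```

In this toy model, a row built with `recalc == false` still reports a null box even though it contains a word, which is the kind of stale value that downstream code relying on the row's bounding box would then consume.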

This change fixes the bug described in #4304, where baselines are not correctly calculated when using certain psm settings. After implementing this fix, the motivating example in that issue is resolved, and the baseline is calculated correctly.

@egorpugin egorpugin merged commit ee80dfe into tesseract-ocr:main Aug 23, 2024
7 checks passed
@stweil
Contributor

stweil commented Aug 23, 2024

@Balearica, thanks for this fix. Please use your full name as commit author in future pull requests.
