Skewed bounding boxes #437

Open
JKrivec opened this issue Jul 17, 2024 · 4 comments


JKrivec commented Jul 17, 2024

Hello,

When supplying bounding boxes, I noticed that the degraded bounding boxes are not what I expected.

image

The red text and the bounding boxes are what I extracted from the original PDF, before degrading it with Augraphy.
Shouldn't the degraded boxes be the ones I outlined in blue, so that the whole original object is enclosed?

This is even more obvious at a larger scale:

image

The larger bounding box around the table is supposed to encompass the table, but here we can see that some of the textual boxes are now outside of the actual area of interest.

Is this a feature or a bug?

Collaborator

kwcckw commented Jul 18, 2024

Hi, as stated in the documentation, only the start point and end point of each box are transformed:

https://augraphy.readthedocs.io/en/latest/doc/source/augmentations/folding.html

So this should be consistent with your observation?

Author

JKrivec commented Jul 18, 2024

Yes, this is exactly what is stated in the docs, so I guess this is a feature, not a bug :).

However, the second image I uploaded is a mix of Geometric and Folding, and with the larger bounding box, this is mostly an "issue" with the rotation. I would say that the correct way would be rotating the bounding box, then getting the bottom left and the top right coordinate and using that as the new bounding box.

If you were to label the table in the second image, where would you put the rectangle? I think you want to encompass the whole object.
I am not a computer vision specialist, so I'm not sure what the correct way is. Maybe this issue is just opening a debate about how the bounding box computation should be approached.
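
To make the difference concrete, here is a minimal sketch in plain Python (not Augraphy's code) that rotates a box about a point and compares two ways of deriving the new box: transforming only the start and end points, versus rotating all four corners and taking the axis-aligned extent. The function names and the `(x0, y0, x1, y1)` box layout are assumptions made for illustration.

```python
import math


def rotate_point(x, y, cx, cy, angle_deg):
    """Rotate (x, y) about (cx, cy) using the standard 2D rotation matrix."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))


def rotated_bbox_two_corners(box, center, angle_deg):
    """Hypothetical helper: transform only the start and end points."""
    x0, y0, x1, y1 = box
    p0 = rotate_point(x0, y0, center[0], center[1], angle_deg)
    p1 = rotate_point(x1, y1, center[0], center[1], angle_deg)
    return (p0[0], p0[1], p1[0], p1[1])


def rotated_bbox_enclosing(box, center, angle_deg):
    """Hypothetical helper: rotate all four corners, then take min/max."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    pts = [rotate_point(x, y, center[0], center[1], angle_deg)
           for x, y in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


box = (0, 0, 100, 40)
center = (50, 20)
two = rotated_bbox_two_corners(box, center, 10)
enclosing = rotated_bbox_enclosing(box, center, 10)
print("two corners:", two)
print("enclosing:  ", enclosing)
```

For this wide box and a 10 degree rotation, the two-corner result comes out narrower than the enclosing box, so parts of the rotated object fall outside it, which seems consistent with the table example above.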

Collaborator

kwcckw commented Jul 18, 2024

Right, there should be a better solution to this problem. For example, for clockwise rotation, it should take top-left and bottom-right of the box, while for anticlockwise rotation (your example), it should take top-right and bottom-left of the box.

Thanks for pointing this out. Feel free to submit a pull request if you are able to create a better alternative that addresses this problem.
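
As a quick check of the diagonal idea, the sketch below reports which rotated corner supplies each extreme of the enclosing box. It assumes a pure rotation about the box center, image coordinates with y growing downward, and the standard rotation matrix (so a positive angle appears clockwise on screen); `extreme_corners` is a hypothetical helper, not part of Augraphy. Under these assumptions each diagonal supplies the extent along one axis only, and the two diagonals swap roles when the direction flips, so a full enclosing box would need all four corners.

```python
import math

CORNER_NAMES = ("top-left", "top-right", "bottom-right", "bottom-left")


def extreme_corners(box, angle_deg):
    """Name the corner that supplies each extreme after rotation."""
    x0, y0, x1, y1 = box  # image coordinates: y0 is the top edge
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    a = math.radians(angle_deg)
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs, ys = [], []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        xs.append(cx + dx * math.cos(a) - dy * math.sin(a))
        ys.append(cy + dx * math.sin(a) + dy * math.cos(a))
    return {
        "x_min": CORNER_NAMES[xs.index(min(xs))],
        "x_max": CORNER_NAMES[xs.index(max(xs))],
        "y_min": CORNER_NAMES[ys.index(min(ys))],
        "y_max": CORNER_NAMES[ys.index(max(ys))],
    }


print(extreme_corners((0, 0, 100, 40), 10))
print(extreme_corners((0, 0, 100, 40), -10))
```

The result is only meaningful for non-zero angles below 90 degrees in magnitude; at exactly 0 degrees several corners tie for each extreme.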

Author

JKrivec commented Jul 18, 2024

Yeah, the bounding box should just be the (min(all_x), min(all_y), max(all_x), max(all_y)) in my opinion.
I'm currently very low on time, but I might give it a crack in a few months!
Feel free to close this, thanks :)
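
The `(min(all_x), min(all_y), max(all_x), max(all_y))` idea can be sketched for an arbitrary warp, not just rotation. In the sketch below, `transform` is a hypothetical callable mapping `(x, y)` to `(x', y')`; Augraphy is not known to expose such a hook, so this is an assumption for illustration. Points are sampled along each edge because, for a non-linear warp such as a fold, the corners alone may not be the extreme points.

```python
import math


def enclosing_bbox(box, transform, samples_per_edge=16):
    """Axis-aligned box around the warped outline of `box`.

    `transform` is a hypothetical point mapping (x, y) -> (x', y').
    """
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    points = []
    for i in range(4):
        (ax, ay), (bx, by) = corners[i], corners[(i + 1) % 4]
        # t runs over [0, 1), so each corner is sampled exactly once.
        for k in range(samples_per_edge):
            t = k / samples_per_edge
            points.append(transform(ax + t * (bx - ax), ay + t * (by - ay)))
    all_x = [p[0] for p in points]
    all_y = [p[1] for p in points]
    return (min(all_x), min(all_y), max(all_x), max(all_y))


def bulge(x, y):
    """Toy warp: push points upward most strongly near x = 50."""
    return (x, y - 10 * math.sin(math.pi * x / 100))


print(enclosing_bbox((0, 0, 100, 40), bulge))
```

With the toy `bulge` warp, the top edge moves furthest in the middle rather than at the corners, so a corners-only computation would miss part of the object. The sampling density is a trade-off between speed and how tightly the box follows the warped outline.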
