Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance enhancements #139

Merged
merged 4 commits into from
Mar 13, 2022
Merged

Performance enhancements #139

merged 4 commits into from
Mar 13, 2022

Conversation

phlash
Copy link
Collaborator

@phlash phlash commented Jan 30, 2022

Specifically to alpha_blend and convert_rgb_to_yuyv as per #138

This first draft addresses alpha_blend, reducing run time by a factor of 3-4x simply by separating the function and turning on full compiler optimisation for the machine that we are compiling on (-march=native). This may be a problem if compiling for a different target that doesn't have the same CPU features...

@BenBE
Copy link
Collaborator

BenBE commented Jan 30, 2022

Could you please rebase onto experimental?

@phlash
Copy link
Collaborator Author

phlash commented Feb 1, 2022

Could you please rebase onto experimental?

Odd, I thought this was based on current experimental, it looks like that to me?

app/blend.cc Outdated Show resolved Hide resolved
@BenBE
Copy link
Collaborator

BenBE commented Feb 24, 2022

Any updates here?

@phlash
Copy link
Collaborator Author

phlash commented Feb 24, 2022

Sorry - nothing much has happened this month - other distractions kicked in! I'll try and get back to this next week.

Copy link
Collaborator

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just some minor ideas, nothing dramatic …

CMakeLists.txt Outdated Show resolved Hide resolved
app/blend.cc Outdated Show resolved Hide resolved
@phlash phlash marked this pull request as ready for review March 5, 2022 21:33
CMakeLists.txt Outdated
app/calcmask.cc
app/utils.cc
)
# special compile options for blend.cc to use SIMD optimiser
set_source_files_properties(app/blend.cc PROPERTIES COMPILE_OPTIONS "-march=native;-O9")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not for everything?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure.. I guess that if one module is targetted / optimised for the current CPU, then it may as well all be. Initially I only wanted to mess with / break one specific part of the code. I was also thinking about compiling run-time swappable implementations for different targets (this gets talked about a bit in the GCC manuals when using intrinsics) - that is probably over-engineering.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as we are compiling all 3rd party deps ourselves we may as well compile everything for -march=native. Let's split this into another PR, so we could merge this one for now.

Having run-time swappable implementations is probably overkill for us right now. Let's handle this separately.

@BenBE BenBE merged commit fa6b685 into floe:experimental Mar 13, 2022
@BenBE
Copy link
Collaborator

BenBE commented Mar 13, 2022

Please try to always rebase on the target branch instead of merging things …

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants