Speed up sequential CPU code through OpenMP #51
see ComputationalRadiationPhysics#51

Some observations:
- We have only trivial (and fast) loops.
- The other loops are integral steps of the simulation and cannot be parallelized (sequential steps, and possibly device code).
- One of the parsing loops uses cudaSetDevice; not sure if it's possible to parallelize that in a good way.
- Parallelizing loops over a std::vector is OK as long as the length is fixed (no reallocation). That means we must not call vector.push_back() or vector.insert() inside a loop with OpenMP pragmas. The compiler might not complain, though (see the sketch after this comment).

Only "easy" loops were parallelized, and only basic pragmas were used. Might give some speedup one day, but if not... no problem. The pragmas and code changes are non-intrusive enough to keep it maintainable.
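To make the std::vector point concrete, here is a minimal sketch (loop and names are illustrative, not taken from the project) of the safe fixed-length pattern versus the unsafe growing one:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const long n = 1 << 20;
  std::vector<double> samples(n);  // length fixed up front, no reallocation

  // Safe: every iteration writes to its own pre-allocated element.
  #pragma omp parallel for
  for (long i = 0; i < n; ++i) {
    samples[i] = std::sin(0.001 * i);
  }

  // Unsafe: push_back changes the vector's size and may reallocate,
  // so concurrent calls from several threads are a data race. The
  // compiler accepts this without complaint.
  //
  // #pragma omp parallel for
  // for (long i = 0; i < n; ++i) {
  //   samples.push_back(std::sin(0.001 * i));  // data race!
  // }

  std::printf("samples[42] = %f\n", samples[42]);
  return 0;
}
```

Pre-sizing the vector (or collecting per-thread results and merging them afterwards) keeps the parallel region free of shared mutable state.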
I started working on this. Not sure how necessary this actually is... it might introduce code complexity without benefits. See also the commit message (slizzered@91625ac):
To gain maximal performance, we need to reduce the runtime of our sequential code base. We have two possibilities to achieve this reduction:
So, I think some investigation makes sense. But you are right, it looks a bit weird if computation-unrelated code is parallelized with OpenMP 🌺
Well, it doesn't look too weird to me. The problem is rather that it does not bring any speedup, since the loops are pretty small/fast. The MATLAB functions seem to be one of the more important problems (really slow...)
There might be some loops that pose a bottleneck. They could be parallelized rather easily with OpenMP (the CMake file will need to be tweaked); see the sketch below.
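For reference, such a parallelization of an "easy" loop would look roughly like this (illustrative code, not from the repository). It assumes GCC or Clang with -fopenmp; on the CMake side, find_package(OpenMP) and adding the resulting flags/target is the usual tweak. Without OpenMP enabled, the pragma is simply ignored and the loop runs sequentially, which is what keeps the change non-intrusive:

```cpp
#include <cstdio>
#include <vector>

// A typical bottleneck candidate: independent iterations combined
// into one result via an OpenMP reduction.
double sum_of_squares(const std::vector<double>& v) {
  double sum = 0.0;
  #pragma omp parallel for reduction(+ : sum)
  for (long i = 0; i < static_cast<long>(v.size()); ++i) {
    sum += v[i] * v[i];
  }
  return sum;
}

int main() {
  std::vector<double> v(1000, 0.5);        // 1000 elements, each 0.5
  std::printf("%f\n", sum_of_squares(v));  // expected: 250.000000
  return 0;
}
```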