-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update Array hierarchy and allocate nD arrays in a contiguous block by default #1236
Conversation
Thanks! I will have a look |
Does building SWIG work for you? I get some errors, e.g.:
|
forgot to push... |
Perfect, thanks! It builds now, but importing stir in python leads to an error still: I'll go through the code changes in patience to understand all the things that have changed. |
ok. I didn't try that yet. This appears to be It does point to the fact that we're SWIGging too much. (anything with |
I'll rebase this on current |
done. this imports now. |
ah well, committed a temp |
Ran the latest version this morning through the python reconstruction steps and everything worked fine! Will later also test it in C++, as well as with sanitisers |
that's great. but is it any faster? (or does it use more memory?) |
Not noticeably in the setup I used, but I want to test this with C++ where everything is a bit more controllable. |
Oops, I accidentally pushed a merge with #1237. That wasn't my intention. Depending on how I feel, I might still revert that... |
Tested it now in C++ and both speed and memory consumption look identical. |
ah well. maybe if we exploit it a bit more. But profiling often throws "conventional wisdom" in the bin... |
the current
That might be non-negligible for GPU applications, but obviously it is essentially 20 3D images, so in practice it might just not matter. I guess we should add some timings on doing copies etc. |
bc63f52
to
6e0e47e
Compare
First job failure due to #1378. gcc12-cuda0 job failure: there seems to be a time-out or something in |
b3ad8c4
to
7cfdadf
Compare
I cannot figure out what is wrong with the gcc12-cuda0 (C++-20) job.
One difference is that other jobs have their builds ccached, such that the build time is < 2min, while for this job it is still 23 mins, but I have no idea why that'd be relevant. stumped. @casperdcl any ideas? (I could remove C++20 etc, but I can't see what this has to do with that) |
597b35a
to
ce36182
Compare
Looks like this as a disk-space issue! Seems resolved now. |
gcc12-cuda0 job resolved. gcc12-cuda2.1 has a segfault in |
- make sure it works with other STIR classes, not just floats etc - re-instate VectorWithOffset constructors that take both start and end pointers for backwards compatibility
The NumericVectorWithOffset constructor would only work when T=NUMBER. The 1D Array constructor taking an `elemT*` wasn't implemented yet.
Seems that VC gets confused by the new Array::swap, so add std:: explicitly
make Array:init from ptr private test_Array is crashing due to memory bug in grow
nD Array's now store the "full" memory via shared_ptr<T[]>. This automates memory management. 1D Array's are still TODO. test_Array now works ok, except for 1D arrays. Some testing code remains in test_Array
needed for shared_ptr<T[]>, unless we'd implement deleters ourselves
all Array tests pass
no longer needed as we're using a shared<T[]> and can test that.
- get rid of copy_data argument for constructors but use sensible defaults - add VectorWithOffset::init such that Array doesn't need to know much
This was an attempt to handle pre-C++-17, but it would need work for make_shared. In any case, it isn't available on VS compilers apparently.
This was previously only done when finding the CONFIG file
I have to rebase on |
if data is contiguous, we don't need an extra copy.
While more can be done here, I will merge it such that we can move on. I just need to add release notes, so anyone (@NikEfth @markus-jehl...) feels like checking, now's the time :-) |
Hi Kris, I would like to have a look, but I won't be able before the weekend. |
Giving it a go with a python based reconstruction! |
Works fine for me :-) |
Is it faster?
…On Wed, May 22, 2024 at 12:25 PM markus-jehl ***@***.***> wrote:
Works fine for me :-)
—
Reply to this email directly, view it on GitHub
<#1236 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACEUB7TYH4HZM2ZAKYG7MCDZDTBINAVCNFSM6AAAAAA343XLEKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRVGIZDGNRRG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Not noticeably from my one test run, but I haven't done a proper timing comparison. That would take a bit longer. |
addresses the following
Array
andVectorWithOffset
did not have move constructors, so we were using the default generated ones which might not be optimalArray
s used many different separately allocated 1D Arrays. This had the consequence that memory is not contiguous (which prevents some optimisations) as well as many calls tonew[]
anddelete[]
, which turns out to be slow. The default is not to allocate one single block, and let the nD Array point into that. This happens transparent to the rest of the code. CAVEAT: it does mean that growing an nD array in the "first" dimensions could now be less efficient (as it will reallocate everything).Currently
ctest
is happy on my VM andtest_Array
andtest_VectorWithOffset
work fine viavalgrind
(Ubuntu, gcc8). We'll see what GHA and AppVeyor say...@markus-jehl this should be fine for you to test now.
Things still to do:
Array
s could now read/write in 1 chunk, which would be faster, e.g.read_data
,convert_array
Array
assignment such that it doesn't create a copy first if reallocation can be avoided.Sinogram
etc, or are the default ones ok. (operator=
might need implemented to benefit from previous bullet)