Splitting part of source_detection.ipynb off into separate imaging.ipynb example notebook #610
Conversation
Codecov Report
Attention: Patch coverage is …

@@            Coverage Diff             @@
##             main     #610      +/-   ##
==========================================
+ Coverage   68.63%   68.78%   +0.15%
==========================================
  Files          53       54       +1
  Lines        5714     5748      +34
==========================================
+ Hits         3922     3954      +32
- Misses       1792     1794       +2

Flags with carried forward coverage won't be shown.
Hmm, I have mixed feelings about this PR. I agree with the split, because otherwise there is too much workflow in a single notebook. However, the reasons for my mixed feelings are:
- imaging.ipynb is not included in the tests.
- This PR introduces a lot of duplication, which makes testing longer, maintainability harder, and is potentially confusing for a user (I would be confused and double-check whether the duplicated code is really the same or not).

I think a better solution would be to split all notebooks into smaller parts. Each notebook can start with an instantiation of the corresponding class (e.g. Visibility) from a file. Here's a suggestion on how to separate the notebooks:
- Visibility simulation
- Imaging
- Source detection

Such a solution would also make it clear that you can, for example, take visibilities from any archive, not just Karabo-simulated visibilities.

For implementation & testing, such an approach would need some modifications (apart from splitting the notebooks themselves & testing them). I suggest the following:
- Take the relevant file/dir paths (input and/or output products) from env vars starting with the prefix KARABO_, and set the variable(s) before launching a notebook test in the corresponding test_notebooks.py test.
- Adapt clean_disk in conftest.py to clean up after each module or session, not after each function. This should ensure cleanup (I think also if a test fails?).
- Make the tests ordered, because they would rely on the execution of the previous test(s). Tools which can help are pytest-order & pytest-dependency.
- Maybe something else I didn't think of?

What are your thoughts on that?
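To illustrate the env-var hand-off between ordered notebook tests, here is a minimal stdlib-only sketch. The variable name KARABO_VIS_PATH, the file names, and the function bodies are hypothetical; in a real test suite these would be pytest tests ordered via pytest-order, each actually executing a notebook.

```python
import os
import tempfile

# Hypothetical sketch: each ordered test publishes the path of its
# output product in a KARABO_-prefixed env var so the next test can
# consume it instead of re-running the simulation.

def test_visibility_simulation() -> str:
    """Stand-in for executing the visibility-simulation notebook."""
    out_path = os.path.join(tempfile.gettempdir(), "example_visibilities.dat")
    with open(out_path, "w") as f:
        f.write("placeholder visibility data")  # real test: run the notebook
    os.environ["KARABO_VIS_PATH"] = out_path    # hand the product onward
    return out_path

def test_imaging() -> str:
    """Stand-in for the imaging notebook, consuming the previous product."""
    vis_path = os.environ["KARABO_VIS_PATH"]    # set by the previous test
    assert os.path.exists(vis_path), "simulation test must run first"
    return vis_path
```

The same pattern extends to a third source-detection step consuming the imaging output.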
In addition, writing down the purpose of this PR in the description would be more helpful than pointing to a JIRA ticket that requires permissions to access.
I added the corresponding test and a small description in the original comment, as those are no-brainers. I copied that format from someone who must've been equally lazy, my bad. As for the rest, it might be best to wait until Michel is back, since I did this implementation per his instructions. I do agree that the duplication is bad, for the reasons you listed. When discussing this with Michel, we could not come up with a better solution, as running the initial simulation would be part of the expected workflow of a scientist doing either imaging or source detection.
Sure, we can wait. I also discussed this matter with Mathias Graf, and we more or less arrived at the solution I proposed. And since the changes are small (at least if I've considered every aspect of it, which might not be the case), it's hard for me to see why we shouldn't take the suggested approach.
The idea was for this PR to be just about splitting up source detection and imaging, but since the topic of simulation-code duplication came up multiple times during the last weeks (e.g. Lukas' metadata example script, IIRC), it might indeed be a good idea to address it right now. I could imagine a slightly simpler solution than Lukas suggested: what about just having a saved FITS file that lives in the repo, and replacing the simulation code with simply loading that file? That would save us the env vars, cleanup, and ordered tests.
Depends on the size. It also means that we'd have to update serialized files whenever internals change. Why don't we add production of example data to the Karabo API? A function à la …
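Such a helper could look roughly like this; the name make_example_data, its signature, and the file-based caching are purely illustrative assumptions, not existing Karabo API:

```python
import os
import tempfile
from typing import Optional

def make_example_data(cache_dir: Optional[str] = None) -> str:
    """Produce a small example data product, reusing a cached copy if present.

    The idea: run a cheap simulation once and cache its result on disk,
    so repeated callers across the code base pay the simulation cost
    only the first time.
    """
    cache_dir = cache_dir or os.path.join(
        tempfile.gettempdir(), "karabo_example_data"
    )
    os.makedirs(cache_dir, exist_ok=True)
    out_path = os.path.join(cache_dir, "example_visibilities.dat")
    if not os.path.exists(out_path):
        # Placeholder for a small, fast simulation run.
        with open(out_path, "wb") as f:
            f.write(b"simulated visibilities")
    return out_path
```

Because the product is generated by current code rather than checked in, it never goes stale when internals change, unlike a serialized file in the repo.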
Sounds like a good solution. It can be a small simulation, so it doesn't take much time when run repeatedly in different parts of the code base.
Not sure... because to me it seems like the solution I proposed is way easier to implement and doesn't require any dedicated file-generation function, just a …

However, since you both don't agree, I'm not really sure what to say other than that I disagree. What matters to me in the end is to have minimal code duplication, for maintainability & example clarity.
Looks good apart from the points addressed in my comments, thanks for your work.
LGTM
@Lukas113 merging is blocked because of your change request in August. Do you want to have another look, or alternatively withdraw the change request?
The purpose of this PR is to split the examples on how to use Karabo for imaging out of the notebook that provides the source detection examples, leaving only the WSClean algorithm in the source detection notebook.
This corresponds to https://jira.skatelescope.org/browse/CHOC-18