Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitting part of source_detection.ipynb off into separate imaging.ipynb example notebook #610

Merged
merged 15 commits into from
Sep 25, 2024

Conversation

PHerzogFHNW
Copy link
Collaborator

@PHerzogFHNW PHerzogFHNW commented Aug 13, 2024

The purpose of this PR is to split off the examples on how to use Karabo for imaging out of the notebook that proves source detection examples, leaving only the WSClean algorithm in the source detection notebook.

This corresponds to https://jira.skatelescope.org/browse/CHOC-18

Copy link

codecov bot commented Aug 13, 2024

Codecov Report

Attention: Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.

Project coverage is 68.78%. Comparing base (45adff6) to head (0848370).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
karabo/simulation/sample_simulation.py 94.11% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #610      +/-   ##
==========================================
+ Coverage   68.63%   68.78%   +0.15%     
==========================================
  Files          53       54       +1     
  Lines        5714     5748      +34     
==========================================
+ Hits         3922     3954      +32     
- Misses       1792     1794       +2     
Flag Coverage Δ
68.78% <94.11%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@Lukas113 Lukas113 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I have mixed feelings about this PR. I agree to do the split because otherwise it's too much workflow in a single notebook. However, the reasons for my mixed feelings are:

  • imaging.ipynb is not included in the tests
  • This PR introduces a lot of duplication, which makes testing longer, maintainability harder and potentially confusing for a user (I would be confused & double check if the potential duplicated code is the same or not).

I think a better solution would be to split actually all notebooks into smaller parts. The notebooks can start with an instantiation of an according class (e.g. Visibility) from a file. Here's a suggestion on how to separate the notebooks:

  • Visibility simulation
  • Imaging
  • Source detection

Such a solution would also make it clear that you e.g. can take visibilities from any archive, not just Karabo simulated visibilities.

For implementation & testing, such an approach would need some modification (apart from splitting the notebooks them-self & test them). I suggest the following:

  • Take according file/dir-paths (in- and/or output-product) from an env-var starting with a prefix KARABO_, set the variable(s) before launching a notebook test in an according test_notebooks.py test.
  • Adapt conftest.py clean_disk to clean after module or session, not function. This should ensure cleanup (I think also if a test failed?).
  • Make the test ordered, because they would rely on the execution of the previous test(s). Tools which can help are: pytest-order & pytest-dependency
  • Maybe sth else I didn't think of?

What are your thoughts on that?

@Lukas113
Copy link
Collaborator

In addition, writing the down the purpose of this PR would be helpful rather than pointing to a JIRA ticket where permissions are needed to access.

@PHerzogFHNW
Copy link
Collaborator Author

I added the corresponding test and a small description in the original comment, as those are no-brainers. I copied that format off someone who must've been equally lazy, my bad.

As for the rest, it might be best to wait until Michel is back since I did this implementation as per his instructions. I do agree that the duplication is bad for the reasons you listed. Discussing this with Michel, we could not come up with a better solution as doing the initial simulation would be part of the expected workflow of a scientist doing either imaging or source detection.

@Lukas113
Copy link
Collaborator

Sure we can wait. I also discussed this matter with Mathias Graf. And we concluded more or less to the solution I proposed. And since the changes are small (at least if I considered every aspect of it which might not be the case), it's hard to see for me why we shouldn't do the suggested approach.

@mpluess
Copy link
Member

mpluess commented Aug 26, 2024

The idea was for this PR to be just about splitting up source detection and imaging, but since the topic of simulation code duplication came up multiple times during the last weeks (e.g. Lukas' metadata example script IIRC), it might indeed be a good idea to address it right now. I could imagine a slightly simpler solution than Lukas suggested: what about just having a saved FITS file that lives in the repo and replace the simulation code with a simple loading of that file? That would save us env vars, cleanup, ordered tests.
I believe in the SRCNet_rucio_meta.py script, we also need telescope and observation objects, but we could just load them from disk as well using serialization.
Opinions?

@sfiruch
Copy link
Member

sfiruch commented Aug 26, 2024

[...] I believe in the SRCNet_rucio_meta.py script, we also need telescope and observation objects, but we could just load them from disk as well using serialization. Opinions?

Depends on the size. It also means that we'd have to update serialized files whenever internals change.

Why don't we add production of example-data to the Karabo API? A function à la get visibilities for tests, experiments and examples() could be useful?

@mpluess
Copy link
Member

mpluess commented Aug 26, 2024

[...] I believe in the SRCNet_rucio_meta.py script, we also need telescope and observation objects, but we could just load them from disk as well using serialization. Opinions?

Depends on the size. It also means that we'd have to update serialized files whenever internals change.

Why don't we add production of example-data to the Karabo API? A function à la get visibilities for tests, experiments and examples() could be useful?

Sounds like a good solution. Can be a small simulation so it doesn't take much time when run repeatedly in different parts of the code base.

@Lukas113
Copy link
Collaborator

Lukas113 commented Aug 26, 2024

Not sure... because to me it seems like the proposed solution is way easier to implement and doesn't require any dedicated file generation function, just a get_vis_path_from_* function which is looking at an env-var (or even a hard-coded path).

However, since you both don't agree I'm not really sure what to say other that I disagree. What matters to me in the end is to have minimal code duplication for maintainability & example clarity.

@Lukas113 Lukas113 removed their request for review September 20, 2024 06:50
Copy link
Member

@mpluess mpluess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart from the points addressed in my comments, thanks for your work.

karabo/examples/imaging.ipynb Outdated Show resolved Hide resolved
karabo/examples/source_detection.ipynb Outdated Show resolved Hide resolved
Copy link
Member

@mpluess mpluess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mpluess
Copy link
Member

mpluess commented Sep 25, 2024

@Lukas113 merging is blocked because of your change request in august, do you want to have another look or alternatively withdraw the change request?

@Lukas113 Lukas113 dismissed their stale review September 25, 2024 10:58

Disagree with PR, but was outvoted.

@mpluess mpluess merged commit 08b76aa into main Sep 25, 2024
3 checks passed
@mpluess mpluess deleted the update-sample-notebooks branch September 25, 2024 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants