Parallel read option for GCR #621
Merge branch 'master' of https://github.com/patricialarsen/gcr-catalogs
There are a few changes I need to
Some things to note:
Thank you @patricialarsen! Sorry, I only had a bit of time to skim over it, and I'll take a closer look soon. I do have a question about the permissions. Can you explain the issue a bit more? It's OK if we have to change the permissions, but I worry that people may not know this when creating new configs. I wonder if there are some alternatives that are more future-proof?
We don't need to change the permissions. For sprint-week events I've had people link to my local repository to access the reader, and I altered the permissions in doing so, so I need to reset them to the default.
As a side note, I believe we can make the permissions settings more general by using the core.fileMode and core.sharedRepository config settings, but I am not entirely sure how these work. I believe setting the first of these to false stops git from tracking permission changes in the repository, which should let me make local permission changes without causing these problems.
Also note that this pull request adds readers for the DP0.2 object catalogs.
@patricialarsen thanks for updating the PR and sorry for the delay. Looking at the changes to the readers, I wonder if it's worth creating a new base class, say:

    class BaseMPIGenericCatalog(BaseGenericCatalog):
        def __init__(self, **kwargs):
            self._rank = int(kwargs.pop('mpi_rank'))
            self._size = int(kwargs.pop('mpi_size'))
            super().__init__(**kwargs)

This way, it's clearer which readers support MPI (and in those cases, ...). Thoughts?
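To make the base-class idea concrete, here is a minimal, self-contained sketch of how such a class might be used by a reader. It is only an illustration under stated assumptions, not the implementation in this PR: `BaseGenericCatalog` below is a stand-in so the snippet runs without GCR installed, `ToyCatalog`, `_my_share`, and the `filenames` option are hypothetical names, and the default values for `mpi_rank` and `mpi_size` (so that serial use keeps working) are an assumption that the snippet in the comment above does not make.

```python
class BaseGenericCatalog:
    """Stand-in for GCR's BaseGenericCatalog, present only so this sketch runs on its own."""

    def __init__(self, **kwargs):
        self._subclass_init(**kwargs)

    def _subclass_init(self, **kwargs):
        raise NotImplementedError


class BaseMPIGenericCatalog(BaseGenericCatalog):
    """Base class marking readers that understand the MPI options."""

    def __init__(self, **kwargs):
        # Assumption: default to a single-process layout when the keys are absent.
        self._rank = int(kwargs.pop('mpi_rank', 0))
        self._size = int(kwargs.pop('mpi_size', 1))
        if not 0 <= self._rank < self._size:
            raise ValueError('mpi_rank must satisfy 0 <= mpi_rank < mpi_size')
        super().__init__(**kwargs)

    def _my_share(self, items):
        # Hypothetical helper: round-robin split so each rank gets a disjoint subset.
        return list(items)[self._rank::self._size]


class ToyCatalog(BaseMPIGenericCatalog):
    """Hypothetical reader that keeps only the files assigned to its own rank."""

    def _subclass_init(self, filenames=(), **kwargs):
        self._files = self._my_share(filenames)


if __name__ == '__main__':
    files = ['tract_0.parquet', 'tract_1.parquet', 'tract_2.parquet']
    for rank in range(2):
        cat = ToyCatalog(filenames=files, mpi_rank=rank, mpi_size=2)
        print(rank, cat._files)
```

Popping the MPI keys before calling `super().__init__` keeps them out of the options each reader sees, and an `isinstance(cat, BaseMPIGenericCatalog)` check is then one possible way to tell whether a reader supports parallel reads.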
Adding parallel read using the config overwrite functionality and normal GCR.
An easy way to test this:
In a Jupyter environment, open a terminal and run
source /global/common/software/lsst/common/miniconda/setup_current_python.sh
and then test using
mpirun -np 1 python gcr_test.py
with different numbers of processes, where the test code is