-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added OSM landmask as a new source #30
Conversation
Thank you. This seems very useful. The landmask is derived from the GSSHG shapes, with the OSM shapes the mask probably no longer matches the shapes. So if a point is in the ocean in the mask, but not in OSM, it may not be picked up. I think we have to generate a separate mask, or maybe just the difference to save space (there is a limit on the size of distributed packages to pypi/conda/crates). |
How did you derive the mask? I'm unsure on both how it works and how it's generated. Would you be able to kindly explain how it works in order for me to understand it and be able finish this PR? Thanks! |
The mask was derived from a Python version, so it is not super easy: https://github.com/gauteh/roaring-landmask/tree/main/src/devel, I will give you a more detailed answer tomorrow. |
Roaring landmask is fast because it can first exclude a great number of points that are certain to be in the ocean. The mask is generated with a set resolution (https://github.com/gauteh/roaring-landmask/blob/main/src/mask.rs#L24). To generate the mask you need to check whether any of the pixels in the mask has any land in it: either just intersecting, completely covered, or contains an island that does not intersect. The mask is then converted to integers which are stored in the a roaring bitmap / treemap: https://github.com/gauteh/roaring-landmask/blob/main/src/mask.rs#L137. When roaring landmask checks a point it first checks the mask, if the test is negative it is certain that the point is in the ocean and can return. If not, it needs to consult the shapes since the points may be on land or not. |
I was thinking earlier that this project may be of use for that: https://github.com/regionmask/regionmask (#18 ) |
1f45334
to
048fa29
Compare
@gauteh I've already resolved the conflicts with your latest changes. |
Awesome! Looking forward to seeing how you solved this, will take a look tomorrow. |
This looks very promising! I am missing a python test and benchmark: https://github.com/gauteh/roaring-landmask/blob/main/tests/test_roaring.py to see how the provider should be specified from python (or are you planning to do this in the next round?). It would be useful to plot the two masks + difference between them to see how big the difference is, and it would also be a good test to see if anything stands out. |
There are a few errors in the the tests related to the provider argument: https://github.com/gauteh/roaring-landmask/actions/runs/10789004492/job/29921006432?pr=30#step:7:448. To run the tests requiring nightly rust you do:
|
That looks very promising, the difference looks as expected (no shift in coordinates). Some new errors in CI. You may need rust nightly to test. |
🎉 tests are passing! |
Great! Are any of the commits duplicating the datafiles? They should be squashed, to prevent the repo from becoming too big. I can try later if you are unsure how to do this. I think the Python bindings + tests need some updates, but we can do that in the next step. |
eb28110
to
da0b180
Compare
8a606c1
to
6355747
Compare
ci: pypi token update deps disable par test fix deps to main fmt ci: also run rust workflows set correct semver version updated bitmask and added the creating/conversion scripts updated bitmask and added the creating/conversion scripts . fixed tests and added images for both masks fix errors related to paths
6311ae1
to
ea99fcc
Compare
Merged! I fixed the python tests to now also testing OSM. |
any estimate about when we'll get this into a pypi release? |
If all tests go through it should be on its way: https://github.com/gauteh/roaring-landmask/actions/runs/10808495500 |
Looks like there's two problems... The first one with memory in the x86 version: and then something seemingly related to compiling geos and numpy deps:
Not really sure why any of these are happening now but didn't happen before. Any insights on what you changed regarding the CI on this PR? |
Memory happens now and then. I had accidentally disabled Python test. The fails are not related to your code changes. I will try to fix. |
Left the GSHHG landmask as the default, but I've added a new method in order to be able to choose between both GSHHG and OSM. It could also be possible to add any arbitrary number of providers but I didn't want to change the project too much, so I kept the changes minimal.