Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switching to pyproject.toml #41

Merged
merged 6 commits into from
Nov 18, 2023
Merged

Conversation

samueldmcdermott
Copy link
Contributor

removed setup.py and requirements.txt, added pyproject.toml.

Also incidentally had to change some relative paths and nullify one test in order to pass pytest

…d some path configurations for successful import, nullified `test_annealing.py`
@reubenharry
Copy link
Collaborator

Do you know why the test broke? The seems like a worrying consequence. (Is it just an import issue?)

Also, would you mind adding to e.g. contributing.md any info I need to use the new packaging? E.g. what do I run to install, etc.

@samueldmcdermott
Copy link
Contributor Author

the code did not build successfully out of the box because of relative path issues, so I had to change a number of things which are all visible on the Files Changed tab up above (basically: where previously you had from mclmc.sampling.pkg I changed it to from mclmc.pkg or from .pkg as appropriate). The code also did not test successfully out of the box, so I had to nullify test_annealing.py since it pointed to a module called old_annealing that was not present in the branch I forked (it just has assert 1==1 now). I made those changes, and, at the time I pushed, the package installed successfully and all tests passed on my computer.

the install for me as I was editing was pip install -e . from the project's TLD, for a general user it will be pip install . if they have admin privileges on their machine or in the safest and most general case they should do python3 -m pip install . -- you can add that in the readme or wherever you think is appropriate.

given the state of the code when I forked (not building and tests not passing) I'm a little uncertain how to be helpful, so if further discussion is required, maybe we should find a way to have a Zoom call. I can make time next Tuesday if needed

@reubenharry
Copy link
Collaborator

reubenharry commented Nov 18, 2023

Ah, I see what you mean. OK, let me pull your PR, fix the test and go from there.

What I meant to ask above was: what are the instructions I should now give to CI yaml to install the requirements. Previously I was:

python -m pip install --upgrade pip
pip install pytest
pip install -r requirements.txt

What should I replace that with, given that requirements.txt is gone?

Once that's done, it should be ready to merge, since I've added back in the relevant test. (Cause this is a research repo, we're often a bit unscrupulous about good software practices, so sometimes push to main and it breaks the tests. We're currently adding the MCLMC algorithm to BlackJax, which should be a stable reference implementation going forward).

@samueldmcdermott
Copy link
Contributor Author

ah, I see -- the pyproject.toml has a dependencies field, which I populated. I also added version information there as follows:

  • for each package dependency, I looked up its current version
  • for a package that is currently at version A.B.c I have it requiring >=A.B
  • you can deprecate something if you need to by changing the number in the dependencies field, but this seemed reasonable to me (and all tests passed, except test_annealing, so it probably doesn't change a whole lot)

then your CI workflow will just include pip install . in the final line with no files named

@reubenharry reubenharry merged commit f877e19 into JakobRobnik:master Nov 18, 2023
1 check passed
@reubenharry
Copy link
Collaborator

@samueldmcdermott OK, merged. I also pushed to pypi, so things should be in order. (Let me know if it's working)

@samueldmcdermott
Copy link
Contributor Author

this successfully installed from pypi using pip install mclmc -- thanks for your work on it!

Fwiw, here's my additional thoughts on naming the package @reubenharry -- if I understand correctly, given the comment from @JakobRobnik in #39, this package currently implements the MCLMC algorithm of the papers, which is closely related to but distinct from the MCHMC algorithm. This GitHub project specifies to HMC, however. So I would suggest either

  1. renaming the project to MicroCanonicalLMC or MicroCanonicalMC (from the main page, go to Settings and change the name)
  2. or call the package mchmc instead (change the name of the inner directory to mclmc and change L6 in the pyproject.toml to name = "mchmc")

I think strictly speaking option 1 might be clearest for future users, though it requires all the devs to reinstall (which might be a good opportunity to clean up untracked files, etc). I think it's relatively painless (eg, anyone who navigates to the current repo will be redirected to the new one after the name change, so there won't be any future ambiguity). Option 2 is "least invasive", but if this is what you choose then I'd suggest clarifying in the README (and probably also in the __init__.py and sampler.py) that this implements MCLMC instead of MCHMC.

I guess a third possibility is to implement MCHMC as a separate module in this package and allow the user to choose MCLMC or MCHMC as two distinct options, depending on which they prefer. But that's probably a significant amount of work.

@reubenharry
Copy link
Collaborator

Yes, I agree that this might make sense. For what it's worth, there's really only one algorithm we want, which is to have partial momentum updates every integrator step, which can be viewed as a discretization of the SDE in https://arxiv.org/abs/2303.18221. We're calling this Microcanonical Langevin Monte Carlo. The alternative is to have full momentum updates (sampled from the unit sphere) every N steps, but as I understand, there's no real point implementing this version, and indeed, we don't.

In terms of the renaming of the repo, our current (tentative) plan is to shift our focus to implementing MCLMC in BlackJax. It looks like this will be merged in the next couple of weeks, and in the medium-term, this means that we may deprecate this repo, depending on our experience with BlackJax integration.

@samueldmcdermott
Copy link
Contributor Author

got it, that certainly makes sense from the science perspective, and no need to devote lots of energy to a repo that's still changing! The pyproject.toml structure should be useful for you regardless of what direction you go in next, though. And fyi I'll remain available for future audits or tests, feel free to at me if it's useful. Thanks again for your work on this

@reubenharry
Copy link
Collaborator

Thanks, we really appreciate this contribution! We'll definitely get your help testing either the BlackJax version, or future iterations of this repo.

Btw, some context: our goal is to get MCLMC more broadly adopted by the Bayesian stats (and whatever other) community, since we've found that it often outperforms NUTS by quite some margin. (We're also interested in combining it with parallelizable ensemble methods and/or SMC, to get samplers that are very very speedy on wall-clock time.). For that reason, something that's really helpful for us is to have people try out the algorithm on their own problems, and report back. If there are tuning issues (or other problems), we can advise on how to debug.

@samueldmcdermott
Copy link
Contributor Author

I've actually got a test case where it does poorly with random initialization but very well when tempered with a nested sampler -- perhaps that's more suitable for a different discussion than this PR though 😂

@reubenharry
Copy link
Collaborator

Yeah, feel free to open an issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants