Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up test files: VASP outputs #3653

Merged
merged 44 commits into from
Mar 7, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Feb 24, 2024

Clean up test files: VASP outputs

This is the first step of cleaning up test files: VASP outputs, relocate VASP output files to a dedicated dir (vasp/outputs) under tests dir to tidy up the files dir.

With the aim to clean up test files so that future developers could easily find and reuse existing test files whenever possible.

TODOs

Future follow up PRs

  • Clean up VASP input files
  • Group and clean up dedicated VASP dirs, for example tests/files/relaxation, tests/files/kpoints_opt and more.

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Feb 24, 2024

Do you have any comments so far before I proceed? @janosh

Test file naming

Regarding the test file naming conventions, I would like to hear about your suggestions. Taking vasprun.xml for example, I'm thinking of:

  • Keep vasprun as the header so that they could be listed and grouped properly. For example bad_vasprun.xml should not be used.
  • Use xml.gz as the tail to respect the file extension.

In this case, maybe either vasprun_something.xml.gz or vasprun.something.xml.gz? Which one would you suggest? Or don't change them at this moment? I would then leave a README there after this too.

Tests for borg broken

Meanwhile my relocations seem to break tests for borg, as the outputs and inputs for VASP are now not inside the same dir:

entry = self.drone.assimilate(TEST_FILES_DIR)
for param in ("hubbards", "is_hubbard", "potcar_spec", "run_type"):
assert param in entry.parameters

Which seems to pack os.listdir():

def assimilate(self, path):
"""Assimilate data in a directory path into a ComputedEntry object.
Args:
path: directory path
Returns:
ComputedEntry
"""
files = os.listdir(path)

But I'm not familiar with borg so 28c784c may not be a rational fix. Can you help me double check this?

Thanks a lot 😃 !

@janosh
Copy link
Member

janosh commented Feb 28, 2024

i noticed that as well. not sure why but also haven't investigated this at all yet. my suspicion was the windows runner just use slower hardware but maybe that's wrong and it's something in our test setup.

either way it makes sense to occasionally update the .pytest-durations.json file that determines how the workload is optimally split across concurrent test runners so they all take roughly equally long. and as the code changes, that file needs to be updated once in a while

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Feb 29, 2024

my suspicion was the windows runner just use slower hardware but maybe that's wrong and it's something in our test setup.

Not sure (don't have much knowledge about Github Actions...). But I guess it has something to do with the runners themselves, as I notice there're others reporting this as well.

either way it makes sense to occasionally update the .pytest-durations.json file that determines how the workload is optimally split across concurrent test runners so they all take roughly equally long.

Well I don't think this would help much though... As currently Windows runners seem consistently slower than Ubuntu, so even we balance the runtime of both sides, the final run time would still be determined (and slowed) by Windows 😿

DanielYang59 and others added 2 commits March 2, 2024 11:50
Signed-off-by: Haoyu (Daniel) <yanghaoyu97@outlook.com>
@DanielYang59

This comment was marked as resolved.

@DanielYang59 DanielYang59 marked this pull request as ready for review March 5, 2024 12:15
@DanielYang59
Copy link
Contributor Author

Please review @janosh.

I have to keep the test file name for borg as vasprun.xml.xe.gz instead of the expected vasprun.xe.xml.gz because it is hardcoded to collect info from vasprun files of pattern vasprun.xml*:

def assimilate(self, path):
"""Assimilate data in a directory path into a ComputedEntry object.
Args:
path: directory path
Returns:
ComputedEntry
"""
files = os.listdir(path)
if "relax1" in files and "relax2" in files:
filepath = glob(f"{path}/relax2/vasprun.xml*")[0]
else:
vasprun_files = glob(f"{path}/vasprun.xml*")
filepath = None
if len(vasprun_files) == 1:
filepath = vasprun_files[0]
elif len(vasprun_files) > 1:
# Since multiple files are ambiguous, we will always read
# the one that it the last one alphabetically.
filepath = sorted(vasprun_files)[-1]
warnings.warn(f"{len(vasprun_files)} vasprun.xml.* found. {filepath} is being parsed.")
try:
vasp_run = Vasprun(filepath)
except Exception as exc:
logger.debug(f"error in {filepath}: {exc}")
return None
return vasp_run.get_computed_entry(self._inc_structure, parameters=self._parameters, data=self._data)

I wouldn't touch the source code because I don't have proper knowledge of borg.

Noticed some tests are failing because pydantic and matgl are missing. Not sure about the reason though.

@janosh
Copy link
Member

janosh commented Mar 5, 2024

Well I don't think this would help much though... As currently Windows runners seem consistently slower than Ubuntu, so even we balance the runtime of both sides, the final run time would still be determined (and slowed) by Windows 😿

that's certainly true

I have to keep the test file name for borg as vasprun.xml.xe.gz instead of the expected vasprun.xe.xml.gz because it is hardcoded to collect info from vasprun files of pattern vasprun.xml*:

sure, makes sense

Noticed some tests are failing because pydantic and matgl are missing. Not sure about the reason though.

not sure either pydantic should not be needed at all and matgl should be installed in CI. i can try to debug tomorrow

"matgl",

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Mar 5, 2024

not sure either pydantic should not be needed at all and matgl should be installed in CI. i can try to debug tomorrow

"matgl",

That would be much appreciated. Also noticed tests only failed for Ubuntu not Windows.

It seems matgl is installed successfully during CI run stage Install dependencies:

Successfully installed .......  matgl-1.0.0 ......

And import error is triggered for importing from the submodule not the matgl lib itself:

        if calculator.lower() == "m3gnet":
            try:
                import matgl
>               from matgl.ext.ase import M3GNetCalculator

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Mar 7, 2024

I have to keep the test file name for borg as vasprun.xml.xe.gz instead of the expected vasprun.xe.xml.gz because it is hardcoded to collect info from vasprun files of pattern vasprun.xml*:

Well I came up with a solution in bd87dea: create a symlink with ScratchDir, now all files follow the same pattern 😃 .

Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect! thank you so much @DanielYang59! 👍 this is a huge PR and hugely appreciated! you've been a big help cleaning up the pymatgen code base!

i can take a look at the unrelated CI errors in a separate PR

@janosh janosh merged commit 8c5829b into materialsproject:master Mar 7, 2024
20 of 22 checks passed
@janosh janosh added tests Issues with or changes to the pymatgen test suite housekeeping Moving around or cleaning up old code/files vasp Vienna Ab initio Simulation Package dx Developer experience labels Mar 7, 2024
@DanielYang59
Copy link
Contributor Author

Thanks for reviewing @janosh! (I was about to redefine a VASP_OUT_DIR after cleaning up all vasp files, but glad you helped me out, thanks a ton 😃). I would proceed to other clean up once I have time and glad I could help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dx Developer experience housekeeping Moving around or cleaning up old code/files tests Issues with or changes to the pymatgen test suite vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants