multipole storage #70

Open
loriab opened this issue Mar 3, 2020 · 3 comments
Comments

@loriab
Collaborator

loriab commented Mar 3, 2020

No decision is necessary for the dipole because all 3 elements are unique, but for quadrupoles and higher one has to choose between compact storage with a defined order (e.g., xx, xy, xz, yy, yz, zz) and the full representation (e.g., 9-element quadrupole storage). The former saves space but requires more management, which is hard to impose in a schema as a data layout. I propose that higher multipoles be stored in full. For 64-poles, this means 729 elements, most of them redundant (only 28 are unique). Any concerns or objections?
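To make the two layouts concrete, here is a minimal Python sketch of the relationship between them. The helper names `unique_labels` and `expand_to_full` are hypothetical (not part of any schema or library), and it assumes a fully symmetric Cartesian tensor stored in lexicographic order.

```python
from itertools import combinations_with_replacement, product

AXES = "xyz"

def unique_labels(order):
    """Unique components in lexicographic order, e.g. xx, xy, xz, yy, yz, zz."""
    return ["".join(c) for c in combinations_with_replacement(AXES, order)]

def expand_to_full(compact, order):
    """Expand compact values (given in unique_labels order) to all 3**order components."""
    lookup = dict(zip(unique_labels(order), compact))
    # The tensor is assumed symmetric, so every index tuple shares the value
    # stored under its sorted label (yx -> xy, zyx -> xyz, ...).
    return [lookup["".join(sorted(idx))] for idx in product(AXES, repeat=order)]

quad = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # xx, xy, xz, yy, yz, zz
full = expand_to_full(quad, 2)          # xx, xy, xz, yx, yy, yz, zx, zy, zz

print(len(unique_labels(6)), 3 ** 6)    # unique vs. full count for the 64-pole
```

The expansion is mechanical in this direction; going the other way requires the consumer to know (and trust) the agreed component order, which is the management burden mentioned above.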

@mattwelborn
Collaborator

Worth looking at http://www.openrsp.org/en/latest/index.html which (de facto) defines a schema for arbitrary response properties (arbitrary in terms of operator, order, and frequency).

@loriab
Collaborator Author

loriab commented Mar 4, 2020

Thanks for the link! That's a great project to know about, and I'm reassured to see they went with redundant components as well: http://www.openrsp.org/en/latest/tutorial/perturbations.html#perturbations

My guess is that it should be easy to map but that qcsk doesn't want to go immediately with the more complex openrsp representation?

@ghutchis
Collaborator

I think lexicographical order is fairly common (e.g., http://cclib.github.io/data_notes.html#moments), and if programs use a different order, it can be mapped easily.
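As a rough illustration of that mapping, the sketch below reorders compact quadrupole components from one labelled order into lexicographic order. The `reorder` helper and the source order shown are hypothetical, chosen only for the example; they do not describe any particular program's output.

```python
def reorder(values, source_labels, target_labels):
    """Rearrange compact components from one labelled order into another."""
    # Sort the letters of each label so "yx" and "xy" name the same component.
    def key(label):
        return "".join(sorted(label))
    by_label = {key(lab): val for lab, val in zip(source_labels, values)}
    return [by_label[key(lab)] for lab in target_labels]

source = ["xx", "yy", "zz", "xy", "xz", "yz"]   # hypothetical program order
target = ["xx", "xy", "xz", "yy", "yz", "zz"]   # lexicographic order
values = [1.0, 4.0, 6.0, 2.0, 3.0, 5.0]

reordered = reorder(values, source, target)
```

The mapping only works if each producer documents its component order, so the order would still need to be stated somewhere alongside the data.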

As far as storage, I'd probably suggest storing in upper triangle / reduced form. Let's say that you store the 64-pole for all molecules in 3 million entries. It may be a small part of the whole, but it adds up quickly across a database IMHO.
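A back-of-the-envelope estimate for that scenario, assuming each component is stored as an uncompressed 8-byte float (an assumption; a text format such as JSON would take more space per value):

```python
def n_full(order):
    """Number of components in the full rank-`order` Cartesian tensor."""
    return 3 ** order

def n_unique(order):
    """Number of unique components of a fully symmetric rank-`order` tensor."""
    return (order + 1) * (order + 2) // 2

ENTRIES = 3_000_000      # number of database entries, from the example above
BYTES_PER_VALUE = 8      # assumed: one float64 per component
ORDER = 6                # 64-pole

full_bytes = n_full(ORDER) * BYTES_PER_VALUE * ENTRIES
compact_bytes = n_unique(ORDER) * BYTES_PER_VALUE * ENTRIES

print(f"full:    {full_bytes / 1e9:.3f} GB")
print(f"compact: {compact_bytes / 1e9:.3f} GB")
```

Under these assumptions the full layout needs about 17.5 GB for the 64-pole alone, against roughly 0.67 GB in reduced form.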

I'm always going to come from the compressed = good perspective. I'm trying to upload 22GB to Figshare right now and that's meaningful.

Importantly, many programs I use do not output the full tensor for the same reason: much of it is redundant.
