Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PBC #14

Merged
merged 5 commits into from
Apr 21, 2022
Merged

PBC #14

merged 5 commits into from
Apr 21, 2022

Conversation

Tsjerk
Copy link
Collaborator

@Tsjerk Tsjerk commented Jan 8, 2022

This PR makes the mapping PBC compliant, also supporting triclinic boxes. The only assumption is that a group of particles mapping to a bead never spans more than half the shortest box dimension. The routines generate some overhead, obviously, but not much in a practical sense. The main contribution to the overhead is the conversion to reciprocal space coordinates. It is possible to split the routine into a PBC aware and a naive one, but I'm not sure it's worth the hassle. Oh, this fixes issue #11.

Copy link
Owner

@fgrunewald fgrunewald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good solution. I'm still testing speed on some of my test systems. Overall just small comments. One thing I still think is useful would be to propagate the pbc complete flag and to have the option of not pbc completing. Also some left-over code from the old pbc code can be removed now.

istraj = args[1].casefold().endswith(('trr', 'xtc'))

# Required for unwrapping
if istraj:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we always PBC treat the trajectory. Could there be a use-case where we don't want that? Perhaps we should enable by default but still have a flag that let's the user switch it off.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PBC correction never has adverse effects, so the only use would be to speed up the processing a bit, at the cost of complicating the code, which is a conditional on the reciprocal space mapping and having two functions for the mapping.

# Required for unwrapping
if istraj:
self.pbc = np.array([ dim2lattice(*f.dimensions) for f in self.trajectory ])
self.coordinates = self.trajectory.coordinate_array.astype('float64')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need float64 accuracy for the PBC treatment? MDAnalysis uses 32 by default and that's also a bit more easy on the memory. Since we keep a lot of stuff there, perhaps using 32 might be the better option.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the numpy math (division) seems to cast to float64, which made it much easier to cast everything to float64 (one change) than do everything in float32 (casting in UniverseHandler and casting in the loop in the mapper)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a valid argument I didn't know numpy implicitly casts them to float64 arrays. Then we leave it like that and just convert them when writing out the universe.


self.ipbc = 2 * np.pi * np.linalg.inv(self.pbc)
self.reciprocal = (self.coordinates @ self.ipbc)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pbc_complete method is not needed anymore, so it can be removed. Same goes for the hidden self.__pbc_completed attribute unless we want to set it when the new code is run. I think that would be a good idea

import MDAnalysis as mda
from MDAnalysis import transformations


def dim2lattice(x, y, z, alpha=90, beta=90, gamma=90):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x,y,z are the xyz components of positions I assume so their shape must be a one dimensional array? Is it worth to numba this bit of code or is it fast?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only done on the box for every frame and it is needed in the initialization. you'll win a few microseconds in total, maybe. You could see if calculating the reciprocal coordinates with numba shaves off a second or so, but I guess it won't be significant on trajectories that fit into memory anyway.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspected as much but was still curious. Let's not over-engineer this one then

Comment on lines -69 to -74
if args.pbc_complete:
print("INFO - PBC completeing trajectory")
init_universe.pbc_complete()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would still be a good idea from within the Universe handler to tell we are pbc completing and also propagate the pbc complete flag to leave the option to not pbc complete.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think taking periodicity into account is the most natural thing and in this form doesn't need special mention (nor turning off)

@fgrunewald
Copy link
Owner

I've tested it and speed-wise it is much much faster. Only when running on a gro file I get: TypeError: dim2lattice() missing 2 required positional arguments: 'y' and 'z'

@fgrunewald
Copy link
Owner

Also this PR does not do a pbc mol completion. Beads are not mapped in split fashion across the box but the molecules are also not completed. Could we add a post-processing step where molecules are completed in the same fashion as it is done for one bead? That should be possible except for very large molecules that span more than half of the box?

@Tsjerk
Copy link
Collaborator Author

Tsjerk commented Jan 10, 2022

I've tested it and speed-wise it is much much faster. Only when running on a gro file I get: TypeError: dim2lattice() missing 2 required positional arguments: 'y' and 'z'

Okay, didn't test a .gro; apparently the dimensions aren't set correctly then.

@Tsjerk
Copy link
Collaborator Author

Tsjerk commented Jan 10, 2022

Also this PR does not do a pbc mol completion. Beads are not mapped in split fashion across the box but the molecules are also not completed. Could we add a post-processing step where molecules are completed in the same fashion as it is done for one bead? That should be possible except for very large molecules that span more than half of the box?

Well, it's quite easy to implement a box-style clustering routine, which also does the job for large molecules (but periodic molecules will be a bit problematic, then again, nothing will fix that). Otherwise, at the least the routine should be used again for calculation of interactions spanning multiple beads, in which case you want to have all particles in the interaction in the most compact representation. This is quite trivial if you also have bondedness, in which case you can add that information to always be right, otherwise you have the condition that the interaction should not span more than half the box.

@fgrunewald
Copy link
Owner

@Tsjerk

To compute the bonded interactions in pbc basically you would create the reciprocal coordinates basically initializing the universe_handler class. And then use the transformation below to get the new pbc corrected position.

angle = ((reciprocal[fdx, atom_idx, :] - reci0) + np.pi) % (2 * np.pi) - np.pi
vector = angle @ pbc[fdx] / divfac

If we further assume that a dihedral is the "largest" interaction, as long as the four beads associated to the dihedral don't span more than half the box it should be fine. So the only change would be to in this function to modify line 11 to give out pair coordinates that are pbc completed. Am I missing anything?

def compute_value_for_interaction(universe, inter_type, valid_pairs):
nframes = universe.trajectory.n_frames
time_series = np.zeros((len(valid_pairs) * nframes))
for idx, idxs in enumerate(valid_pairs):
pair_pos = [ universe.trajectory.coordinate_array[:, pair, :] for pair in idxs]
time_series[idx*nframes:(idx+1)*nframes] = NORMAL_FUNCS[inter_type](*pair_pos)

@Tsjerk
Copy link
Collaborator Author

Tsjerk commented Jan 10, 2022

@Tsjerk

To compute the bonded interactions in pbc basically you would create the reciprocal coordinates basically initializing the universe_handler class. And then use the transformation below to get the new pbc corrected position.

angle = ((reciprocal[fdx, atom_idx, :] - reci0) + np.pi) % (2 * np.pi) - np.pi
vector = angle @ pbc[fdx] / divfac

If we further assume that a dihedral is the "largest" interaction, as long as the four beads associated to the dihedral don't span more than half the box it should be fine. So the only change would be to in this function to modify line 11 to give out pair coordinates that are pbc completed. Am I missing anything?

def compute_value_for_interaction(universe, inter_type, valid_pairs):
nframes = universe.trajectory.n_frames
time_series = np.zeros((len(valid_pairs) * nframes))
for idx, idxs in enumerate(valid_pairs):
pair_pos = [ universe.trajectory.coordinate_array[:, pair, :] for pair in idxs]
time_series[idx*nframes:(idx+1)*nframes] = NORMAL_FUNCS[inter_type](*pair_pos)

Correct. For angles and dihedrals you only use pair vectors anyway, so you won't even need worry about a dihedral spanning more than half the box: it's only about pairs.

@fgrunewald
Copy link
Owner

@Tsjerk could you take a look at what's happening with the gro files? Otherwise I think we can pretty much merge it. If I manage I code the one for the interactions tomorrow.

Copy link
Owner

@fgrunewald fgrunewald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good I'll test it again to see

@fgrunewald fgrunewald merged commit 10406da into main Apr 21, 2022
@fgrunewald
Copy link
Owner

@Tsjerk I had to revert this PR because I overlooked one important detail: the coordinate array is copied in total 3 times which basically means none but the smallest systems can be treated. Is it possible to have this implementation more memory friendly?

@Tsjerk
Copy link
Collaborator Author

Tsjerk commented May 2, 2022

The only (proper) solution is to process trajectories on a per-frame basis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants