# 8. Using the MDAnalysis.analysis modules¶

MDAnalysis comes with a number of existing analysis code in the MDAnalysis.analysis module and example scripts (see also the Examples on the MDAnalysis wiki).

## 8.1. RMSD¶

As an example we will use the MDAnalysis.analysis.rms.rmsd() function from the MDAnalysis.analysis.rms module. It computes the coordinate root mean square distance between two sets of coordinates. For example for the AdK trajectory the backbone RMSD between first and last frame is

>>> import MDAnalysis.analysis.rms
>>> u = MDAnalysis.Universe(PSF, DCD)
>>> bb = u.select_atoms('backbone')
>>> A = bb.positions  # coordinates of first frame
>>> u.trajectory[-1]      # forward to last frame
>>> B = bb.positions  # coordinates of last frame
>>> MDAnalysis.analysis.rms.rmsd(A, B)
6.8342494129169804

## 8.2. Superposition of structure¶

In order to superimpose two structures in a way that minimizes the RMSD we have functions in the MDAnalysis.analysis.align module.

The example uses files provided as part of the MDAnalysis test suite (in the variables PSF, DCD, and PDB_small). For all further examples execute first

>>> import MDAnalysis
>>> from MDAnalysis.analysis import align, rms
>>> from MDAnalysis.tests.datafiles import PSF, DCD, PDB_small

In the simplest case, we can simply calculate the C-alpha RMSD between two structures, using rmsd():

>>> ref = MDAnalysis.Universe(PDB_small)
>>> mobile = MDAnalysis.Universe(PSF, DCD)
>>> mobile_CA = mobile.select_atoms("name CA")
>>> ref_CA = ref.select_atoms("name CA")
>>> rms.rmsd(mobile_CA.positions, ref_CA.positions)
28.20178579474479

Note that in this example translations have not been removed. In order to look at the pure rotation one needs to superimpose the centres of mass (or geometry) first:

>>> ref0 =  ref_CA.positions - ref_CA.center_of_mass()
>>> mobile0 =  mobile_CA.positions - mobile_CA.center_of_mass()
>>> rms.rmsd(mobile0, ref0)
21.892591663632704

The rotation matrix that superimposes mobile on ref while minimizing the CA-RMSD is obtained with the rotation_matrix() function

>>> R, rmsd = align.rotation_matrix(mobile0, ref0)
>>> print(rmsd)
6.80939658647
>>> print(R)
[[ 0.14514539 -0.27259113  0.95111876]
[ 0.88652593  0.46267112 -0.00268642]
[-0.43932289  0.84358136  0.30881368]]

Putting all this together one can superimpose all of mobile onto ref and write to, for instance, a PDB file 1:

>>> mobile.atoms.translate(-mobile_CA.center_of_mass())
>>> mobile.atoms.rotate(R)
>>> mobile.atoms.translate(ref_CA.center_of_mass())
>>> mobile.atoms.write("mobile_on_ref.pdb")

## 8.3. Exercise 5¶

Investigate how rigid the CORE, NMP, and LID domains are during the transition: Compute time series of the CA RMSD of each domain relative to its own starting structure, when superimposed on the starting structure.

• You will need to make a copy of the starting reference coordinates that are needed for the shifts, e.g.

NMP = u.select_atoms("resid 30-59")
u.trajectory[0]   # make sure to be on initial frame
ref_com = NMP.select_atoms("name CA").center_of_mass()
ref0 = NMP.positions - ref_com

which is then used instead of ref_CA.center_of_mass() (which would change for each time step).

• You can use the function MDAnalysis.analysis.rms.rmsd() (with the keywords center=True, superposition=True) to superimpose each atom group of interest and to calculate the RMSD.

Possible solution

The code uses MDAnalysis.analysis.rms.rmsd() for the RMSD calculations after fitting each domain. Otherwise it is mostly book-keeping, which is solved by organizing everything in dictionaries with keys “CORE”, “NMP”, “LID”.

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 import numpy as np from MDAnalysis.analysis.rms import rmsd if __name__ == "__main__": import MDAnalysis import matplotlib import matplotlib.pyplot as plt # load AdK DIMS trajectory from MDAnalysis.tests.datafiles import PSF, DCD u = MDAnalysis.Universe(PSF, DCD) # one AtomGroup per domain domains = { 'CORE': u.select_atoms("(resid 1-29 or resid 60-121 or resid 160-214) and name CA"), 'LID': u.select_atoms("resid 122-159 and name CA"), 'NMP': u.select_atoms("resid 30-59 and name CA"), } colors = {'CORE': 'black', 'NMP': 'blue', 'LID': 'red'} u.trajectory[0] # rewind trajectory xref0 = dict((name, g.positions - g.center_of_mass()) for name, g in domains.items()) nframes = len(u.trajectory) results = dict((name, np.zeros((nframes, 2), dtype=np.float64)) for name in domains) for iframe, ts in enumerate(u.trajectory): for name, g in domains.items(): results[name][iframe, :] = (u.trajectory.time, rmsd(g.positions, xref0[name], center=True, superposition=True)) # plot fig = plt.figure(figsize=(5,5)) ax = fig.add_subplot(111) for name in "CORE", "NMP", "LID": data = results[name] ax.plot(data[:,0], data[:,1], linestyle="-", color=colors[name], lw=2, label=name) ax.legend(loc="best") ax.set_xlabel(r"time $t$ (ps)") ax.set_ylabel(r"C$_\alpha$ RMSD from $t=0$, $\rho_{\mathrm{C}_\alpha}$ ($\AA$)") for ext in ('svg', 'pdf', 'png'): fig.savefig("AdK_domain_rigidity.{0}".format(ext))

MDAnalysis.analysis.align.alignto() for superimposing structures

MDAnalysis.analysis.rms.RMSD for comprehensive analysis of RMSD time series

Footnotes

1

PDB format files contain various data fields that are not necessarily used in a typical MD simulation such as altLocs, icodes, occupancies, or tempfactor. When you write a PDB file without providing values for these parameters, MDAnalysis has to set them to default values. When MDAnalysis does that, it warns you with output like

~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'altLocs' Using default value of ' '
"".format(attrname, default))

~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'icodes' Using default value of ' '
"".format(attrname, default))

~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'occupancies' Using default value of '1.0'
"".format(attrname, default))

~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'tempfactors' Using default value of '0.0'
"".format(attrname, default))

These warnings are for your information and in the context of this tutorial they are expected and do not indicate a problem.