8. Using the MDAnalysis.analysis modules¶
MDAnalysis comes with a number of existing analysis code in the MDAnalysis.analysis module and example scripts (see also the Examples on the MDAnalysis wiki).
8.1. RMSD¶
As an example we will use the MDAnalysis.analysis.rms.rmsd()
function from the MDAnalysis.analysis.rms
module. It computes
the coordinate root mean square distance between two sets of
coordinates. For example for the AdK trajectory the backbone RMSD
between first and last frame is
>>> import MDAnalysis.analysis.rms
>>> u = MDAnalysis.Universe(PSF, DCD)
>>> bb = u.select_atoms('backbone')
>>> A = bb.positions # coordinates of first frame
>>> u.trajectory[-1] # forward to last frame
>>> B = bb.positions # coordinates of last frame
>>> MDAnalysis.analysis.rms.rmsd(A, B)
6.8342494129169804
8.2. Superposition of structure¶
In order to superimpose two structures in a way that minimizes the
RMSD we have functions in the MDAnalysis.analysis.align
module.
The example uses files provided as part of the MDAnalysis test suite
(in the variables PSF
,
DCD
, and
PDB_small
). For all further
examples execute first
>>> import MDAnalysis
>>> from MDAnalysis.analysis import align, rms
>>> from MDAnalysis.tests.datafiles import PSF, DCD, PDB_small
In the simplest case, we can simply calculate the C-alpha RMSD between
two structures, using rmsd()
:
>>> ref = MDAnalysis.Universe(PDB_small)
>>> mobile = MDAnalysis.Universe(PSF, DCD)
>>> mobile_CA = mobile.select_atoms("name CA")
>>> ref_CA = ref.select_atoms("name CA")
>>> rms.rmsd(mobile_CA.positions, ref_CA.positions)
28.20178579474479
Note that in this example translations have not been removed. In order to look at the pure rotation one needs to superimpose the centres of mass (or geometry) first:
>>> ref0 = ref_CA.positions - ref_CA.center_of_mass()
>>> mobile0 = mobile_CA.positions - mobile_CA.center_of_mass()
>>> rms.rmsd(mobile0, ref0)
21.892591663632704
The rotation matrix that superimposes mobile on ref while
minimizing the CA-RMSD is obtained with the
rotation_matrix()
function
>>> R, rmsd = align.rotation_matrix(mobile0, ref0)
>>> print(rmsd)
6.80939658647
>>> print(R)
[[ 0.14514539 -0.27259113 0.95111876]
[ 0.88652593 0.46267112 -0.00268642]
[-0.43932289 0.84358136 0.30881368]]
Putting all this together one can superimpose all of mobile onto ref and write to, for instance, a PDB file 1:
>>> mobile.atoms.translate(-mobile_CA.center_of_mass())
>>> mobile.atoms.rotate(R)
>>> mobile.atoms.translate(ref_CA.center_of_mass())
>>> mobile.atoms.write("mobile_on_ref.pdb")
8.3. Exercise 5¶
Investigate how rigid the CORE, NMP, and LID domains are during the transition: Compute time series of the CA RMSD of each domain relative to its own starting structure, when superimposed on the starting structure.
You will need to make a copy of the starting reference coordinates that are needed for the shifts, e.g.
NMP = u.select_atoms("resid 30-59") u.trajectory[0] # make sure to be on initial frame ref_com = NMP.select_atoms("name CA").center_of_mass() ref0 = NMP.positions - ref_com
which is then used instead of
ref_CA.center_of_mass()
(which would change for each time step).You can use the function
MDAnalysis.analysis.rms.rmsd()
(with the keywords center=True, superposition=True) to superimpose each atom group of interest and to calculate the RMSD.
Possible solution
The code uses MDAnalysis.analysis.rms.rmsd()
for the RMSD
calculations after fitting each domain. Otherwise it is mostly
book-keeping, which is solved by organizing everything in dictionaries
with keys “CORE”, “NMP”, “LID”.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | import numpy as np
from MDAnalysis.analysis.rms import rmsd
if __name__ == "__main__":
import MDAnalysis
import matplotlib
import matplotlib.pyplot as plt
# load AdK DIMS trajectory
from MDAnalysis.tests.datafiles import PSF, DCD
u = MDAnalysis.Universe(PSF, DCD)
# one AtomGroup per domain
domains = {
'CORE': u.select_atoms("(resid 1-29 or resid 60-121 or resid 160-214) and name CA"),
'LID': u.select_atoms("resid 122-159 and name CA"),
'NMP': u.select_atoms("resid 30-59 and name CA"),
}
colors = {'CORE': 'black', 'NMP': 'blue', 'LID': 'red'}
u.trajectory[0] # rewind trajectory
xref0 = dict((name, g.positions - g.center_of_mass()) for name, g in domains.items())
nframes = len(u.trajectory)
results = dict((name, np.zeros((nframes, 2), dtype=np.float64)) for name in domains)
for iframe, ts in enumerate(u.trajectory):
for name, g in domains.items():
results[name][iframe, :] = (u.trajectory.time,
rmsd(g.positions, xref0[name],
center=True, superposition=True))
# plot
fig = plt.figure(figsize=(5,5))
ax = fig.add_subplot(111)
for name in "CORE", "NMP", "LID":
data = results[name]
ax.plot(data[:,0], data[:,1], linestyle="-", color=colors[name], lw=2, label=name)
ax.legend(loc="best")
ax.set_xlabel(r"time $t$ (ps)")
ax.set_ylabel(r"C$_\alpha$ RMSD from $t=0$, $\rho_{\mathrm{C}_\alpha}$ ($\AA$)")
for ext in ('svg', 'pdf', 'png'):
fig.savefig("AdK_domain_rigidity.{0}".format(ext))
|
See also
MDAnalysis.analysis.align.alignto()
for superimposing
structures
MDAnalysis.analysis.rms.RMSD
for comprehensive analysis of
RMSD time series
Footnotes
- 1
PDB format files contain various data fields that are not necessarily used in a typical MD simulation such as altLocs, icodes, occupancies, or tempfactor. When you write a PDB file without providing values for these parameters, MDAnalysis has to set them to default values. When MDAnalysis does that, it warns you with output like
~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'altLocs' Using default value of ' ' "".format(attrname, default)) ~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'icodes' Using default value of ' ' "".format(attrname, default)) ~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'occupancies' Using default value of '1.0' "".format(attrname, default)) ~/anaconda3/envs/mda3/lib/python3.6/site-packages/MDAnalysis/coordinates/PDB.py:892: UserWarning: Found no information for attr: 'tempfactors' Using default value of '0.0' "".format(attrname, default))
These warnings are for your information and in the context of this tutorial they are expected and do not indicate a problem.