Calculating Root-Mean-Square Fluctuations (RMSF) — pmda.rmsf¶
This module contains parallel versions of analysis tasks in
MDAnalysis.analysis.rms.
-
class
pmda.rms.rmsf.RMSF(atomgroup)[source]¶ Parallel RMSF analysis.
Calculates RMSF of given atoms across a trajectory.
-
rmsf¶ N-lengthnumpy.ndarrayarray of RMSF values, whereNis the number of atoms in the atomgroup of interest. Returned values have units of ångströms.- Type
array
- Parameters
atomgroup (AtomGroup) – Atoms for which RMSF is calculated
- Raises
ValueError – raised if negative values are calculated, which indicates that a numerical overflow or underflow occured
See also
Notes
No RMSD-superposition is performed; it is assumed that the user is providing a trajectory where the protein of interest has been structurally aligned to a reference structure (see the Examples section below). The protein also has be whole because periodic boundaries are not taken into account.
Run the analysis with
RMSF.run(), which stores the results in the arrayRMSF.rmsf.The root mean square fluctuation of an atom \(i\) is computed as the time average:
\[\sigma_{i} = \sqrt{\left\langle (\mathbf{x}_{i} - \langle\mathbf{x}_{i}\rangle)^2 \right\rangle}\]No mass weighting is performed. This method implements an algorithm for computing sums of squares while avoiding overflows and underflows [Welford1962], as well as an algorithm for combining the sum of squares and means of separate partitions of a given trajectory to calculate the RMSF of the entire trajectory [CGL1979].
For more details about RMSF calculations, refer to [Awtrey2019].
References
- Welford1962
B. P. Welford (1962). “Note on a Method for Calculating Corrected Sums of Squares and Products.” Technometrics 4(3):419-420.
Examples
In this example we calculate the residue RMSF fluctuations by analyzing the \(\text{C}_\alpha\) atoms. First we need to fit the trajectory to the average structure as a reference. That requires calculating the average structure first. Because we need to analyze and manipulate the same trajectory multiple times, we are going to load it into memory using the
MemoryReader. (If your trajectory does not fit into memory, you will need to write out intermediate trajectories to disk or generate an in-memory universe that only contains, say, the protein):import MDAnalysis as mda from MDAnalysis.analysis import align from MDAnalysis.tests.datafiles import TPR, XTC u = mda.Universe(TPR, XTC, in_memory=True) protein = u.select_atoms("protein") # TODO: Need to center and make whole (this test trajectory # contains the protein being split across periodic boundaries # and the results will be WRONG!) # Fit to the initial frame to get a better average structure # (the trajectory is changed in memory) prealigner = align.AlignTraj(u, u, select="protein and name CA", in_memory=True).run() # ref = average structure ref_coordinates = u.trajectory.timeseries(asel=protein).mean(axis=1) # Make a reference structure (need to reshape into a # 1-frame "trajectory"). ref = mda.Merge(protein).load_new(ref_coordinates[:, None, :], order="afc")
We created a new universe
referencethat contains a single frame with the averaged coordinates of the protein. Now we need to fit the whole trajectory to the reference by minimizing the RMSD. We useMDAnalysis.analysis.align.AlignTraj:aligner = align.AlignTraj(u, ref, select="protein and name CA", in_memory=True).run() # need to write the trajectory to disk for PMDA 0.3.0 (see issue #15) with mda.Writer("rmsfit.xtc", n_atoms=u.atoms.n_atoms) as W: for ts in u.trajectory: W.write(u.atoms)
(For use with PMDA we cannot currently use a in-memory trajectory (see Issue #15) so we must write out the RMS-superimposed trajectory to the file “rmsfit.xtc”.)
The trajectory is now fitted to the reference (the RMSD is stored as aligner.rmsd for further inspection). Now we can calculate the RMSF:
from pmda.rms import RMSF u = mda.Universe(TPR, "rmsfit.xtc") calphas = protein.select_atoms("protein and name CA") rmsfer = RMSF(calphas).run()
and plot:
import matplotlib.pyplot as plt plt.plot(calphas.resnums, rmsfer.rmsf)
New in version 0.3.0.
- Parameters
Universe (
Universe) – aMDAnalysis.core.groups.Universe(the atomgroups must belong to this Universe)atomgroups (tuple of
AtomGroup) – atomgroups that are iterated in parallel
-
_results¶ The raw data from each process are stored as a list of lists, with each sublist containing the return values from
pmda.parallel.ParallelAnalysisBase._single_frame().- Type
-
readonly_attributes()¶ Set the attributes of this class to be read only
Useful to avoid the class being modified when passing it around.
To be used as a context manager:
with analysis.readonly_attributes(): some_function(analysis)
-
run(start=None, stop=None, step=None, n_jobs=1, n_blocks=None)¶ Perform the calculation
- Parameters
start (int, optional) – start frame of analysis
stop (int, optional) – stop frame of analysis
step (int, optional) – number of frames to skip between each analysed frame
n_jobs (int, optional) – number of jobs to start, if -1 use number of logical cpu cores. This argument will be ignored when the distributed scheduler is used
n_blocks (int, optional) – number of blocks to divide trajectory into. If
Noneset equal to n_jobs or number of available workers in scheduler.
-
See also
MDAnalysis.analysis.rms.RMSF