Calculating Root-Mean-Square Fluctuations (RMSF) — pmda.rmsf

This module contains parallel versions of analysis tasks in MDAnalysis.analysis.rms.

class pmda.rms.rmsf.RMSF(atomgroup)[source]

Parallel RMSF analysis.

Calculates RMSF of given atoms across a trajectory.

rmsf

N-length numpy.ndarray array of RMSF values, where N is the number of atoms in the atomgroup of interest. Returned values have units of ångströms.

Type

array

Parameters

atomgroup (AtomGroup) – Atoms for which RMSF is calculated

Raises

ValueError – raised if negative values are calculated, which indicates that a numerical overflow or underflow occured

Notes

No RMSD-superposition is performed; it is assumed that the user is providing a trajectory where the protein of interest has been structurally aligned to a reference structure (see the Examples section below). The protein also has be whole because periodic boundaries are not taken into account.

Run the analysis with RMSF.run(), which stores the results in the array RMSF.rmsf.

The root mean square fluctuation of an atom \(i\) is computed as the time average:

\[\sigma_{i} = \sqrt{\left\langle (\mathbf{x}_{i} - \langle\mathbf{x}_{i}\rangle)^2 \right\rangle}\]

No mass weighting is performed. This method implements an algorithm for computing sums of squares while avoiding overflows and underflows [Welford1962], as well as an algorithm for combining the sum of squares and means of separate partitions of a given trajectory to calculate the RMSF of the entire trajectory [CGL1979].

For more details about RMSF calculations, refer to [Awtrey2019].

References

Welford1962

B. P. Welford (1962). “Note on a Method for Calculating Corrected Sums of Squares and Products.” Technometrics 4(3):419-420.

Examples

In this example we calculate the residue RMSF fluctuations by analyzing the \(\text{C}_\alpha\) atoms. First we need to fit the trajectory to the average structure as a reference. That requires calculating the average structure first. Because we need to analyze and manipulate the same trajectory multiple times, we are going to load it into memory using the MemoryReader. (If your trajectory does not fit into memory, you will need to write out intermediate trajectories to disk or generate an in-memory universe that only contains, say, the protein):

import MDAnalysis as mda
from MDAnalysis.analysis import align
from MDAnalysis.tests.datafiles import TPR, XTC
u = mda.Universe(TPR, XTC, in_memory=True)
protein = u.select_atoms("protein")

# TODO: Need to center and make whole (this test trajectory
# contains the protein being split across periodic boundaries
# and the results will be WRONG!)

# Fit to the initial frame to get a better average structure
# (the trajectory is changed in memory)
prealigner = align.AlignTraj(u, u, select="protein and name CA",
                             in_memory=True).run()
# ref = average structure
ref_coordinates = u.trajectory.timeseries(asel=protein).mean(axis=1)
# Make a reference structure (need to reshape into a
# 1-frame "trajectory").
ref = mda.Merge(protein).load_new(ref_coordinates[:, None, :],
                                  order="afc")

We created a new universe reference that contains a single frame with the averaged coordinates of the protein. Now we need to fit the whole trajectory to the reference by minimizing the RMSD. We use MDAnalysis.analysis.align.AlignTraj:

aligner = align.AlignTraj(u, ref, select="protein and name CA",
                          in_memory=True).run()
# need to write the trajectory to disk for PMDA 0.3.0 (see issue #15)
with mda.Writer("rmsfit.xtc", n_atoms=u.atoms.n_atoms) as W:
    for ts in u.trajectory:
        W.write(u.atoms)

(For use with PMDA we cannot currently use a in-memory trajectory (see Issue #15) so we must write out the RMS-superimposed trajectory to the file “rmsfit.xtc”.)

The trajectory is now fitted to the reference (the RMSD is stored as aligner.rmsd for further inspection). Now we can calculate the RMSF:

from pmda.rms import RMSF

u = mda.Universe(TPR, "rmsfit.xtc")
calphas = protein.select_atoms("protein and name CA")

rmsfer = RMSF(calphas).run()

and plot:

import matplotlib.pyplot as plt
plt.plot(calphas.resnums, rmsfer.rmsf)

New in version 0.3.0.

Parameters
  • Universe (Universe) – a MDAnalysis.core.groups.Universe (the atomgroups must belong to this Universe)

  • atomgroups (tuple of AtomGroup) – atomgroups that are iterated in parallel

_results

The raw data from each process are stored as a list of lists, with each sublist containing the return values from pmda.parallel.ParallelAnalysisBase._single_frame().

Type

list

readonly_attributes()

Set the attributes of this class to be read only

Useful to avoid the class being modified when passing it around.

To be used as a context manager:

with analysis.readonly_attributes():
    some_function(analysis)
run(start=None, stop=None, step=None, n_jobs=1, n_blocks=None)

Perform the calculation

Parameters
  • start (int, optional) – start frame of analysis

  • stop (int, optional) – stop frame of analysis

  • step (int, optional) – number of frames to skip between each analysed frame

  • n_jobs (int, optional) – number of jobs to start, if -1 use number of logical cpu cores. This argument will be ignored when the distributed scheduler is used

  • n_blocks (int, optional) – number of blocks to divide trajectory into. If None set equal to n_jobs or number of available workers in scheduler.

See also

MDAnalysis.analysis.rms.RMSF