Calculating Root-Mean-Square Fluctuations (RMSF) — pmda.rmsf
¶
This module contains parallel versions of analysis tasks in
MDAnalysis.analysis.rms
.
-
class
pmda.rms.rmsf.
RMSF
(atomgroup)[source]¶ Parallel RMSF analysis.
Calculates RMSF of given atoms across a trajectory.
-
rmsf
¶ N
-lengthnumpy.ndarray
array of RMSF values, whereN
is the number of atoms in the atomgroup of interest. Returned values have units of ångströms.- Type
array
- Parameters
atomgroup (AtomGroup) – Atoms for which RMSF is calculated
- Raises
ValueError – raised if negative values are calculated, which indicates that a numerical overflow or underflow occured
See also
Notes
No RMSD-superposition is performed; it is assumed that the user is providing a trajectory where the protein of interest has been structurally aligned to a reference structure (see the Examples section below). The protein also has be whole because periodic boundaries are not taken into account.
Run the analysis with
RMSF.run()
, which stores the results in the arrayRMSF.rmsf
.The root mean square fluctuation of an atom \(i\) is computed as the time average:
\[\sigma_{i} = \sqrt{\left\langle (\mathbf{x}_{i} - \langle\mathbf{x}_{i}\rangle)^2 \right\rangle}\]No mass weighting is performed. This method implements an algorithm for computing sums of squares while avoiding overflows and underflows [Welford1962], as well as an algorithm for combining the sum of squares and means of separate partitions of a given trajectory to calculate the RMSF of the entire trajectory [CGL1979].
For more details about RMSF calculations, refer to [Awtrey2019].
References
- Welford1962
B. P. Welford (1962). “Note on a Method for Calculating Corrected Sums of Squares and Products.” Technometrics 4(3):419-420.
Examples
In this example we calculate the residue RMSF fluctuations by analyzing the \(\text{C}_\alpha\) atoms. First we need to fit the trajectory to the average structure as a reference. That requires calculating the average structure first. Because we need to analyze and manipulate the same trajectory multiple times, we are going to load it into memory using the
MemoryReader
. (If your trajectory does not fit into memory, you will need to write out intermediate trajectories to disk or generate an in-memory universe that only contains, say, the protein):import MDAnalysis as mda from MDAnalysis.analysis import align from MDAnalysis.tests.datafiles import TPR, XTC u = mda.Universe(TPR, XTC, in_memory=True) protein = u.select_atoms("protein") # TODO: Need to center and make whole (this test trajectory # contains the protein being split across periodic boundaries # and the results will be WRONG!) # Fit to the initial frame to get a better average structure # (the trajectory is changed in memory) prealigner = align.AlignTraj(u, u, select="protein and name CA", in_memory=True).run() # ref = average structure ref_coordinates = u.trajectory.timeseries(asel=protein).mean(axis=1) # Make a reference structure (need to reshape into a # 1-frame "trajectory"). ref = mda.Merge(protein).load_new(ref_coordinates[:, None, :], order="afc")
We created a new universe
reference
that contains a single frame with the averaged coordinates of the protein. Now we need to fit the whole trajectory to the reference by minimizing the RMSD. We useMDAnalysis.analysis.align.AlignTraj
:aligner = align.AlignTraj(u, ref, select="protein and name CA", in_memory=True).run() # need to write the trajectory to disk for PMDA 0.3.0 (see issue #15) with mda.Writer("rmsfit.xtc", n_atoms=u.atoms.n_atoms) as W: for ts in u.trajectory: W.write(u.atoms)
(For use with PMDA we cannot currently use a in-memory trajectory (see Issue #15) so we must write out the RMS-superimposed trajectory to the file “rmsfit.xtc”.)
The trajectory is now fitted to the reference (the RMSD is stored as aligner.rmsd for further inspection). Now we can calculate the RMSF:
from pmda.rms import RMSF u = mda.Universe(TPR, "rmsfit.xtc") calphas = protein.select_atoms("protein and name CA") rmsfer = RMSF(calphas).run()
and plot:
import matplotlib.pyplot as plt plt.plot(calphas.resnums, rmsfer.rmsf)
New in version 0.3.0.
- Parameters
Universe (
Universe
) – aMDAnalysis.core.groups.Universe
(the atomgroups must belong to this Universe)atomgroups (tuple of
AtomGroup
) – atomgroups that are iterated in parallel
-
_results
¶ The raw data from each process are stored as a list of lists, with each sublist containing the return values from
pmda.parallel.ParallelAnalysisBase._single_frame()
.- Type
-
readonly_attributes
()¶ Set the attributes of this class to be read only
Useful to avoid the class being modified when passing it around.
To be used as a context manager:
with analysis.readonly_attributes(): some_function(analysis)
-
run
(start=None, stop=None, step=None, n_jobs=1, n_blocks=None)¶ Perform the calculation
- Parameters
start (int, optional) – start frame of analysis
stop (int, optional) – stop frame of analysis
step (int, optional) – number of frames to skip between each analysed frame
n_jobs (int, optional) – number of jobs to start, if -1 use number of logical cpu cores. This argument will be ignored when the distributed scheduler is used
n_blocks (int, optional) – number of blocks to divide trajectory into. If
None
set equal to n_jobs or number of available workers in scheduler.
-
See also
MDAnalysis.analysis.rms.RMSF