Writing new parallel analysis¶
At the core of PMDA is the idea that a common interface makes it easy to write analysis code that parallelizes well, especially when the analysis can be split into independent work on multiple trajectory segments (or “slices”), followed by a final step that combines the data from all slices.
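The split, analyze, and combine pattern can be illustrated with a small dependency-free sketch. It is not PMDA's implementation: the helper names `make_slices`, `analyze_slice`, and `run_parallel` are hypothetical, frames are represented by bare integer indices, and a thread pool is used only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def make_slices(n_frames, n_blocks):
    """Split range(n_frames) into n_blocks contiguous, nearly equal slices."""
    base, extra = divmod(n_frames, n_blocks)
    slices, start = [], 0
    for i in range(n_blocks):
        stop = start + base + (1 if i < extra else 0)
        slices.append(range(start, stop))
        start = stop
    return slices


def analyze_slice(frames, single_frame):
    """Independent work: apply the per-frame function to every frame of one slice."""
    return [single_frame(frame) for frame in frames]


def run_parallel(single_frame, n_frames, n_blocks):
    """Process the slices concurrently, then combine the per-slice results in order."""
    slices = make_slices(n_frames, n_blocks)
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        per_slice = list(pool.map(lambda s: analyze_slice(s, single_frame), slices))
    # Final step: concatenate the slice results into one time series.
    return [value for block in per_slice for value in block]


# Toy "observable": the square of the frame index stands in for a real analysis.
timeseries = run_parallel(lambda frame: frame ** 2, n_frames=10, n_blocks=3)
```

Because `pool.map` returns results in the order of its inputs, the combined time series stays in frame order even if the slices finish at different times.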
Wrapping existing analysis functions¶
If you already have a function that
- takes one or more AtomGroup instances as input,
- analyzes one frame in a trajectory and returns the result for this frame,
then you can use the helper functions in pmda.custom to rapidly create a parallel analysis class that follows the common PMDA API.
Example: Parallelizing radius of gyration¶
For example, suppose we want to calculate the radius of gyration of a protein. We first create the protein AtomGroup:
import MDAnalysis as mda
# topology and trajectory are the paths to your own input files
u = mda.Universe(topology, trajectory)
protein = u.select_atoms("protein")
The following function calculates the radius of gyration of a protein given in AtomGroup ag:
def rgyr(ag):
    return ag.radius_of_gyration()
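For intuition about the single number that such a per-frame function returns, here is a dependency-free sketch of a mass-weighted radius of gyration. The standalone function and its plain-list inputs are illustrative assumptions; in practice MDAnalysis computes the value from the AtomGroup itself.

```python
import math


def radius_of_gyration(positions, masses):
    """Mass-weighted radius of gyration for one frame.

    positions: list of (x, y, z) tuples; masses: list of floats.
    """
    total_mass = sum(masses)
    # Center of mass, one Cartesian component at a time.
    com = [
        sum(m * p[k] for m, p in zip(masses, positions)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, positions)
    ) / total_mass
    return math.sqrt(msd)


# Two equal masses 2 length units apart: each sits 1 unit from the center of mass.
rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
```

The function depends only on the coordinates of one frame, which is what makes the calculation straightforward to distribute over trajectory slices.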
We can wrap rgyr() in pmda.custom.AnalysisFromFunction:
import pmda.custom
parallel_rgyr = pmda.custom.AnalysisFromFunction(rgyr, u, protein)
Run the analysis on 8 cores and show the timeseries that was collected in the results attribute:
parallel_rgyr.run(n_jobs=8)
print(parallel_rgyr.results)
Building PMDA analysis classes¶
With the help of pmda.parallel.ParallelAnalysisBase one can write new analyses that parallelize automatically. This approach provides more freedom than the one described in "Wrapping existing analysis functions" above.
1. Define the single frame analysis function, i.e., how to compute the observable for a single time step from a given AtomGroup.
2. Derive a class from ParallelAnalysisBase that uses the single frame function.
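The two steps can be sketched with a toy base class. `ToyParallelBase` is a hypothetical stand-in, not PMDA's ParallelAnalysisBase: the hook names `_single_frame` and `_conclude` and the `_results` attribute are assumptions made for this illustration, frames are plain lists of 1D coordinates, and the blocks are processed serially. Consult the ParallelAnalysisBase reference for the actual method signatures.

```python
class ToyParallelBase:
    """Hypothetical stand-in for a parallel analysis base class (not PMDA's)."""

    def __init__(self, frames):
        self._frames = list(frames)

    def _single_frame(self, frame):
        raise NotImplementedError  # subclasses define the per-frame observable

    def _conclude(self):
        raise NotImplementedError  # subclasses combine the per-block results

    def run(self, n_blocks=2):
        # Cut the frames into contiguous blocks; each could go to its own worker.
        size = max(1, -(-len(self._frames) // n_blocks))  # ceiling division
        blocks = [self._frames[i:i + size] for i in range(0, len(self._frames), size)]
        # Processed serially here to keep the sketch dependency-free.
        self._results = [[self._single_frame(f) for f in block] for block in blocks]
        self._conclude()
        return self


def max_extent(frame):
    """Step 1: the single frame function (largest minus smallest coordinate)."""
    return max(frame) - min(frame)


class Extent(ToyParallelBase):
    """Step 2: a class derived from the base that uses the single frame function."""

    def _single_frame(self, frame):
        return max_extent(frame)

    def _conclude(self):
        # Flatten the per-block lists into a single time series.
        self.results = [value for block in self._results for value in block]


frames = [[0.0, 1.0], [0.0, 2.5], [1.0, 4.0], [2.0, 2.0], [-1.0, 1.0]]
extent = Extent(frames).run(n_blocks=2)
```

Keeping the per-frame work and the final combination in separate methods lets the base class decide how frames are distributed without the derived class needing to know.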
As an example, we show how one can parallelize the RMSF function (from MDAnalysis.analysis.rms.RMSF):
TODO
more TODO
other example?
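One way the RMSF case could be approached is sketched here without MDAnalysis or PMDA. RMSF depends on the mean position over the whole trajectory, so per-frame values cannot simply be concatenated. Instead, each slice returns partial sums that are merged in the final step. The function names `block_sums` and `combine_rmsf` are hypothetical, and the sum-of-squares formula is chosen for brevity; production code would generally prefer a numerically more stable update scheme.

```python
import math


def block_sums(frames):
    """Partial sums for one slice.

    frames: list of frames, each a list of (x, y, z) tuples, one per atom.
    Returns (n_frames, per-atom coordinate sums, per-atom sums of |r|^2).
    """
    n_atoms = len(frames[0])
    s = [[0.0, 0.0, 0.0] for _ in range(n_atoms)]
    q = [0.0] * n_atoms
    for frame in frames:
        for i, (x, y, z) in enumerate(frame):
            s[i][0] += x
            s[i][1] += y
            s[i][2] += z
            q[i] += x * x + y * y + z * z
    return len(frames), s, q


def combine_rmsf(partials):
    """Merge the per-slice partial sums and return the RMSF of every atom."""
    n = sum(p[0] for p in partials)
    n_atoms = len(partials[0][1])
    rmsf = []
    for i in range(n_atoms):
        mean = [sum(p[1][i][k] for p in partials) / n for k in range(3)]
        mean_sq = sum(p[2][i] for p in partials) / n
        # <|r|^2> - |<r>|^2, clipped at zero to absorb round-off.
        variance = max(0.0, mean_sq - sum(c * c for c in mean))
        rmsf.append(math.sqrt(variance))
    return rmsf


# Atom 0 oscillates between x = -1 and x = +1; atom 1 never moves.
trajectory = [
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
slices = [trajectory[:2], trajectory[2:]]
rmsf = combine_rmsf([block_sums(s) for s in slices])
```

Merging the partial sums gives the same result as a single pass over the whole trajectory, which is the property a parallel version has to preserve.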