Data files
In this section you will download all the files that are required for the tutorial. This includes Python scripts and real-world data files.
Input files and scripts
Get the latest version of the tutorial:
git clone https://github.com/MDAnalysis/SPIDAL-MDAnalysis-Midas-tutorial.git
The Python scripts for the tutorial are located in SPIDAL-MDAnalysis-Midas-tutorial/rp and SPIDAL-MDAnalysis-Midas-tutorial/util.
Data
A real-world data set of about 1.5 GiB is available and should be downloaded well in advance of this tutorial. It consists of 400 trajectories of the conformational transition of the enzyme adenylate kinase [Seyler2015], sampled with either the dynamic importance sampling (DIMS) MD method [Beckstein2009] [Perilla2011] or the FRODA method [Farrell2010].
Download the whole archive as a 1.2 GiB tar.bz2 from https://becksteinlab.physics.asu.edu/pages/download/SPIDAL-tutorial-data.tar.bz2 and unpack with
tar -jxvf SPIDAL-tutorial-data.tar.bz2
(Alternatively, use the Dropbox PSA folder.)
This should create a directory tutorial-data. Do not change the contents of this directory because the primitive generate_file_list.py script (see below) has built-in assumptions about the directory structure.
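Before unpacking, the top-level entries of the archive can be listed from Python to confirm that it will unpack into tutorial-data. This is a minimal sketch that uses only the standard library tarfile module; the helper name top_level_entries is hypothetical and not part of the tutorial code. Listing a compressed archive requires reading through all of it, so expect this to take a while on the full download.

```python
import os
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive):
    """Return the set of top-level names stored in a bzip2-compressed tar archive."""
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
    # PurePosixPath drops a leading "./", so "./tutorial-data/x" and
    # "tutorial-data/x" both map to "tutorial-data".
    parts = (PurePosixPath(name).parts for name in names)
    return {p[0] for p in parts if p}


if __name__ == "__main__":
    archive = "SPIDAL-tutorial-data.tar.bz2"  # file name from the download step
    if os.path.exists(archive):
        print(top_level_entries(archive))
```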
Put the tutorial-data directory in the same directory as the tutorial code directory SPIDAL-MDAnalysis-Midas-tutorial.
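The expected layout (data and code as sibling directories) can be verified with a few lines of Python. This is only a sketch; the function name check_layout is hypothetical, and it checks nothing beyond the presence of the two directories named above.

```python
from pathlib import Path

# Directory names as given in this section.
EXPECTED = ["SPIDAL-MDAnalysis-Midas-tutorial", "tutorial-data"]


def check_layout(base):
    """Return the expected sibling directories that are missing under `base`."""
    return [name for name in EXPECTED if not (Path(base) / name).is_dir()]


if __name__ == "__main__":
    missing = check_layout(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks as expected.")
```

Run it from the directory that should contain both the code and the data; an empty list from check_layout means both directories were found.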
Work directory
Perform all runs in a directory WORK (here we assume that WORK is in the same directory as the data and the tutorial code):
mkdir WORK
cd WORK
Generate JSON file
We need a starting JSON file that lists the pairs of (topology, trajectory) files with absolute paths. This is a shortcut for this tutorial. In general, managing large collections of simulation trajectories is a difficult topic, and tools such as MDSynthesis can be employed.
Generate the input file for the RP script rp/rp_psa.py with the helper script util/generate_file_list.py:
cd WORK # if you are not already in WORK
RPDIR=$PWD/../SPIDAL-MDAnalysis-Midas-tutorial/rp
$RPDIR/../util/generate_file_list.py -T .. trajectories.json
(The script just needs the path to the directory that contains the data and an output file name.) This creates the list with all 2 x 200 = 400 trajectories.
Also create a smaller list for testing (only 2 x 5 = 10 trajectories):
$RPDIR/../util/generate_file_list.py -T .. -e 5 testcase.json
You can have a look at the JSON files with less or a text editor. It is important that the files are listed with absolute paths to make sure that the radical.pilot script can find them.
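The absolute-path requirement can also be checked programmatically. The sketch below assumes that the JSON file holds a list of [topology, trajectory] pairs, as the description above suggests; the exact layout written by generate_file_list.py is not shown in this section, so treat that as an assumption. The helper names load_pairs and relative_paths are hypothetical.

```python
import json
import os


def load_pairs(filename):
    """Load the file list; assumed to be a JSON list of [topology, trajectory] pairs."""
    with open(filename) as f:
        return json.load(f)


def relative_paths(pairs):
    """Return every path in the pairs that is not absolute."""
    return [path for pair in pairs for path in pair if not os.path.isabs(path)]


if __name__ == "__main__":
    filename = "testcase.json"  # the small test list created above
    if os.path.exists(filename):
        pairs = load_pairs(filename)
        print(f"{len(pairs)} pairs, {len(relative_paths(pairs))} relative paths")
```

If the assumed layout holds, an empty list from relative_paths means that every entry is an absolute path.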