Google Summer of Code Students 2020
20 May 2020
We are happy to announce that MDAnalysis is hosting three GSoC students this year – @hmacdope, @cbouy, and @yuxuanzhuang. This is the first year that MDAnalysis has been accepted as its own organization with GSoC and we are grateful to Google for granting us three student slots so that we can have three exciting GSoC projects.
Hugo MacDermott-Opeskin: Trajectory New Generation: the trajectory format for the future of simulation
Trajectory storage has always proved problematic for the molecular simulation community, as large volumes of data can be generated quickly. Traditional trajectory formats suffer from poor portability, large file sizes and limited ability to include metadata relevant to simulation. The Trajectory New Generation (TNG) format developed by the GROMACS team represents the first trajectory format with small file sizes, metadata storage, archive integrity verification and user/software signatures. The primary goal of this project is for @hmacdope to refactor the existing TNG code into C++ to provide clarity and usability for GROMACS, other simulation packages and analysis tools. Thin FORTRAN and Python layers are also desirable to encourage widespread adoption and are a secondary goal of the project. An efficient and transferable implementation of the TNG format will represent a major step forward for the computational molecular sciences community, enabling easy storage and replication of simulations.
This project is a collaboration with the GROMACS developer team with @acmnpv from GROMACS serving as a co-mentor.
Hugo MacDermott-Opeskin is a PhD student in computational chemistry at the Australian National University. His work focuses on studying membrane biophysics through molecular dynamics simulations coupled with enhanced sampling techniques. Hugo can be found on github as @hmacdope and on twitter as @hugomacdermott. When not hard at work Hugo can be found running or mountain biking in the Canberra hills.
Through GSoC Hugo aims to bring the TNG trajectory format to the simulation community and he will document his experience at his “Biophysics Bonanza” blog.
Cédric Bouysset: From RDKit to the Universe and back
The aim of the RDKit interoperability project is to give MDAnalysis the ability to use RDKit’s Chem.Mol structure as an input to an MDAnalysis Universe, but also to convert a Universe or AtomGroup to an RDKit molecule. RDKit is one of the most complete and most commonly used chemoinformatics packages, yet it lacks file readers for formats typically encountered in MD simulations. @cbouy will implement in MDAnalysis the ability to switch back and forth between a Universe and an RDKit molecule to perform typical chemoinformatics calculations and so add a lot of value to both packages.
Cédric is a PhD student in molecular modelling at Université Côte D’Azur, France. His research aims to decipher the molecular basis of chemosensory perception (smell and taste) using computational tools. His day-to-day work includes modelling bitter taste receptors, building machine-learning models to search for molecules with interesting olfactive or sapid properties, maintaining the website of the Global Consortium of Chemosensory Researchers, and a bit of teaching. In his free time he enjoys cooking and playing video games. Cédric can be found on github as @cbouy and on twitter as @cedricbouysset.
Cédric will describe his progress in his blog.
Yuxuan Zhuang: Serialize Universes for parallel
As we approach the exascale barrier, researchers are handling increasingly large volumes of molecular dynamics (MD) data. Whilst MDAnalysis is a flexible and relatively fast framework for complex analysis tasks in MD simulations, implementing a parallel computing framework would play a pivotal role in accelerating the time to solution for such large datasets. To achieve a flawless implementation of parallelism, @yuxuanzhuang will implement serialization support for Universe, the core of MDAnalysis. Furthermore, he will adapt this new serialization functionality to accelerate MDAnalysis’ analysis modules using distributed computing frameworks, e.g. Dask, multiprocessing, or MPI.
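To illustrate why serialization matters for parallel analysis, here is a minimal sketch using only the Python standard library. It is not MDAnalysis code: ToyReader, its one-line-per-frame file format, and roundtrip_demo are hypothetical names invented for this example. The sketch assumes the usual obstacle, namely that an object holding an open file handle cannot be pickled as-is, and shows one common way around it: drop the handle when pickling and reopen the file when unpickling.

```python
import os
import pickle
import tempfile


class ToyReader:
    """Hypothetical stand-in for a trajectory reader; not MDAnalysis code."""

    def __init__(self, filename):
        self.filename = filename
        self.frame = 0
        self._file = open(filename, "r")  # open file handles cannot be pickled

    def read_frame(self, index):
        # Toy format (an assumption of this sketch): one frame per line.
        self._file.seek(0)
        for i, line in enumerate(self._file):
            if i == index:
                self.frame = index
                return line.strip()
        raise IndexError(index)

    def close(self):
        self._file.close()

    def __getstate__(self):
        # Drop the unpicklable handle; keep what is needed to rebuild it.
        state = self.__dict__.copy()
        del state["_file"]
        return state

    def __setstate__(self, state):
        # Restore the attributes and reopen the file on the receiving side.
        self.__dict__.update(state)
        self._file = open(self.filename, "r")


def roundtrip_demo():
    """Pickle a reader, restore it, and read a frame from the copy."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("frame0\nframe1\nframe2\n")
        path = tmp.name
    reader = ToyReader(path)
    reader.read_frame(1)
    clone = pickle.loads(pickle.dumps(reader))
    result = (clone.frame, clone.read_frame(2))
    reader.close()
    clone.close()
    os.remove(path)
    return result
```

Python's multiprocessing module uses pickle to pass objects between processes, so an object that survives this round trip can be handed to a worker process, which can then read its own share of the frames independently.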
Yuxuan is a PhD student at Stockholm University. He mainly works on understanding pentameric ligand-gated ion channels from MD simulations. His daily workflow involves setting up and running simulations, on lab clusters or HPC centers, and performing various analyses on the MD trajectories in Jupyter notebooks. Yuxuan can be found on github as @yuxuanzhuang.
Yuxuan will chronicle his work on his blog.
— @richardjgowers @IAlibay @acmnpv @fiona-naughton @orbeckst (mentors)