Blog

NumFOCUS Sponsored Workshop and Hackathon

We are happy to announce that, thanks to a NumFOCUS development grant, we will be hosting a two-day workshop and hackathon aimed at introducing researchers to MDAnalysis. The event is free to attend and will be held at Northwestern University, Evanston, IL, on the 12th and 13th of November 2018. In addition, small travel grants aimed at promoting diversity in STEM are available for people attending from other US institutions. Women and minorities are especially encouraged to apply!

The first day of the workshop will focus on covering the basics of the Python programming language and an introduction to the MDAnalysis library. The second day will cover how to apply MDAnalysis to your own research data. The workshop will conclude with a hackathon which will teach you how to start contributing to open source software.

Workshop program details

Registration

To attend the workshop, please register here. Space is limited and registration closes on October 15th.

Program and Materials

The workshop materials are available online for attendees and anyone else interested.

Google Summer of Code Students 2018

We are happy to announce that MDAnalysis is hosting two GSoC students for NumFOCUS this year: Ayush Suhane (@ayushsuhane on GitHub) and Davide Cruz (@davidercruz).

Ayush Suhane: Improve Distance Search Methods in MDAnalysis

Multiple MD codes can now easily handle millions of atoms, so a major roadblock to the analysis of this vast amount of data, namely the positions of every atom at every timestep, is the time needed to evaluate pairwise distances between atoms. Because almost every operation requires the distance between pairs of atoms, fast calculation of pairwise distances is of the utmost importance. Basic analysis functions such as radial distribution functions and contact matrices depend very heavily on fast distance evaluations. Apart from the naive approach to pairwise calculations, which scales as \(\mathcal{O}(N^2)\), other data structures such as KDTree and Octree have been suggested for faster calculations, depending on the requirements. Within MDAnalysis, two use cases have been identified as heavily used in the majority of the analysis algorithms. The goal of the project is to identify the data structure that best meets the requirements of each use case and to implement it in the MDAnalysis library, along with clear documentation and test cases.
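To make the scaling argument concrete, here is a minimal pure-Python sketch. It is not the code developed in the project: it compares the naive all-pairs search with a uniform grid (cell list), a simpler relative of the KDTree and Octree structures mentioned above. The function names, the choice of a cell list, and the random test coordinates are assumptions made for illustration, and periodic boundaries are ignored.

```python
import itertools
import math
import random
from collections import defaultdict


def _dist2(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pairs_brute_force(coords, cutoff):
    """Return all index pairs (i, j), i < j, closer than cutoff; O(N^2)."""
    found = set()
    for i, j in itertools.combinations(range(len(coords)), 2):
        if _dist2(coords[i], coords[j]) < cutoff ** 2:
            found.add((i, j))
    return found


def pairs_cell_list(coords, cutoff):
    """Same result, but only compares atoms in the same or adjacent cells."""
    # Bin every atom into a cubic cell whose edge length equals the cutoff,
    # so any pair within the cutoff sits in the same or a neighbouring cell.
    cells = defaultdict(list)
    for idx, pos in enumerate(coords):
        key = tuple(math.floor(c / cutoff) for c in pos)
        cells[key].append(idx)

    found = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating.
            neighbours = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbours:
                    if i < j and _dist2(coords[i], coords[j]) < cutoff ** 2:
                        found.add((i, j))
    return found


# Hypothetical test system: 200 random points in a 10 x 10 x 10 box.
random.seed(0)
coords = [tuple(random.uniform(0.0, 10.0) for _ in range(3)) for _ in range(200)]
same = pairs_brute_force(coords, 1.5) == pairs_cell_list(coords, 1.5)
```

Both functions return the same set of pairs, but the grid version only inspects atoms in neighbouring cells, which is what makes spatial data structures attractive for large systems.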

Ayush is a graduate student in Materials Engineering at UBC, Canada. He is working with molecular dynamics simulations for his Master’s thesis. He wishes to contribute to the open source community by simplifying the complex analysis and visualization involved in MD. During GSoC, he aims to learn from established open source contributors and to continue the tradition by becoming an active member of the community. In his free time, he likes to read fiction and play computer games.

Ayush will describe his progress on his blog.

Davide Cruz: Implementing on-the-fly coordinate transformations

The goal of this project is to implement trajectory transformations in the MDAnalysis API that the user can call on the fly. This eliminates the need for multiple intermediate steps of modifying and saving the trajectory and gives users a simpler and more efficient workflow for analysing simulation data.
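The idea can be sketched in plain Python. The snippet below is a hypothetical illustration, not the MDAnalysis API: `translate`, `center_on_origin`, `read_frames`, and the list-of-tuples frame format are names and simplifications invented here. Each transformation is a function that takes a frame and returns a modified frame, and the reader applies the chain lazily as frames are requested, so no intermediate trajectory is written to disk.

```python
def translate(vector):
    """Build a transformation that shifts every position by `vector`."""
    def apply(frame):
        return [tuple(c + v for c, v in zip(pos, vector)) for pos in frame]
    return apply


def center_on_origin(frame):
    """Shift the frame so that its geometric centre sits at the origin."""
    n = len(frame)
    centre = [sum(pos[k] for pos in frame) / n for k in range(3)]
    return [tuple(c - m for c, m in zip(pos, centre)) for pos in frame]


def read_frames(raw_frames, transformations=()):
    """Yield frames one at a time, applying each transformation in order."""
    for frame in raw_frames:
        for transform in transformations:
            frame = transform(frame)
        yield frame


# Two frames of a two-atom toy system (hypothetical data).
trajectory = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0)],
]

# Centre each frame, then lift it by 5 units along z, without saving anything.
frames = list(read_frames(trajectory, [center_on_origin, translate((0.0, 0.0, 5.0))]))
```

The raw `trajectory` is left untouched; the transformed coordinates exist only for the frame currently being read.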

Davide is currently in the last year of his PhD in Molecular Biosciences at ITQB-NOVA in Lisbon, Portugal. For his thesis he is using MDAnalysis to analyse the results of molecular dynamics simulations, and this GSoC project is an opportunity to contribute to the community. He expects to learn a lot about Python and software development during this summer.

Davide will describe his progress on his blog.

Other NumFOCUS students

NumFOCUS is hosting 45 students this year for several of their supported and affiliated projects. You can find out about the other students here.

Release 0.18.0

MDAnalysis version 0.18.0 has been released.

This release brings various fixes and new features and users should update with either pip install -U MDAnalysis or conda install -c conda-forge mdanalysis.

One exciting new feature is the addition of duecredit to keep track of which citations are appropriate. Once you have written an analysis script (e.g., myanalysis.py) and have installed the duecredit package (pip install duecredit), you can either set the environment variable DUECREDIT_ENABLE=yes and run your script with python myanalysis.py, or run python -m duecredit myanalysis.py, to be given a report of the citable software you have used. You can then use the data written by duecredit to export the bibliography and have it ready to be imported into your reference manager. We hope that this will allow all contributors of analysis packages within MDAnalysis to get properly cited, and we are working on retroactively adding all required citations to duecredit.

The AtomGroup.groupby method now supports grouping by multiple attributes. For example, one could create a dictionary that allows a particular combination of residue name and atom name to be queried quickly:

>>> grouped = ag.groupby(['resnames', 'names'])

>>> grouped['MET', 'CA']
<AtomGroup with 6 atoms>
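Conceptually, grouping by several attributes yields a dictionary whose keys are tuples of attribute values. The pure-Python sketch below mimics that behaviour on a hypothetical list of atom records; it does not use MDAnalysis, and `group_by` and the record layout are illustrative only.

```python
from collections import defaultdict

# Hypothetical atom records standing in for an AtomGroup.
atoms = [
    {"index": 0, "resname": "MET", "name": "N"},
    {"index": 1, "resname": "MET", "name": "CA"},
    {"index": 2, "resname": "GLY", "name": "CA"},
    {"index": 3, "resname": "MET", "name": "CA"},
]


def group_by(records, attributes):
    """Group record indices by the tuple of values of the given attributes."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in attributes)
        groups[key].append(record["index"])
    return dict(groups)


grouped = group_by(atoms, ["resname", "name"])
# grouped["MET", "CA"] holds the indices of all CA atoms in MET residues.
```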

When writing GRO files, previously all atoms would have their indices reset so that they ran sequentially from 1. To preserve their original index, the reindex option has been added to the GROWriter. For example:

>>> u = mda.Universe()

>>> u.atoms.write('out.gro', reindex=False)

or

>>> with mda.Writer('out.gro', reindex=False) as w:
...     w.write(u.atoms)
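The effect of the option can be pictured without MDAnalysis. In the sketch below, `gro_atom_numbers` is a hypothetical helper, not part of the library, and the ids are made up: it returns the atom numbers that would be written for a selection, either renumbered from 1 or kept as they were in the source.

```python
def gro_atom_numbers(original_ids, reindex=True):
    """Return the atom numbers to write for a selection of atoms."""
    if reindex:
        # Renumber sequentially from 1, discarding the original numbering.
        return list(range(1, len(original_ids) + 1))
    # Keep whatever numbers the atoms carried in the source topology.
    return list(original_ids)


# A selection taken from the middle of a larger system (hypothetical ids).
selection_ids = [101, 102, 250]
renumbered = gro_atom_numbers(selection_ids)
preserved = gro_atom_numbers(selection_ids, reindex=False)
```

Keeping the original numbers is useful when the written subset has to be matched back against the full system later.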

Gromacs users can benefit from a new feature when reading TPR files. Now, when the topology is read from a TPR file, the atoms have a moltype and a molnum attribute. The moltype attribute is the molecule type as defined in ITP files, the molnum attribute is the index of the molecule. These attributes can be accessed for an atom group using the plural form:

>>> u = mda.Universe(TPR, XTC)
>>> u.atoms.moltypes
>>> u.atoms.molnums

These attributes can be used in groupby:

>>> u.atoms.groupby('moltypes')
>>> u.atoms.groupby('molnums')

to provide access to all atoms of a specific molecule type or that are part of a particular molecule.

The AtomGroup.split method can also work on molecules:

>>> u.atoms.split('molecule')

and will create a list of AtomGroup instances, one for each molecule.
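As a rough picture of what splitting by molecule does, the following pure-Python sketch partitions a hypothetical list of per-atom molecule numbers into one index list per molecule. It does not call MDAnalysis; `split_by_molecule` and the sample data are made up for illustration.

```python
def split_by_molecule(molnums):
    """Return one list of atom indices per molecule, ordered by molecule number."""
    groups = {}
    for atom_index, molnum in enumerate(molnums):
        groups.setdefault(molnum, []).append(atom_index)
    return [groups[m] for m in sorted(groups)]


# Per-atom molecule numbers for two three-atom waters and one ion (hypothetical).
molnums = [0, 0, 0, 1, 1, 1, 2]
fragments = split_by_molecule(molnums)
```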

For convenience, various Group classes have been moved to the top namespace (namely, AtomGroup, ResidueGroup, SegmentGroup):

import MDAnalysis as mda

u = mda.Universe(topology, trajectory)

# for creating AtomGroups from arrays of indices
ag = mda.AtomGroup([11, 15, 16], u)

# or for checking an input in a function:
def myfunction(thing):
    if not isinstance(thing, mda.AtomGroup):
        raise TypeError("myfunction requires AtomGroup")

And finally, this release includes fixes for many bugs, along with other improvements: a smaller memory footprint when reading NetCDF trajectories, better handling of time when reading DCD trajectories, and added support for Gromacs 2018 TPR files. For more details see the CHANGELOG entry for release 0.18.0.

As ever, this release of MDAnalysis is the product of a collaboration between researchers around the world and features the work of 12 different contributors. We would especially like to welcome and thank our six new contributors: Ayush Suhane, Mateusz Bieniek, Davide Cruz, Navya Khare, Nabarun Pal, and Johannes Zeman.