Blog

Chan Zuckerberg Initiative EOSS4 Award

MDAnalysis is excited to announce that we have been awarded an Essential Open Source Software for Science (EOSS) grant by the Chan Zuckerberg Initiative for our proposal “MDAnalysis: Faster, extensible molecular analysis for reproducible science”.

Over the next two years, this award will fund work on two major aims that will improve MDAnalysis for our users and make it easier for scientists to contribute code, with a view towards increasing code reuse, interoperability, and reproducibility:

(1) Improving the performance of essential core components by rewriting Python code in Cython

As MDAnalysis has now become a mature library with a stable API, we want to focus on improving the performance of core library components by rewriting essential code in Cython. This will enable us to handle the increasingly large and complex datasets generated to solve the biomedical challenges of tomorrow.

(2) Launching the “MDAKits” ecosystem

We will launch the MDAnalysis Toolkits (“MDAKit”) ecosystem. Inspired by SciPy’s SciKits, an MDAKit will contain code based on MDAnalysis that solves specific scientific problems. An MDAKit can be written by anyone and hosted anywhere. In order to streamline the process we will provide a cookiecutter-based template repository. MDAKits will have a well-defined API, have documentation, will be registered with MDAnalysis, and will be continuously tested for compatibility with the current version of MDAnalysis. As part of the registration process, code will be reviewed by MDAnalysis developers and we will encourage the scientific publication of the package, e.g., in the Journal of Open Source Software. We will also refactor a number of analysis modules as separate MDAKits. Overall, the introduction of MDAKits will enable the rapid development and dissemination of new components and tools that can be released independently of the main package. We will soon describe the details of MDAKits in a white paper and seek a discussion with the wider community. Ultimately, MDAKits should help address two long-standing challenges in the biomolecular simulation community: effort duplication and reproducibility.

Along the same lines, we will focus on software interoperability with other packages by expanding our converters framework.

By validating, maintaining and publicizing new code through a common process, we encourage scientists to reuse and extend existing packages rather than re-creating their own. We hope that this combined approach will improve consistency, transparency, and accessibility in methods used across scientific studies.

This project will be led by Irfan Alibay, Lily Wang, Fiona Naughton, Richard Gowers, and Oliver Beckstein.

About MDAnalysis

The MDAnalysis package is one of the most widely used Python-based libraries for the analysis and manipulation of molecular simulations. The objective of MDAnalysis is to provide simple, flexible, and efficient means of handling molecular structure data from simulations and experiments and support research in biophysics, biochemistry, materials science and beyond.

MDAnalysis is a fiscally-sponsored project of NumFOCUS, a nonprofit dedicated to supporting the open source scientific computing community.

MDAnalysis 2.0 is here

We are happy to release version 2.0.0 of MDAnalysis!

This is a new major version, and includes major API-breaking changes in addition to a large number of updates and enhancements. This release concludes our current roadmap; an updated roadmap detailing our future steps will be published shortly.

This version supports Python 3.6 through 3.9 on Windows (32- and 64-bit), macOS, and Linux. It also offers preliminary support for ppc64le and ARM64, but does not currently support Apple M1 processors.

Support for legacy versions of Python (e.g. 2.7 and 3.5) is only available in MDAnalysis 1.x releases.

Upgrading to MDAnalysis version 2.0.0

To upgrade with conda from the conda-forge channel, run

conda update -c conda-forge mdanalysis

To upgrade from PyPI with pip, run

pip install --upgrade MDAnalysis

For more help with installation see the installation instructions in the User Guide.

Notable new additions

There were many awesome contributions to MDAnalysis for version 2.0.0. Here we list a few notable examples. For more information please see the CHANGELOG.

RDKit converter

Google Summer of Code student @cbouy implemented a new RDKit converter which allows for seamless conversions from RDKit to MDAnalysis structures. See @cbouy’s GSoC RDKit report for more information.

It currently accurately converts 90% of chemical space (based on ChEMBL27), but future changes will improve this.

OpenMM converter

@ahy3nz implemented an OpenMM converter which can read OpenMM objects directly into MDAnalysis.

Universe serialization

Google Summer of Code student @yuxuanzhuang implemented a means to serialize MDAnalysis.Universe objects, paving the way for parallel analysis. See @yuxuanzhuang’s GSoC Universe serialization report for more information.

Mean Square Displacement

@hmacdope implemented a new Mean Square Displacement analysis class.
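
To illustrate the quantity that such an analysis reports, here is a minimal pure-Python sketch of a windowed mean square displacement, MSD(lag) = <|r(t + lag) - r(t)|^2>, averaged over all particles and all time origins. The function name `mean_square_displacement` and the nested-list trajectory layout are assumptions made for this illustration; this is not the MDAnalysis implementation or its API.

```python
def mean_square_displacement(positions):
    """Windowed MSD for a trajectory given as nested lists (hypothetical helper).

    positions[frame][particle] is an (x, y, z) tuple. Every frame is assumed
    to contain the same, non-zero number of particles in the same order.
    Returns a list whose entry at index `lag` is the MSD at that lag.
    """
    n_frames = len(positions)
    msd = []
    for lag in range(n_frames):
        total = 0.0
        count = 0
        # Average over every time origin that still has a frame `lag` steps later.
        for start in range(n_frames - lag):
            for r0, r1 in zip(positions[start], positions[start + lag]):
                total += sum((b - a) ** 2 for a, b in zip(r0, r1))
                count += 1
        msd.append(total / count)
    return msd


# One particle moving one length unit per frame along x.
trajectory = [[(float(t), 0.0, 0.0)] for t in range(5)]
print(mean_square_displacement(trajectory))
```

For this ballistic toy trajectory the MSD grows as the square of the lag; for diffusive motion it would instead grow linearly at long lag times.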

Results class

@PicoCentauri implemented a Results class to store results from analysis classes. This change streamlines the API for analysis classes and paves the way for a soon-to-be released command line interface for MDAnalysis.
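
The idea of a single results container can be sketched in a few lines: a dictionary whose entries are also reachable as attributes, so every analysis exposes its output in the same place. The class name `SketchResults` and the result names below are hypothetical; this is a simplified stand-in for illustration, not the actual MDAnalysis class.

```python
class SketchResults(dict):
    """Hypothetical dict-like container with attribute access to its entries."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() and items() keep working as usual.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(f"no result named {name!r}") from err

    def __setattr__(self, name, value):
        # Store attributes as dictionary entries so both views stay in sync.
        self[name] = value


results = SketchResults()
results.timeseries = [0.0, 1.2, 1.5]  # attribute-style assignment
results["n_frames"] = 3               # dictionary-style assignment
print(results["timeseries"], results.n_frames)
print(sorted(results))                # iterate over the stored result names
```

Keeping all output in one container means generic tools, such as a command line interface, can discover what an analysis produced without knowing its class-specific attribute names.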

H5MD support

NSF REU student @edisj implemented both a reader and writer for the H5MD format.

Bond-Angle-Torsion coordinates

@daveminh implemented a means to translate from Cartesian to Bond-Angle-Torsion coordinates.
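
As a rough illustration of what these internal coordinates are, the sketch below computes a bond length, a bond angle, and a torsion from the Cartesian positions of up to four atoms, using only the standard library. The helper names are assumptions made for this example; it is not the MDAnalysis implementation, which handles whole molecules rather than a single set of four atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def bond(p0, p1):
    """Distance between two atoms."""
    return _norm(_sub(p1, p0))


def angle(p0, p1, p2):
    """Angle p0-p1-p2 in degrees, with p1 at the vertex."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_theta = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def torsion(p0, p1, p2, p3):
    """Signed torsion p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)  # normals of the two planes
    return math.degrees(math.atan2(_norm(b1) * _dot(b0, n2), _dot(n1, n2)))


atoms = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)]
print(bond(atoms[0], atoms[1]), angle(*atoms[:3]), torsion(*atoms))
```

Internal coordinates of this kind separate stiff degrees of freedom (bonds and angles) from soft ones (torsions), which is why they are convenient for describing molecular conformations.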

Important fixes

For a full list of bugfixes see the CHANGELOG. The following are selected fixes that may have led to wrong results depending on your use case.

Core

  • Fixes an issue where select_atoms, AtomGroup.unique, ResidueGroup.unique, and SegmentGroup.unique did not sort the output atoms (Issues #3364 #2977)
  • Fixes the sometimes wrong sorting of atoms into fragments when unwrapping (Issue #3352)
  • AtomGroup.center now works correctly for compounds + unwrapping (Issue #2984)

File formats

  • Fixes support for DL_POLY HISTORY files that contain cell information even if there are no periodic boundary conditions (Issue #3314)
  • PDBWriter will use chainID instead of segID (Issue #3144)
  • PDBParser and PDBWriter now assign and use the element attribute (Issues #2422 #2423)

Analyses

  • Fixes issue when attempting to use/pass mean positions to PCA analysis (Issue #2728)
  • Fixes issue with WaterBridgeAnalysis double counting waters (Issue #3119)
  • Documents and fixes the density keyword for rdf.InterRDF_s (Issue #2811)
  • Fixed Janin analysis residue filtering, including CYSH (Issue #2898)

Changes to functionality

  • Deprecated hbonds.hbond_analysis has been removed in favour of hydrogenbonds.hbond_analysis (Issues #2739, #2746)
  • TPRParser now loads TPR files with tpr_resid_from_one=True by default, which starts TPR resid indexing from 1 (instead of 0 as in previous MDAnalysis versions) (Issue #2364, PR #3152)
  • analysis.hole has now been removed in favour of analysis.hole2.hole (Issue #2739)
  • Writer.write(Timestep) and Writer.write_next_timestep have been removed. Please use write() with an AtomGroup or Universe instead (Issue #2739)
  • Removes deprecated density_from_Universe, density_from_PDB, Bfactor2RMSF, and notwithin_coordinates_factory from MDAnalysis.analysis.density (Issue #2739)
  • Removes deprecated waterdynamics.HydrogenBondLifetimes (PR #2842)
  • hbonds.WaterBridgeAnalysis has been moved to hydrogenbonds.WaterBridgeAnalysis (Issue #2739 PR #2913)

Deprecations

  • The bfactors attribute is now aliased to tempfactors and will be removed in 3.0.0 (Issue #1901)
  • WaterBridgeAnalysis.generate_table() now returns table information, with the table attribute being deprecated
  • Various analysis result attributes which are now stored in Results will be deprecated in 3.0.0 (Issue #3261)
  • In 3.0.0 the ParmEd classes will only be accessible from the MDAnalysis.converters module
  • In 2.1.0 the TRZReader will default to a dt of 1.0 ps when failing to read it from the input TRZ trajectory

Author statistics

Altogether, this represents the work of 35 authors, 17 of whom were new contributors.

MDAnalysis thanks NumFOCUS for its continued support as the organisation’s fiscal sponsor.

  • The MDAnalysis Team

Google Summer of Code Students 2021

We are happy to announce that MDAnalysis is hosting two GSoC students this year: @ojeda-e and @orioncohen. MDAnalysis has been accepted as its own organization with GSoC for a second year running, and we are grateful to Google for granting us two student slots for two exciting projects. Both the students and mentors have a very exciting few months ahead!

Estefania Barreto-Ojeda: Curvature analysis of biological membranes

Interested in contributing to an open-source initiative, Estefania will expand the capabilities of MDAnalysis by adding a new module that calculates membrane curvature and derives and visualizes curvature profiles of protein-membrane and membrane-only systems obtained from Molecular Dynamics (MD) simulations. With this analysis module, users will be able to rapidly extract the mean and Gaussian curvature of biological membranes and visualize them in 2D-profile maps.

Estefania is a Ph.D. candidate in Biophysical Chemistry at The University of Calgary, Canada in the research group of Peter Tieleman. Her research work is focused on membrane curvature induced by ABC transporters, a superfamily of transmembrane proteins involved in cancer and antibiotic resistance. A typical day for Estefania includes running Coarse-Grained (CG) MD simulations using the Martini force field, reading literature on ABCs, and working on cool data visualization workflows. In her free time, she enjoys camping and road tripping in the Canadian Rocky Mountains and going for long bike rides.

Estefania can be found on GitHub as @ojeda-e and on Twitter as @ebojeda.

Her journey will be documented on the blog Le Mirroir.

Orion Cohen: A Solvation Module for MDAnalysis

The macroscopic behavior of a liquid is determined by its microscopic structure. For ionic systems, like batteries and many enzymes, the solvation environment surrounding ions is especially important. By studying the solvation of interesting materials, scientists can better understand, engineer, and design new technologies. The aim of this project is to implement a robust and cohesive set of methods for solvation analysis that would be widely useful in both biomolecular and battery electrolyte simulations. The core of the solvation module will be a set of functions for easily working with ionic solvation shells. Building from that core functionality, the module will implement several analysis methods for analyzing ion pairing, ion speciation, residence times, and shell association and dissociation.

Orion is a Ph.D. student at the University of California Berkeley, working with Dr. Kristin Persson at the Lawrence Berkeley National Laboratory. His work leverages high-throughput chemical simulations and machine learning to discover new materials for Lithium-ion batteries. Orion is passionate about making science more accessible, reproducible, and efficient with powerful open-source software. If he isn’t toiling at his computer, Orion is probably playing board games, camping, or relaxing in the temperate Berkeley sun.

Orion is on GitHub as @orioncohen and on Twitter as @orion__archer.

He will be sharing his experience with GSoC on his website.

@richardjgowers @IAlibay @fiona-naughton @orbeckst @lilyminium @hmacdope (mentors)