Interoperability roadmap

On June 18 2020, MDAnalysis was pleased to release the first major version, 1.0.0. As described in our 2019 roadmap, this is the last version that supports 2.7. We will continue backporting relevant bug fixes where feasible (e.g. the upcoming 1.0.1), but the next major release will be 2.0.0, which will support Python 3.6+.

As we look forward to this next milestone, it is time to consider the next directions of MDAnalysis. The development of MDAnalysis has always been driven by the growing need for standardised, accessible analysis tools for open, reproducible, and collaborative research. While many major packages for molecular dynamics simulation provide their own set-up and analysis software, these are necessarily targeted to their own particular standards. MDAnalysis aims to provide analysis tools for simulation data in general, so historically a key objective has been to expand the number of supported package-specific data formats. As of version 1.0.0, we support over 40 file formats used in major packages for both molecular dynamics and quantum chemistry.

In 1.0.0 we also began to explore an exciting new approach: direct interoperability with other popular packages for molecular analysis by becoming API compatible instead of just file-format compatible, an approach also reinforced by discussions at the 2019 MolSSI Workshop: Molecular Dynamics Software Interoperability. Our new converters are distinct from topology parsers and coordinate readers as a third avenue for loading data into MDAnalysis. In 1.0.0 we added converters for two libraries: the molecular editor ParmEd, and chemfiles, a library for reading data from computational chemistry formats.

The general lack of interoperability between software packages in the molecular modelling community has been highlighted in the 2019 report of the NSF MolSSI on Molecular Dynamics Software Interoperability, noting consequences such as great duplication of effort in developing and maintaining similar tools across different formats; significant barriers to collaborating and transferring data; and requiring scientists to learn multiple packages and languages to access the full breadth of available analysis algorithms.

Moving forward, our plan is to increase the range of analyses and formats accessible to users by becoming interoperable with other relevant libraries. This reduces the need to duplicate and support existing tools within our own framework, and allows MDAnalysis to become a general-purpose analysis toolkit. We are already in the process of expanding compatible libraries in 2.0.0 by adding support for the widely popular RDKit cheminformatics toolkit through a Google Summer of Code projects being carried out by Cédric Bouysset (@cbouy).

By the end of 2021, we aim to have expanded the range of our Converters framework to include packages in three categories: widely-used analysis libraries, such as MDTraj and pytraj; libraries that can expand the range of formats we can support, such as OpenBabel; and direct interfaces with computational chemistry engines such as OpenMM and Psi4.

Ensuring robust interoperability is best done as a community effort. If you are interested in contributing, or have comments or suggestions on our future directions, please get in touch!