Towards MDAnalysis 3.0

Overview

MDAnalysis is beginning to plan for our 3.0 release! Following discussions at our first UGM, we (the MDAnalysis Core Developers) have decided on a set priorities for this exciting milestone (detailed below). Our primary goals can be divided into several main themes.

A trimmed down core library

As MDAnalysis continues to grow along with the field of molecular simulations, we are aiming to continue to provide a highly functional, easily maintainable package that can be the foundation of a wide variety of downstream tools.

Following the successful introduction of the MDAKits ecosystem (modelled on the SciKit ecosystem, see the MDAKits paper) as part of our EOSS4 work, which provides infrastructure for creating FAIR packages that address specific scientific questions in the analysis of molecular simulation, we are aiming to move some more specialized analyses currently present in the MDAnalysis core library to MDAKits. Several analysis modules, including hole2 (pore and cavity analysis), encore (ensemble similarity analysis), psa (path similarity analysis), helanal (helix orientation) and waterdynamics (water orientation and dynamicss) will become stand-alone MDAKits and are going to be removed from MDAnalysis.analysis in 3.0. Some of these packages are already deployed as MDAKits and are deprecated in the core library.

This change will allow for a reduction in the core dependencies of MDAnalysis and reduce the maintenance burden of specialist analyses in the core library. This will also empower those developing specialist analysis code to rapidly develop and prototype their code in well supported downstream packages. Additionally, core developers will be able to spend more cycles on core competencies, fixing critical issues and planning future improvements, ensuring that MDAnalysis can be easily and effectively maintained for years to come.

For at least the 3.x life cycle, the MDAnalysis development team will fully maintain MDAKits that have been moved out of the core (unless other maintainers take over). We are, however, very much looking for community members who are willing to participate in MDAKit maintenance or want to take over maintenance of a kit. If some functionality is important for your research, work with us: We will teach you the basics of package maintenance and you will perform a great service to the molecular simulation community!

Enhancements

We have identified several areas that we would like to target for future improvement of MDAnalysis across four key themes.

  1. Improving interoperability
    • Add support for new formats
    • Moving converters to MDAKits
    • Improve interoperability with chemoinformatics and simulation software
  2. Scientifically aware operations
    • Improve guessers to enable context-dependent chemical information to be easily inferred
    • Make MDAnalysis more chemoinformatics-aware
    • Improve support for units and unit systems
  3. Support for streaming and cloud storage
    • Add support for generic streams in file readers
    • Enable reading of trajectories and topologies directly from cloud storage (eg AWS S3)
  4. Enhanced parallelism and performance
    • Improve performance of key distance routines with distopia
    • Create a compiled backend for ASCII topology and coordinate readers
    • Redesign precision model in MDAnalysis internals to guarantee user-controlled precision
    • Add support for parallelism via a Dask based parallel backend (continuation of a GSOC project)

We aim to lay the groundwork for these future directions in our work for MDAnalysis 3.0.

User facing API changes

We are also planning to clear up some API inconsistencies scattered across the library that require an API break, including some instances where coordinate readers behave differently relative to each other, renaming some arguments to clearer names and addressing other miscellanea. Additionally, we are aiming to provide a more consistent and usable low-level Cython API surface that compiled extensions can hook into. We are also investigating the feasibility of improving consistency of precision handling in various parts of the MDAnalysis core library, as the use of double and mixed precision has been identified as a key performance bottleneck.

Improving organization structure and formalising roles and responsibilities.

We have identified that assigning direct roles and responsibilities across our Core Developers is a change that will be beneficial for the health of our community going forward. To this end we are workshopping a more formal organization structure that will make our internal processes more transparent and allow more functional redundancy in key roles. Watch this space!

If you have any feedback on this roadmap or would like to discuss our future plans with us, contact us on our Discord or mailing list.

– Hugo MacDermott-Opeskin on behalf of the MDAnalysis Core Developers