Blog

Relicensing MDAnalysis


This blog post outlines MDAnalysis’ proposal to change its license to the GNU Lesser General Public License (LGPL v3+).

We summarise our reasons for proposing this license change and outline the upcoming actions for community members and library contributors.

⚠️ Disclaimer: The MDAnalysis core team members are not lawyers. As such, the information provided here does not, and is not intended to, constitute legal advice. This blog post also does not represent MDAnalysis’ full legal position on software licensing; it simply aims to inform MDAnalysis developers and users on why we believe the library should be relicensed.

Further information on open-source software licensing can be found from sources such as the Open Source Initiative, tl;drLegal and the Software Sustainability Institute.

Should you have any concerns about licensing, we always strongly recommend getting legal advice before making any decisions on how licensing changes may affect you.

Overview

We want to change the license of MDAnalysis from the GNU General Public License v2 (or any later versions) (GPL v2+) to the less restrictive GNU Lesser General Public License v3 (or any later versions) (LGPL v3+) license. Both are open source licenses but it is our view that the LGPL v3+ will give developers more freedom in how they license any of their own codes that make use of MDAnalysis.

As detailed by the Open Source Definition, licenses are core to the definition of open source. “Open source doesn’t just mean access to the source code”. The license defines how code can be used, copied, changed, and incorporated into other code.

License changes will affect how people interact with the MDAnalysis code base going forward. We need the agreement of our contributors and community members to change from GPL v2+ to LGPL v3+.

In this post we want to share our motivation, outline the relicensing process, and invite comments / questions from the community.

Rationale for license change

Why is GPL v2+ no longer the best choice?

Since its initial release in 2008, MDAnalysis has grown from a small Python package used by a handful of enthusiastic graduate students and postdocs to a mature library that is used by thousands of researchers in the molecular sciences. The MDAnalysis library was published under an open source license from the start so that anyone could freely use it, contribute to it, and build on it. We chose the GNU General Public License version 2+ (GPL v2+) for this purpose. The GPL v2+ has a “copyleft” clause that requires anyone using MDAnalysis in their own code to also adopt a compatible version of the GPL for their code. This gave code contributors confidence that the time and work they invested in MDAnalysis would not end up in software without an open-source license.

However, the GPL v2+ has also created barriers to adoption of MDAnalysis. Under many interpretations, ours included, the copyleft clause prevents developers who use MDAnalysis from making their own code available under non-GPL licenses. The MDAnalysis core team does not want to dictate how our developers and users license their code, but we do want work on the MDAnalysis library itself to remain open and free.

Changing to a less restrictive license would benefit the MDAnalysis community, increasing the number of codes which can use MDAnalysis, and enabling users in corporate environments to use the library with more certainty. The reduced licensing complexity also paves the way for our proposed MDAKit ecosystem.

Why now the LGPL v3+?

We therefore propose to undergo the process of relicensing MDAnalysis under the GNU Lesser General Public License v3 (or any later versions). This open source license fulfills a number of important requirements for us:

  1. Downstream codes are able to freely import or link to MDAnalysis library components without impacting the license choice of the downstream code.

  2. Downstream codes are able to use and subclass any MDAnalysis components under its application programming interface (classes, methods, and data objects), without impacting the license choice of the downstream code.

  3. Codes that either copy or extend the MDAnalysis library should fall under the copyleft license requirements of the MDAnalysis library license.

Thus, it is our view that the LGPL v3+ license gives people the freedom to choose any license for their own code that makes use of the MDAnalysis library as a whole (namely through import MDAnalysis or subclassing). This includes closed / commercial licenses (although we encourage the use of open source licenses). However, one would not be able to just take parts of the MDAnalysis code and add them to another codebase unless this other code is then also licensed under a compatible copyleft license (e.g. GPLv3+/LGPLv3+).

We considered other popular licenses but none fulfilled the requirements listed above.

How will the relicensing process work?

As of writing, MDAnalysis has over 160 contributors, all of whom have contributed code under the terms of the GPL v2+ license. We also have a large user community that uses the library for many wonderful scientific applications, including several downstream libraries.

Ultimately, the final decision on relicensing rests with code authors. However, we fully recognise that this is a big change for the MDAnalysis user base and the wider molecular sciences community. As always, we are fully invested in ensuring that our actions reflect the needs of our community. We therefore want to give everyone an opportunity to ask questions about or comment on the relicensing effort as part of this process.

Consultation period (7th November until 5th December 2022)

We will start the process with an open consultation period lasting 28 days from 7th November to 5th December 2022.

During this period we encourage members of the community, both developers and users, to comment on and ask questions about the proposed relicensing efforts. The aim is to ensure that relicensing is indeed in the interest of the community. We will do our best to account for any concerns raised before attempting to continue with the long and time-consuming process of relicensing.

We wish to open this conversation on our public forums (mailing lists, Discord, Twitter). As legal matters such as licensing can sometimes be sensitive in nature, we have also set up an email address ([email protected]) monitored solely by the MDAnalysis Core Developers for any private queries that you may have.

A summary of open discussions and frequently asked questions will be made available on the MDAnalysis wiki.

Note: Whilst the consultation will only last 28 days, we will continue to engage with conversations on this topic for the entire length of the relicensing process.

Contacting contributors (6th December onwards)

After the consultation period, we will contact every code contributor to the core MDAnalysis library with a request to agree to changing their contribution’s license from the current “GPL v2 or any later version” to “LGPL v3 or any later version”.

It is important that we hear back from as many contributors as possible. If you have contributed to MDAnalysis in the past but have since changed your git-linked contact details, we would kindly ask if you could email [email protected] to let us know how best to contact you.

License change

We do not know how long relicensing will take, especially as contacting historical contributors will likely be a very slow process. Nevertheless, our aim is to change the license as quickly as possible. We will keep the community regularly updated on our progress.

Acknowledgments

We are very grateful for the administrative and legal support from our fiscal sponsor, NumFOCUS.

– The MDAnalysis Core Developers

MDAnalysis 2.3 is out

We are happy to release version 2.3.0 of MDAnalysis!

This relatively small update to MDAnalysis reflects our commitment to doing more frequent (trimonthly) releases of the library.

In line with NEP 29, the minimum required NumPy version has been raised to 1.20.0 (1.21.0 for macosx-arm64).

Supported Python versions: 3.8, 3.9, 3.10

Supported Operating Systems:

Upgrading to MDAnalysis version 2.3.0

To update with conda from the conda-forge channel run

conda update -c conda-forge mdanalysis

To update from PyPI with pip, run

pip install --upgrade MDAnalysis

For more help with installation see the installation instructions in the User Guide.
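After upgrading, it can be useful to confirm which versions actually ended up in your environment. The following is a small sketch using only the Python standard library (importlib.metadata, available from Python 3.8). The helper names installed_version, release_tuple and meets_minimum are our own for this example and are not part of MDAnalysis; the minimum versions are the ones stated above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Turn '2.3.0' into (2, 3, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    # Pad short versions such as '1.0' so that comparisons are consistent.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(distribution, minimum):
    """True if the distribution is installed and is at least `minimum`."""
    found = installed_version(distribution)
    return found is not None and release_tuple(found) >= release_tuple(minimum)


for name, minimum in [("MDAnalysis", "2.3.0"), ("numpy", "1.20.0")]:
    print(name, installed_version(name), meets_minimum(name, minimum))
```

This simple parser only compares the numeric release part of a version; for anything more involved, a dedicated version-parsing library is the safer choice.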

Notable changes

For a full list of changes, bugfixes and deprecations see the CHANGELOG.

Fixes:

  • Fixed reading error when dealing with corrupt PDB CONECT records, and an issue where MDAnalysis would write out unusable CONECT records with index>100000 (Issue #988).

Enhancements:

  • Formal charges are now read from PDB files and stored in a formalcharge attribute (PR #3755).
  • A new norm parameter selects the normalization used by the InterRDF and InterRDF_s analysis methods (Issue #3687).
  • Improved Universe serialization performance (Issue #3721, PR #3710).

Changes:

  • To install optional packages for different file formats supported by MDAnalysis, use pip install ./package[extra_formats] (Issue #3701, PR #3711).

Deprecations:

  • The extra_requires target AMBER for pip install ./package[AMBER] will be removed in 2.4.0. Use extra_formats (Issue #3701, PR #3711).

CZI EOSS Performance Improvements:

A series of performance improvements to the MDAnalysis library’s backend have been made as per planned work under MDAnalysis’ Chan Zuckerberg Initiative EOSS4 grant. Further details about these will be provided in a future blog post.

  • MDAnalysis.lib.distances now accepts AtomGroups as well as NumPy arrays (PR #3730).
  • Timestep has been converted to a Cython Extension type (PR #3683).

Author statistics

This release was the work of 10 contributors, 3 of whom are new contributors.

Our new contributors are: @miss77jun @rzhao271 @hsadia538

Acknowledgements

MDAnalysis thanks NumFOCUS for its continued support as our fiscal sponsor and the Chan Zuckerberg Initiative for supporting MDAnalysis under an EOSS4 award.

— The MDAnalysis Team

MDAKits

As part of our CZI EOSS 4 grant we announced our plans to create an MDAKit ecosystem. With this post we aim to make our plans more concrete and solicit feedback from the community.

Beyond the outline provided here, the complete details of our plans can be found in our white paper named MDAKits: Supporting and promoting the development of community packages leveraging the MDAnalysis library [v0.1.0], available as a PDF at DOI 10.6084/m9.figshare.20520726.v1.

What is an MDAKit?

MDAKits are standalone packages containing code that uses MDAnalysis components to solve a specific scientific problem or to enhance the functionality of the MDAnalysis core library in some form. An MDAKit can be written by anyone and hosted anywhere.

An MDAKit can be registered in the MDAKits registry. In this case, it has to fulfill a number of additional requirements, such as an open-source license, hosting in a version control system, clear designation of authors/maintainers, documentation, tests, and continuous integration. Registered MDAKits will be listed publicly and thus be advertised to the whole MDAnalysis community. They will also be continuously tested against the latest released version and the current development version of the core MDAnalysis library so that users and developers have an up-to-date view of the code health of an MDAKit.

Why?

The open sharing of code that abides by the basic principles of FAIR (findability, accessibility, interoperability, and reusability) is essential to robust, reproducible, and transparent science. However, scientists typically are not supported in making the substantial effort required to make software FAIR-compliant, or incentivized with academic recognition or reward.

Our goal with MDAKits is to lower the barrier for researchers to produce FAIR software.

We support developers in creating new packages, guiding them through the process of achieving best practices and FAIR compliance. At the same time, we hope to make MDAnalysis useful to a broader community.

How to develop an MDAKit?

We are producing tools for creating MDAKits to help developers and we are working on infrastructure to publicize MDAKits. Our work on MDAKits is an ongoing process but you can now get started creating your own MDAKit:

MDAKit project template

Our first tool is cookiecutter-mdakit, a cookiecutter template that generates a skeleton project implementing our recommended best practices. With cookiecutter installed, execute the following command inside the folder in which you want to create the skeleton repository:

cookiecutter gh:MDAnalysis/cookiecutter-mdakit

Follow the prompts or hit enter for the default options.

Then add your own code to the project. Add tests — you can extend the example tests in the template that show how to test MDAnalysis-based code. Commit and push your changes.

(The MDAKit cookiecutter is based off the Cookiecutter for Computational Molecular Sciences (CMS) Python Packages by Levi N. Naden and Jessica A. Nash from the Molecular Sciences Software Institute (MolSSI) and Daniel G. A. Smith of ENTOS. Thank you!)

Registering an MDAKit

If you want to register your MDAKit, create a pull request that adds a metadata entry (metadata.yaml) to MDAnalysis: MDAKits/mdakits/{YOUR_MDAKIT_NAME} (where you will also find a template to get you started). Your PR will be reviewed for compliance with the requirements (for now, see the white paper for specifics). Once registered, your MDAKit will be continuously tested.

Towards publication

The best practices that we encourage MDAKits to fulfill cover most of the contribution criteria for submissions to software-focused journals such as the Journal of Open Source Software (JOSS). We encourage MDAKits to consider submission to such a journal once they meet the required levels of best practices. We are working towards streamlining the submission process for JOSS.

Give us feedback!

We are looking for feedback from the community: please let us know what your thoughts are via our mailing list, Discord, or the MDAKits issue tracker:

  • As a user: What do you like or dislike about the MDAKits approach? Would you want to use an MDAKit?
  • As a developer: Would you be interested in creating an MDAKit? What should we do to make it easy for you?

Get in touch! MDAKits are new and we look forward to adapting the initial (v0.1.0!) approach based on what we hear from the community.

@IAlibay @jbarnoud @orbeckst @richardjgowers @fiona-naughton @lilyminium