26 Apr 2018
We are happy to announce that MDAnalysis is hosting two GSoC
students for NumFOCUS this year, Ayush
Suhane (@ayushsuhane on GitHub) and Davide
Cruz (@davidercruz).
Ayush Suhane: Improve Distance Search Methods in MDAnalysis
Multiple MD codes can now easily handle millions of atoms, so a major
roadblock to analysing the resulting positions of every atom at every
timestep is the time needed to evaluate pairwise distances between
atoms. Because almost every operation requires the distance between
pairs of atoms, fast calculation of pairwise distances is of utmost
importance. Many basic analysis functions, such as radial distribution
functions and contact matrices, depend heavily on fast distance
evaluations. Apart from the naive approach to pairwise calculations,
which scales as \(\mathcal{O}(N^2)\), data structures such as KDTrees
and Octrees have been suggested for faster calculations, depending on
the requirements. Within MDAnalysis, two use cases have been identified
that appear in the majority of the analysis algorithms. The goal of the
project is to identify the data structure best suited to each use case
and to implement it in the MDAnalysis library, along with clear
documentation and test cases.
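To illustrate the difference between the naive approach and a spatial data structure, here is a minimal sketch in plain Python. It uses a uniform grid (a cell list) rather than a KDTree or Octree, simply because it is the shortest structure to write down. It ignores periodic boundaries, and the function names are made up for this example; this is not the code that will go into MDAnalysis.

```python
import itertools
import math
import random
from collections import defaultdict


def pairs_naive(points, cutoff):
    """All index pairs (i, j), i < j, closer than cutoff: O(N^2) distance checks."""
    return {
        (i, j)
        for (i, p), (j, q) in itertools.combinations(enumerate(points), 2)
        if math.dist(p, q) < cutoff
    }


def pairs_grid(points, cutoff):
    """Same result, but only points in the same or adjacent grid cells are compared."""
    # Bin every point into a cubic cell with edge length equal to the cutoff.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(math.floor(c / cutoff) for c in p)].append(i)

    found = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for cell, members in cells.items():
        for off in offsets:
            neighbour = tuple(c + o for c, o in zip(cell, off))
            for i in members:
                # .get() avoids creating empty cells while we iterate.
                for j in cells.get(neighbour, ()):
                    if i < j and math.dist(points[i], points[j]) < cutoff:
                        found.add((i, j))
    return found


# Hypothetical test data: 200 random points in a 10 x 10 x 10 box.
random.seed(0)
points = [tuple(random.uniform(0.0, 10.0) for _ in range(3)) for _ in range(200)]
assert pairs_grid(points, 1.5) == pairs_naive(points, 1.5)
```

Two points closer than the cutoff can differ by at most one cell index along each axis, so checking the 27 surrounding cells is enough. For roughly uniform densities the work then grows with the number of atoms rather than with its square.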
Ayush is a graduate student in Materials Engineering at UBC, Canada.
He is working with molecular dynamics simulations
for his Master’s thesis. He wishes to contribute to the open source
community by simplifying the complex analysis and visualization
involved in MD. During GSoC, he aims to use the opportunity to
learn from the already established open source contributors
and continue the tradition by becoming an active member of
the community. In his free time, he also likes to
read fiction novels and play computer games.
Ayush will describe his progress on his blog.
Davide's project is to implement trajectory transformations in the
MDAnalysis API that are applied on-the-fly when requested by the user.
This eliminates the need for multiple intermediate steps of modifying
and saving the trajectory and gives users a simpler and more efficient
workflow for simulation data analysis.
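The idea of on-the-fly transformations can be sketched in a few lines of plain Python. Everything here (the function names, the toy trajectory) is hypothetical and only illustrates the concept; the actual MDAnalysis API is what the project will design.

```python
def translate(vector):
    """Return a transformation that shifts every position by `vector`."""
    def apply(positions):
        return [tuple(c + d for c, d in zip(p, vector)) for p in positions]
    return apply


def center_on_origin(positions):
    """Shift positions so that their geometric centre sits at the origin."""
    n = len(positions)
    centre = [sum(p[k] for p in positions) / n for k in range(3)]
    return [tuple(c - m for c, m in zip(p, centre)) for p in positions]


def read_transformed(frames, *transformations):
    """Yield each frame after applying the transformations in order.

    `frames` is any iterable of position lists; nothing is written to disk.
    """
    for positions in frames:
        for transform in transformations:
            positions = transform(positions)
        yield positions


# Two toy frames with two atoms each (hypothetical data).
trajectory = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0)],
]
for frame in read_transformed(trajectory, center_on_origin, translate((0.0, 0.0, 5.0))):
    print(frame)
```

Because the transformations run as each frame is read, the analysis code sees the modified coordinates without an intermediate trajectory file ever being saved.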
Davide is currently in the last year of his PhD in Molecular Biosciences
at ITQB-NOVA in Lisbon, Portugal. For his thesis he is using MDAnalysis to
analyse the results of molecular dynamics simulations, and this GSoC
project is an opportunity to contribute to the community. He expects to
learn a lot about Python and software development during this summer.
Davide will describe his progress on his blog.
Other NumFOCUS students
NumFOCUS is hosting 45 students this year for several of their supported and
affiliated projects. You can find out about the other students here.
22 Apr 2018
MDAnalysis version 0.18.0 has been released.
This release brings various fixes and new features, and users should update with either pip install -U MDAnalysis or conda install -c conda-forge mdanalysis.
One exciting new feature is the addition of duecredit to keep track of what citations are appropriate.
Once you have written an analysis script (e.g., myanalysis.py) and have installed the duecredit package (pip install duecredit),
you can either set the environment variable DUECREDIT_ENABLE=yes and run your script with python myanalysis.py,
or run python -m duecredit myanalysis.py,
to be given a report of what citable software you have used.
You can then use the data written by duecredit to export the bibliography and have it ready to be imported into your reference manager.
We hope that this will allow all contributors of analysis packages within MDAnalysis to get properly cited
and we are working on retroactively adding all required citations to duecredit.
The AtomGroup.groupby method now supports grouping by multiple attributes.
For example, one could create a dictionary that allows a particular residue name and atom name to be quickly queried:
>>> grouped = ag.groupby(['resnames', 'names'])
>>> grouped['MET', 'CA']
<AtomGroup with 6 atoms>
When writing GRO files, previously all atoms would have their indices reset so that they ran sequentially from 1.
To preserve their original index, the reindex option has been added to the GROWriter.
For example:
>>> u = mda.Universe()
>>> u.atoms.write('out.gro', reindex=False)
or
>>> with mda.Writer('out.gro', reindex=False) as w:
...     w.write(u.atoms)
Gromacs users can benefit from a new feature when reading TPR files. Now, when the topology is read from a TPR file, the atoms have a moltype and a molnum attribute. The moltype attribute is the molecule type as defined in ITP files; the molnum attribute is the index of the molecule. These attributes can be accessed for an atom group using the plural form:
>>> u = mda.Universe(TPR, XTC)
>>> u.atoms.moltypes
>>> u.atoms.molnums
These attributes can be used in groupby:
>>> u.atoms.groupby('moltypes')
>>> u.atoms.groupby('molnums')
to provide access to all atoms of a specific molecule type or that are part of a particular molecule.
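Grouping by one or several attributes, as in the groupby examples above, amounts to building a dictionary keyed by the attribute values. The following pure-Python sketch shows the idea with a toy Atom record; the Atom type and the group_by function are hypothetical stand-ins and do not reflect how MDAnalysis implements AtomGroup.groupby.

```python
from collections import defaultdict, namedtuple

# Toy stand-in for an atom; not an MDAnalysis class.
Atom = namedtuple("Atom", ["index", "resname", "name"])


def group_by(atoms, attributes):
    """Map each unique combination of attribute values to the atoms that share it."""
    groups = defaultdict(list)
    for atom in atoms:
        key = tuple(getattr(atom, attr) for attr in attributes)
        groups[key].append(atom)
    return dict(groups)


atoms = [
    Atom(0, "MET", "N"),
    Atom(1, "MET", "CA"),
    Atom(2, "GLY", "CA"),
    Atom(3, "MET", "CA"),
]
grouped = group_by(atoms, ["resname", "name"])
print([atom.index for atom in grouped["MET", "CA"]])  # prints [1, 3]
```

Building the dictionary once and then looking up keys is what makes repeated queries quick, compared with re-running a selection for every residue name and atom name pair.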
The AtomGroup.split method can also work on molecules:
>>> u.atoms.split('molecule')
and will create a list of AtomGroup instances, one for each molecule.
For convenience, various Group classes have been moved to the top namespace (namely, AtomGroup, ResidueGroup, SegmentGroup):
import MDAnalysis as mda
u = mda.Universe(topology, trajectory)
# for creating AtomGroups from arrays of indices
ag = mda.AtomGroup([11, 15, 16], u)
# or for checking an input in a function:
def myfunction(thing):
    if not isinstance(thing, mda.AtomGroup):
        raise TypeError("myfunction requires AtomGroup")
And finally, this release includes fixes for many bugs.
These include
a smaller memory footprint when reading NetCDF trajectories,
better handling of time when reading DCD trajectories,
and support for Gromacs 2018 TPR files.
For more details see the CHANGELOG entry for release 0.18.0.
As ever, this release of MDAnalysis was the product of collaboration between researchers around the world, featuring the work of 12 different contributors.
We would especially like to welcome and thank our six new contributors:
Ayush Suhane,
Mateusz Bieniek,
Davide Cruz,
Navya Khare,
Nabarun Pal,
and
Johannes Zeman.
14 Feb 2018
MDAnalysis has been accepted as a sub-org of the NumFOCUS foundation
for Google Summer of Code 2018. If you are interested in working with us
this summer as a student, read the advice and links below and write to us on the
mailing list.
We are looking forward to all applications from interested students
(undergraduate and postgraduate).
The application window deadline is March 27, 2018 at 12:00 (MST). As
part of the application process you must familiarize yourself with Google
Summer of Code 2018. Apply as soon as possible.
Project Ideas
We have listed several possible projects for you to work on in our
wiki.
Alternatively, if you have your own idea about a potential project,
we’d love to work with you to develop this idea; please write to us on the
developer list to discuss it there.
You must meet our own requirements if you want to be a student with MDAnalysis
this year (read all the docs behind these links!). You must also meet the
eligibility criteria.
As a start to get familiar with MDAnalysis and open source development you
should follow these steps:
Complete the Tutorial
We have a tutorial explaining the basics of MDAnalysis. You should go through
the tutorial at least once to understand how MDAnalysis is used.
Introduce yourself to us
Introduce yourself on the mailing list. Tell us what you plan to work
on during the summer or what you have already done with MDAnalysis.
Close an issue of MDAnalysis
You must have at least one commit in the development branch of
MDAnalysis in order to be eligible, i.e., you must demonstrate that
you have been seriously engaged with the MDAnalysis project.
We have a list of easy bugs to work on in our issue tracker on
GitHub. We also appreciate it if you write more tests or update and improve
our documentation. To start developing for MDAnalysis, have a look at
our guide for developers and write to us on the
mailing list if you have more questions about setting up a
development environment.
— @kain88-de