Outreachy Report - 2022: Improve MDAnalysis by implementing type hinting

About Me

I am Uma Kadam, a Computer Science and Engineering undergraduate at the Indian Institute of Information Technology Guwahati. My interests lie primarily in ML and AI, as reflected in my research internship in NLP. Participating in numerous hackathons, some of which I won, allowed me to explore new technologies and domains in computer science. My involvement with Outreachy gave me a first-hand introduction to open source and set me on the path to a successful career in technology.

LinkedIn   GitHub

Outreachy

Outreachy is a 12+ week internship program in which contributors work with an open-source organization under the guidance of experienced mentors. By providing opportunities to work with participating organizations, Outreachy supports people from groups underrepresented in the technology sector. Through mentorship that builds technical skills and an inclusive community with no room for systemic bias or discrimination, it aims to help members of minority groups find their way into the tech industry.

Motivation

MDAnalysis is an object-oriented Python library for analyzing trajectory data from molecular dynamics (MD) simulations in many popular formats. Python's dynamic nature lets us develop with speed, flexibility, and ease of use, but if you are not careful, you may trade short-term expedience for a long-term loss of maintainability. Type hints, written with the help of the standard typing module, are one way to counter this.

While type hints and annotations indicate the appropriate types, they do not enforce them. By running a type checker such as mypy, I made sure that the types passed between functions match what the code expects. The annotated function signatures also make the code more readable and communicate its intent more clearly.
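To illustrate that hints are not enforced at runtime, here is a minimal sketch. The function double is a hypothetical example and is not part of MDAnalysis. It shows that Python stores the annotations as metadata but still accepts an argument of the "wrong" type, which is why a separate static checker is needed.

```python
from typing import get_type_hints


def double(radius: float) -> float:
    """Return twice the given radius (hypothetical example function)."""
    return radius * 2


# Intended usage: the argument matches the annotation.
ok = double(1.5)

# The annotation says float, but Python does not check it at runtime.
# A str is accepted because "str * int" is valid (string repetition).
# A static checker such as mypy would flag this call as a type error.
unchecked = double("ab")

# The hints are only stored as metadata on the function object.
hints = get_type_hints(double)

print(ok, unchecked, hints)
```

Running a type checker over the file before execution is what catches the second call; the interpreter alone does not.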

Introducing type annotations and type hints provided a multitude of benefits like:

Contributions made during Outreachy Project

The annotations are not all merged yet, and they do not cover the full module. However, the effort spent on type annotations gave us an idea of the challenges they pose for MDAnalysis and will lead to better code in the future. We have already identified corner cases that could have led to bugs.

Example:

    def search(self, atoms, radius, level='A'):

With the help of type hints, this is transformed to:

    def search(self, atoms: AtomGroup, radius: float, level: str = 'A') -> Optional[Union[AtomGroup, ResidueGroup, SegmentGroup]]:

Here the type hints on the input parameters of the function search tell us that atoms is an AtomGroup, radius is a float, and level is a string with the default value 'A'.

The return type of the function can be None, AtomGroup, ResidueGroup, or SegmentGroup. Here Optional indicates that the value can be either None or the type enclosed in its brackets, and Union indicates that the value can be any one of the types contained in its brackets.
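The way Optional and Union combine can be sketched with a small self-contained example. The three classes below are empty stand-ins rather than the real MDAnalysis classes, and pick_level, along with its mapping of the letters 'A', 'R' and 'S', is a hypothetical helper used only for illustration.

```python
from typing import Optional, Union


# Empty stand-in classes, NOT the real MDAnalysis implementations.
class AtomGroup: ...
class ResidueGroup: ...
class SegmentGroup: ...


# Optional[X] is shorthand for Union[X, None].
GroupOrNone = Optional[Union[AtomGroup, ResidueGroup, SegmentGroup]]


def pick_level(level: str = "A") -> GroupOrNone:
    """Return an empty group for the given level, or None if unknown.

    Hypothetical helper; the letter-to-class mapping is an assumption.
    """
    groups = {"A": AtomGroup, "R": ResidueGroup, "S": SegmentGroup}
    cls = groups.get(level)
    return cls() if cls is not None else None


result = pick_level("R")

# Because the return type includes None, callers should handle that
# case explicitly; a type checker will warn if they do not.
if result is None:
    print("no group for this level")
else:
    print(type(result).__name__)
```

Annotating the None case in the signature is what prompts callers, and the type checker, to handle an empty result instead of assuming a group is always returned.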

Conclusion

Outreachy provided an incredible learning opportunity for a newcomer to open source development like me. The internship was an exceptional experience, and I would like to extend my sincere gratitude to Outreachy, my mentors, and the MDAnalysis community for it. During my time working with the MDAnalysis library, I learned a great deal, not only about the technical side of the library but also about the rich set of policies and guidelines created by the MDAnalysis community and how they shape MDAnalysis' inner workings.