Saturday, May 18, 2019

A Birthday, Collaboration, and the Open Source Process

Yesterday I turned thirty, and this past month I got my first and second “real” pull requests accepted into the Astroquery module of Astropy.

If you don't understand what I just said, I'm going to need to do some explaining. Let's start with the concept of “open source” mentioned in the title: open source, as used in computing, refers to computer programs where the source code for the program is available somehow for inspection. An open-source program is one where anyone can come along and look at the underlying code, and usually (though it depends on the license) take it, modify it, and use it themselves. Typically it also involves an idea of open collaboration, where anyone can suggest improvements to the code for the benefit of all users.

A “pull request” is one such way to suggest an improvement, using the popular version control software Git (originally written by Linus Torvalds, also the creator of the original Linux kernel). The website GitHub.com hosts vast numbers of Git repositories (the name for a collection of all the source code for a project) and makes it easy to coordinate collaboration from many people around the world. A pull request is a request to the maintainer of a repository to merge (or “pull in”) some changes from another source.

Around a month and a half ago I started using the Astroquery module of the Astropy project (which is a collection of Python code for use in astronomy). The Astroquery module allows you to query various astronomical databases that don't have official APIs; I use it for searching for information about atomic transitions from the National Institute of Standards and Technology (NIST) Atomic Spectra Database (ASD). Anyway, I discovered that some of the information coming back from the database wasn't being parsed into the returned results, so (after a little experimentation) I made a one-line addition to my local copy of the code which made it work. I figured it might be of interest to other people, so I made a pull request to the maintainers of the package, and after going through the review process it got accepted!

This was more of a feature addition than anything, but a week or so later I discovered an actual bug in the handling of certain Unicode characters present in the database. (The dagger character [†] was being written as an HTML multi-character code which broke the fixed-width formatting that was being performed on the query results.) This required a little more detective work to figure out, and some back-and-forth with the package maintainers on what a good fix would look like, but I found a simple, effective fix and submitted a pull request for that as well. This time the process was slightly more involved, as I wrote an automated test to cover the situation and a change log entry for the issue I'd raised regarding the bug, but after another week or so this one got accepted as well.
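For the curious, here's a minimal sketch of why a multi-character HTML code can break fixed-width parsing. To be clear, this is not the actual Astroquery code or the fix that was merged: the column offsets, the sample row, and the `split_fixed_width` helper are all made up for illustration. The only real library call is `html.unescape` from Python's standard library, which is one possible way to turn such codes back into single characters.

```python
import html

# Hypothetical fixed-width layout: (start, end) character offsets per column.
# These offsets and the sample row below are invented for illustration only.
COLUMNS = [(0, 10), (10, 20)]


def split_fixed_width(line):
    """Slice a line into columns at fixed character offsets."""
    return [line[start:end].strip() for start, end in COLUMNS]


# A row where the dagger arrives as the 8-character entity "&dagger;"
# instead of the single character "†" the column layout expects.
raw_row = "656.28&dagger;   2.0e+07   "

# Sliced as-is, the extra characters push everything out of alignment.
broken = split_fixed_width(raw_row)

# Decoding the entity first restores the expected column positions.
fixed = split_fixed_width(html.unescape(raw_row))

print(broken)
print(fixed)
```

The point is just that one visible character was taking up eight character positions, so every column after it ended up shifted.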

I've long admired the idea of open source, of people around the world giving of their time and creativity to improve software freely available to everyone, and it's a great feeling to finally be part of it myself. Contributions to open source projects can also look good on a résumé (they show you can code and work as part of a team), so there are practical benefits too. I don't know what form future contributions might take, but I'd definitely like to keep contributing as my knowledge and skill allow. A hui hou!
