I didn't particularly enjoy my first introductions to open source:

Later I found out I wasn't alone. This tweet ended up in a conference talk about how contributing to open source is a hassle:

So what makes starting out in open source so difficult?

Anecdotes from friends and online chatter suggest that negative first experiences with open source center around it seeming either intimidating or overwhelming. This was the case with my first two introductions to open source.

What follows is not meant to call out any individual or project I worked with. These are unpleasant memories of possibly misinterpreted experiences that happened a few years ago, but I still happily follow both projects.

CyanogenMod: Intimidating

My daily driver in high school was an ancient, refurbished Galaxy S4 [1] that was beginning to show its age. It struggled to run Samsung's bloated fork of Android 5, so I installed an alternate Android fork called CyanogenMod [2]. CyanogenMod was an especially popular "stock" Android ROM, but it was far from being free from bugs. I ran into one of those bugs when I was trying out Google Cardboard.

CyanogenMod had an option to change the display DPI of your device, which I used because Android's UI by default wastes far too much space. This worked quite well, except for some reason it caused Google Cardboard to render VR incorrectly. Back then Cardboard was a relatively new and very uncommon thing [3]. I figured it would be unlikely for one of my device's CyanogenMod maintainers to find this bug, so I took it upon myself to report it.

I searched around, found CyanogenMod's JIRA issue tracker, and wrote a ticket with steps to reproduce the problem. I eagerly anticipated some sort of praise for finding the bug.

Unsurprisingly, that's not what I got.

The response I received was not content with just the reproduction steps. They asked for all sorts of details I didn't understand (what's a regression?) in a tone I perceived as hostile and belittling, whether that was the intent or not. I took this more personally than I should've, and it was enough to make me not want to touch open source contribution ever again...or so I thought.

MediaWiki: Overwhelming

A year later, inspired by a suggestion I read in Lifehacker [4], I put together a personal wiki powered by MediaWiki to keep track of various things. A feature I wanted it to have was a list of favorite pages on the homepage, without me having to manually edit the list every time I wanted to update it.

Luckily, Extension:Favorites could take care of this for me.
Unluckily, it was using deprecated APIs and was thus incompatible with new versions of MediaWiki.
Luckily, some people posted patching instructions on the talk page.
Unluckily, it was a manual process that I would have to go through every time I installed the extension.

I figured that if I didn't feel like manually patching the extension, nobody else would. I took it upon myself to once again attempt my first open source contribution. I soon found that the process for contributing to MediaWiki was more involved than I anticipated:

  1. Find the page documenting all this
  2. Install git
  3. Configure git
  4. Make a Wikimedia developer account
  5. Log in to Gerrit
  6. Generate an SSH key
  7. Add SSH key to Gerrit
  8. Add SSH key to git
  9. Install git-review
  10. Configure git-review to work with Gerrit
  11. Clone the repository you want to contribute to
  12. Set up git-review in that repository
  13. Make a branch
  14. Make the change you wanted to make
  15. Stage files
  16. Make commit
  17. Use git-review to push the change to Gerrit (instead of using git push)
  18. Something isn't working right, probably because you're using Windows
  19. Oh wait, maybe you should've filed a ticket in Phabricator, Wikimedia's issue tracker
  20. Think about giving up
  21. Find the instructions hidden at the bottom of the contributing documentation that explains how to use Gerrit's web interface
  22. Struggle through using Gerrit's web interface

Aside from Wikimedia's usage of Gerrit and git-review for code review [5], the above steps are fairly standard for an open source project. But back then I expected changing three lines of code to be much simpler, taking no longer than perhaps an hour. This frustration prompted the tweet beginning this blog post.

I had no experience with source control whatsoever, with my only programming experience coming from tiny personal projects or from classes I took. Neither of those prepared me for a 22-step process, and the number of tools I had to learn just to make a simple change overwhelmed me. In addition, I had to go back and forth in code review and make more changes to comply with MediaWiki's commit message and code conventions that I was not aware of beforehand.

Eventually I did manage to get all my changes merged in, mainly because I wanted to finish what I started. But the drawn-out, confusing process made me question my ability to contribute to the project. My last contribution to MediaWiki, for a variety of reasons, was only two weeks after I had started. [6]

Intermission: Portal

Everyone knows and loves the massively successful Portal, which has sold millions of copies and embedded itself in online culture. While it receives copious acclaim for its unique gameplay and story, I think Portal deserves some praise for its approach to player onboarding.

Instead of introducing players to all the game's mechanics at once, Portal's tutorial system is subtle and gradual. Players begin the game only able to pick up items and walk through portals they do not create. Later they are given the ability to control just one portal, then both. The game levels are designed progressively so that players usually only need to learn one new strategy per "test". Level hazards are also introduced gradually so as to not overwhelm the player.

Portal forgoes an explicit tutorial and instead chooses to integrate it into the game's plot itself. A 30-second tutorial could cover the same concepts, but it would feel artificial and "separate" from the game. The player would have a harder time retaining the information thrown at them and would be stuck in a sort of "rookie phase" until they got used to the game controls and mechanics. Portal's choice of a hands-on approach makes the game feel natural at every point in the game, not just after finally getting down the controls and game mechanics.

I think the ideal introduction to open source is facilitated in a similar way. [7]

Zope 3: The Video Game

I interned at NextThought this summer, and their backend makes heavy use of the Zope Component Architecture as well as other Zope 3 packages. [8] I entered the internship with a naïve aversion to external libraries, forged by my experiences as a solo hobby scripter. I especially disliked packages that were large, had many dependencies, or did not have completely detailed and up-to-date documentation. With over 370 GitHub repositories, Zope 3 initially was my mortal enemy.

Ironically, even though I liked Zope 3 far less than CyanogenMod or MediaWiki, it gave me a far better introduction to open source than the other two. I'm not sure if it was by chance or on purpose, but this introduction was gradual and effective.

Practice source control first [9]

My intern project this summer was to get zope.app.apidoc working with NextThought's backend. I was deathly afraid of changing apidoc's source code, so instead I worked on a bridge script that monkeypatched apidoc.

Despite my mentor's strong recommendations, I started the project not using any source control. My previous experiences with git when contributing to MediaWiki weren't all that great and I didn't think git was necessary for a one-person project. I worked out of a folder in my home directory and would occasionally send my mentor zip files over chat. Later I started uploading those zip files to Google Drive, which was hardly an improvement. After losing a large amount of progress because of my "working dirty", I decided to finally give git another try.

It is difficult to know how to share code with others if you don't even know how to share code with yourself.

git certainly has a learning curve. When I was trying to contribute to MediaWiki, this learning curve was yet another obstacle in my way. I wanted to learn the bare minimum needed to put in my change request, which made git esoteric and frustrating.

But using git for my own project allowed me to get acclimated to its syntax and usage and allowed me to see the benefits of multiple branches, release tags, etc. It's a really helpful tool to know in general, and it's never too early to learn if you plan on ever touching any code whatsoever.

Especially if you're me and need a tool like git to clean up your constant mistakes.

Don't take things personally

When I was working on my "apidoc bridge", I was working out of a repository on my personal GitHub account. I was actually given a repository in NextThought's GitHub organization with a copy of zope.app.apidoc's source code. But the description was "Private fork of zope.app.apidoc to prep changes prior to creating a public PR for them." and I wasn't planning on making a pull request, so I ignored the repository at first.

But after a few weeks I started wondering if I was "expected to" create a pull request for apidoc as part of my internship. On a whim I took a look at its one open issue.

As luck would have it, the monkeypatching code I had been working on for apidoc-bridge mostly solved that issue. So I made a fork of zope.app.apidoc's repository [10], transferred my changes, and learned how to submit a pull request on GitHub. I hit the green button to create my pull request and eagerly anticipated its review.

https://assets-cdn.github.com/images/modules/site/product-illo/img-clear-feedback.png

Big Scary Red Circle With X [11]

My pull request wasn't closed; the reviewer simply requested changes. But GitHub displays "changes requested" with a Big Scary Red Circle With X, so initially it felt like rejection. I took it personally and felt like my code wasn't good enough.

But then I decided to actually read the detailed feedback I received and was able to see that the reviewer was on my side. The comments weren't at all intended to attack me or my work, but instead to make my work better. The reviewer wanted to make sure that my pull request was as good as it could be, which is why he requested changes.

I think that's crucial to remember. Open source is ultimately a collaborative effort centering around projects that are bigger than the sum of their contributors. It's important to not get too attached to any code and to keep in mind that everything is done for the good of the project. It's nothing personal. It's something I missed completely when I filed my CyanogenMod issue.

The requested changes to my pull request taught me other important things about starting out in open source.

Read the contributing guidelines

All good projects have guidelines on code style, indentation preference, commit messages, etc. These guidelines are protections against impulsive programmers (like myself) who like to "fix" everything to be a certain way...often to the detriment of others and accomplishing nothing except making the commit log noisy or starting edit wars. Having a consistent code style also makes code far easier to read.

MediaWiki has a code style document I could have found easily had I bothered to search for it in the first place. Reading it would have saved me and my reviewers a lot of time when I was contributing my fixes.

The Zope Foundation (at least, as far as I knew [12]) didn't have a code style document for its projects, but they have the next best thing: existing code. There's rarely a reason to adopt a different style than what's already present in the code you're editing.

Respecting code style is an extension of not taking things personally. Consistency is more important than personal preference. [13]

Justify your changes

Most of my commit messages were a single line, and my pull request was submitted without a description. While it's nice to have one line to describe what is being changed, it doesn't explain to your reviewers why something is being changed.

In my case, the reviewers of my pull request maintain a lot of different projects, Zope or otherwise. There is no way they can track the intricate details of all of those projects in their heads, especially for something like apidoc that hadn't received much activity for a while. The reviewers did ask me questions about why I did certain things in my code, but I would have saved us a lot of time if I had just added those details to commit messages and my pull request description in the first place.

You'd comment your code, so why wouldn't you comment your commits? [14]
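As a rough illustration of the difference (a sketch in a throwaway repository, not the commits from the apidoc pull request; the file name, identity, and messages are all made up), git accepts -m more than once, and each one becomes its own paragraph of the commit message:

```shell
set -e

# Throwaway repository so the example does not touch any real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

echo "print('hello')" > example.py   # hypothetical file
git add example.py

# The first -m is the one-line summary of WHAT changed.
# The second -m becomes the body, which is where the WHY goes.
git commit -q \
  -m "Add example script" \
  -m "Reviewers cannot see what I was thinking, so the reasoning for the change goes here instead of staying in my head."

# Print the full message the way a reviewer would see it.
git log -1 --format=%B
```

The summary line still shows up on its own in short views of the history, while the body is there for anyone who wants the explanation.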

Write tests

zope.app.apidoc proudly had 100% test coverage until my pull request came along. I had manually verified to the best of my ability that my code worked, but I hadn't written an actual test for it. Frankly, I didn't want to. Tests are gross. Or at least not very fun to write.

However, people other than me are going to be working on apidoc. They won't be able to read my mind and know exactly what I did to test the code, and so future changes could break this feature without anyone knowing. That's why the biggest change requested to my pull request was the addition of tests to bring code coverage back to 100%.

Putting off writing tests can lead to problems in the future. As I was writing a test as requested for the new feature, I discovered that the way I implemented that feature actually broke most of the other tests. My code needed a rewrite in order to be testable, and procrastinating the writing of that test would have created more maintenance work in the future.

The purpose of software tests is to make it easy to identify what breaks, and when, if a new change breaks the code. Tests are only as useful as their test coverage, and test coverage begins with pull requests.

Stay organized

I can get pretty reckless when it comes to sketching out ideas in code. I have a continual habit of making quick "temporary" changes in my current working space that never get removed, and then making completely different changes in the same folder/branch. This breaks a lot of stuff. Even worse, I like to use commits like a reflexive save button without any sort of testing or organization, making it difficult to restore my repository to a point where everything wasn't terrifically broken. It's theoretically possible, but I don't want to sift through 50 commits to find the last one that worked.

It became a regular habit of mine to export patches of certain changes, delete my local repository and re-clone it, apply the patches, and force push to GitHub. Not a fun way to code.

https://jason.pureconcepts.net/images/git-commit-history.png

Don't do this, it will ruin your life and make maintainers hate you [15]

Commits need to represent a single cohesive "thought" or task, and consequently so should your pull requests. Initially I wanted to commit a pile of changes at once for my apidoc pull request. But to keep everything organized and easy to reference in the future, I was asked to take out the unrelated changes. These changes later became additional pull requests.
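One way to keep unrelated changes apart, even when they are already sitting in the same working directory, is to stage and commit them separately. This is only a minimal sketch in a throwaway repository; the file names, identity, and commit messages are invented for the example:

```shell
set -e

# Throwaway repository so the example does not touch any real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

# Two unrelated edits made in the same sitting (hypothetical files).
echo "fix" > bugfix.txt
echo "docs" > docs.txt

# Stage and commit each one on its own instead of adding everything at once,
# so each commit represents a single task.
git add bugfix.txt
git commit -q -m "Fix the bug"

git add docs.txt
git commit -q -m "Update the docs"

# Two small commits, newest first.
git log --oneline
```

Each commit can then be reviewed, reverted, or moved into its own pull request without dragging the other along.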

It's also important to use branches for your work. For one thing, they would've helped prevent the "git hell" I put myself into when I was trying to work on my current pull request and future pull requests on the master branch of my fork at the same time. One of my later apidoc pull requests had conflicts with the upstream repository. Had I been working in a separate branch for the pull request I could've just rebased on master and resolved conflicts that way...but I was working on master, and had to run a git reset.

My reviewer suggested that I first open an issue for the problem before creating a pull request, and then make a branch named after that issue to work in. Both seem to be widely regarded as best practices.
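A minimal sketch of that branch-per-issue workflow, under some loud assumptions: the issue number 42, the branch name, and the files are hypothetical, and a local master branch stands in for the upstream repository (with a real fork you would fetch from upstream first).

```shell
set -e

# Throwaway repository so the example does not touch any real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the initial branch "master" regardless of the local git default.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Work for the (hypothetical) issue 42 happens on its own branch.
git checkout -q -b issue-42
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature for issue 42"

# Meanwhile master moves on; this stands in for changes landing upstream.
git checkout -q master
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "Upstream change"

# Replay the issue branch on top of the updated master.
git checkout -q issue-42
git rebase -q master

# History is now linear: the issue commit sits on top of the upstream change.
git log --oneline
```

Because master itself never carries the in-progress work, it can always be updated cleanly, and any conflicts get resolved on the issue branch during the rebase.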

Expect the process

Modern software development has a lot of moving parts, and open source projects are no exception. There are a lot of tools to learn and things to get familiar with. I started my MediaWiki change request expecting to be able to drop a few lines of code in an hour and call it a day. But that's simply not possible with any modern organized software project.

Learning to be patient with myself and to take time to learn things the right way with apidoc (not that I was very good at that) led to a much better experience. The one pull request I wanted to make became four. The biggest pull request kept requiring new changes as my reviewers found new things to add to it. It morphed from a relatively small change to a more involved feature addition. That's just the way these things work, I guess.

Software evolves. Open source is a process. Expecting to spitball some changes and call it a day will lead to frustration, as evidenced by my first two introductions to open source. Not all of your contributions will be useful or well-received. That's just the way things go. But accepting and trusting the process will pay off in the end. Open source isn't just about the commits; it's about people of various backgrounds and locations coming together to volunteer time and effort to discuss and continually improve a project.

Overall, it can be addicting to be part of a project's incremental development and story. Open source's irresistible charm was able to draw me back, even after one or two bad experiences.

Footnotes

[1] Well technically it was two of them, since my first one died after 2 months of ownership.
[2] CyanogenMod is the predecessor to LineageOS, which was created in the aftermath of a lot of corporate drama in Cyanogen Inc.
[3] Come to think of it, it still is.
[4] The first-edition book, not the website, if that gives you an idea of how old this advice was.
[5] I still have a strong dislike for Gerrit. So do actual MediaWiki developers.
[6] The biggest reason being that I was still new to PHP. I still liked MediaWiki enough to edit their wiki about 1,900 times, so it's not like I hated the project.
[7] echo "Now you're thinking with portals" | sed 's/portals/open source/'
[8] Zope 3 is a separate project from Zope 2 and its latest version, Zope 4.
[9] Alternative heading: git gud.
[10] While I did have a private repository in NextThought's GitHub organization set aside for me to make a pull request, it was easier to work with a proper GitHub fork.
[11] Image taken from GitHub's features page.
[12] I later learned that the Zope Foundation uses PEP8, a common style guide for Python projects.
[13] For example, I have a strong dislike for MediaWiki's code conventions, especially its tab indentation and treatment of parentheses. But open source is about the project, not the individual, so there's no reason for me to force my personal preferences on it.
[14] You do comment your code, right?
[15] Image taken from Jason McCreary's "When to make a Git commit"