Open Source Contributors: Who’s Missing--and Why?

Women and minority programmers make up only a small fraction of open source contributors.

Open source software has indisputably advanced the software industry as a whole in myriad ways. It has fostered faster innovation. It has helped enable new paradigms, like DevOps. It has made all sorts of important software programs, from Web browsers to video editing software, accessible to people who, in past decades, could not have afforded them. Yet, open source also exemplifies, and exacerbates, a major challenge for the software industry: achieving greater demographic diversity. When you look at open source contributors, you find that most of them look very much alike: white and male.

In fact, the open source space is even less diverse than the tech industry as a whole. And that's no mean feat, given how incredibly un-diverse tech companies in general tend to be.

That's a fascinating reality, and it bears some investigation for anyone who wants to understand the dynamics that determine which sorts of people are envisioning, designing and writing some of the most important software platforms today--from Firefox and Apache to Linux and Kubernetes.

Open Source Demographic Data

The open source ecosystem is a loosely defined community. No single organization represents or can speak for all of its members. Given these facts, it's not surprising that complete data about the demographics of open source contributors just doesn't exist.

Nonetheless, there are some ways to glean insights into open source contributors. Probably the best data source to date is a 2017 survey by GitHub. The company queried 5,500 randomly selected individuals who collectively represented about 3,800 open source projects.

Among other findings, the survey revealed that a full 95% of open source contributors who responded were men. Only about 16% identified as members of minority racial groups.

To put those figures in context, consider that women account for about 25% of the tech workforce overall, and more than 30% of tech workers are non-white.

Admittedly, the GitHub survey data is not perfect. It's a bit dated. GitHub's definitions of "minority" may not have been ideal. It represents only open source contributors who host their projects on GitHub.

Still, given how large the gaps are between the demographic trends that GitHub identified in open source and those in tech as a whole, it's hard to argue that open source's demographic challenges are not greater than those of the tech industry in general.

That conclusion would seem to be borne out by a simple look around the open source space. I can think of only one prominent leader of an open source organization who is a woman (Mitchell Baker, the CEO of the Mozilla Corporation and chairwoman of the Mozilla Foundation) and one who is not white (Kelsey Hightower, a key figure in the world of Kubernetes, and currently a developer advocate at Google). It's much easier to think of leading figures at closed-source software companies who come from more diverse backgrounds.

Open Source's Diversity Surprise

In some ways, these trends among open source contributors may seem unsurprising. It's not news that the tech space is mostly white and mostly male, and has been for decades.

Yet, the fact that open source is even less diverse than tech in general seems harder to explain. If anything, you might think open source would be more diverse. After all, in many cases, the demographic identity of people who contribute to open source projects is not even known to others within those projects, unless for some reason they volunteer it. No one knows your race or gender by looking at your GitHub profile.

For that reason, it would be hard to argue that active discrimination explains the demographic trends in open source. The lack of diversity at a company could be explained by hiring committees dismissing diverse candidates. But, in open source, there are no hiring committees or other gatekeeping bodies that have much insight into the demographic profile of contributors. You get judged on the quality of your code alone.

Thus, I suspect that the lack of diversity in open source isn't caused by discouragement or discrimination against women and minority groups who want to be involved. It's probably the result of fewer of these people wanting to contribute to open source in the first place.

That, too, may be surprising. In some ways, open source would seem like a great place to get started for programmers who worry that their race or gender would put them at a disadvantage at software companies. You might also think that because women and non-white people tend to possess less wealth, they'd be drawn to open source because it gives them access to so many great tools free of cost.

Why Isn't Open Source More Diverse?

And yet, that doesn't seem to be happening. One reason may be that open source has a reputation as a cutthroat, unsupportive place. Linus Torvalds, for example, has a long history of rather crude rants (among them a disparaging comment about the importance of diversity). Perhaps programmers who already feel vulnerable based on their race or gender hesitate to join communities where they worry about unchecked aggression or criticism.

On the other hand, it's not as if the rest of the tech industry is known for being warm and friendly, either. Yet women and minority programmers are less hesitant about working for closed-source companies.

The historical roots of open source, and the free software movement more broadly, may also be a factor. The first free software projects were closely tied to the community surrounding Unix, which was pretty white and male. Every single important figure I can think of in the history of free and open source software--like Richard Stallman, Linus Torvalds, Bruce Perens and Eric S. Raymond--has been white and male.

Perhaps the demographic trends that we see today in the open source space, then, are simply deeply entrenched norms that were set decades ago. You could argue that there was at least a little more diversity in the early tech industry as a whole, exemplified by figures like Ada Lovelace and Grace Hopper.

Economics may be a factor, too. I mentioned above that, in general, women and minority programmers have less wealth. That could draw them to open source projects that produce software free of cost. But it could also be a hindrance, given that people who have less money to begin with probably also have less free time to spend writing code that they give away for free to open source projects.

Coupled with the fact that many of the prominent white men in the open source space were quite well-off before they got involved in open source (Torvalds, who wrote the Linux kernel as a penniless college student, is an obvious exception), this is the most compelling explanation to me.

That said, I don't think any of these factors alone fully explains the demographics of open source contributors. It's a complicated trend, and there is no simple way to change it. Simply making more programmers aware of all of the opportunities in the open source space, and the ways they can help jump-start careers, may be one step forward. Giving more prominence and power to the women and minority programmers who do exist in open source projects may help, too.

Yet, even with efforts like these, I suspect it will take many years before open source manages to clear the very low hurdle of achieving as much diversity as the tech industry as a whole, let alone move toward true proportionality with society writ large.

