Is Generative AI the Next Big Threat to Open Source Software?

Most open source communities lack the resources to train generative AI algorithms effectively. There is one solution, but will it make open source less "open"?

Table of Contents:
How Generative AI Impacts Open Source
A New Problem for Open Source
Corporate Money and Generative AI
What's Next for Open Source Generative AI?

In many ways, times have never been better for open source software. Open source code is everywhere, enterprises have a strong preference for open source, and open source has become central to the digital economy.

Yet there's a rapidly maturing technology that has the potential to create a huge new set of challenges for open source: generative AI. As generative AI technologies — like the ones behind Copilot and ChatGPT — become increasingly prevalent, open source communities face a growing risk that they will lose influence within some of the most important sectors of the software economy.

Here's why generative AI poses an unprecedented threat to open source, and what open source communities can do in response. 

How Generative AI Impacts Open Source

The main reason generative AI threatens open source is not that the code behind most major generative AI tools is closed (although it is; solutions like ChatGPT are closed-source). Open source projects could readily write algorithms that emulate the ones behind those tools, and some are already doing so.

Instead, the problem is that most open source communities don't have the resources necessary to train generative AI algorithms effectively. Producing generative AI software requires more than code. It also requires the ability to collect and analyze massive amounts of data. To do that, you need massive amounts of computing power, which necessitates massive amounts of money — something most open source projects lack.

Companies that build closed-source generative AI don't face this challenge to the same degree, because they have deep pockets or venture capital to pay for AI training. Tools like ChatGPT were trained on enormous volumes of text collected from the internet, which was feasible because OpenAI, the developer of ChatGPT, has raised billions of dollars in funding.

Thus, if there is no open source alternative to ChatGPT, it will be because open source communities lack the resources necessary to perform AI training. Open source developers can produce the code behind generative AI, but code is only one of the two fundamental ingredients in modern generative AI technology; the other is training on massive amounts of data.

A New Problem for Open Source

In this respect, generative AI creates a fundamentally new problem for open source, one unlike the variety of other challenges it has already contended with.

Originally, open source developers had to prove that their projects could produce high-quality software that worked at least as well as closed-source alternatives. They achieved that by the 2000s, when open source platforms like Linux became widespread.

Then, with the rise of cloud computing, open source projects faced the challenge that cloud architectures undercut the freedoms that open source is supposed to ensure. Nonetheless, open source communities have managed to become very influential in the cloud; although the major public cloud platforms are mostly closed-source, they rely heavily on key open source technologies such as Kubernetes to deliver their services. And there are plenty of important open source cloud platforms. Open source has conquered the cloud.

But will open source conquer generative AI? I have my doubts. We'll see plenty of open source generative AI algorithms, but I'm not sure who — if anyone — is going to pay for the AI training that those algorithms need to go head-to-head with competing closed-source technology.

Corporate Money and Generative AI

There is one scenario where I can envision open source projects creating viable alternatives to closed-source generative AI technologies. It involves large businesses providing the funding or infrastructure that open source coders need to train AI models. A company like Google or IBM, for example, might decide to support an open source generative AI project by helping it complete training.

This approach could work in creating an open source alternative to tools like ChatGPT. The caveat, of course, is that it would allow big companies to wield outsize influence over the open source versions of generative AI technology.

That's a trend that's already happening in other open source projects; for example, Google has historically played a key role in Kubernetes development, which arguably gives Google influence over Kubernetes product direction and feature development, even though it's an open source project.

There's nothing inherently wrong with this, but it does raise questions about how "open" open source really is when large companies toss around resources as a way of influencing what gets developed and what doesn't. I worry that open source generative AI would lose much of its potential if it ends up under the heavy influence of certain companies, rather than being an organic, community-centered endeavor.

What's Next for Open Source Generative AI?

Maybe open source communities will find creative ways to work past the challenges posed by generative AI. Open source has proved surprisingly resilient in the past, and a lack of financial resources didn't prevent projects such as Linux from becoming massively successful.

Still, the training requirements of generative AI mean that open source projects focusing on this niche are operating in uncharted waters. They'll need to think strategically if they want to create usable generative AI tools without selling out to large businesses.

About the author

Christopher Tozzi is a technology analyst with subject matter expertise in cloud computing, application development, open source software, virtualization, containers and more. He also lectures at a major university in the Albany, New York, area. His book, “For Fun and Profit: A History of the Free and Open Source Software Revolution,” was published by MIT Press.