In a world where cybersecurity threats--not to mention software licensing violations--are steadily increasing in intensity, source composition analysis, or SCA, is becoming a vital risk management tool for more and more development teams.

Should you adopt SCA to help secure your code? Or is it overkill for your needs? The answer depends on several factors, such as the complexity of your software supply chain and how extensively you incorporate open source into your applications.

Here’s an overview of how source composition analysis works, and who stands to gain the most from it.

What Is Source Composition Analysis?

Source composition analysis, more commonly known as software composition analysis, is a process that automatically scans source code to identify the external components it contains.

In most cases, the components identified via SCA are modules, libraries, dependencies or code snippets that developers borrow from open source projects and incorporate into their own applications.

However, SCA is not strictly limited to identifying open source components within a codebase. When properly configured, SCA tools can discover any type of external component within source code--such as code that developers borrow from a partner organization whose source code is not publicly available, but is shared within a partner network.

SCA tools usually work by scanning source code and comparing its contents to databases of external code, dependencies and libraries. If they find code that resembles, imports or depends on code recorded in the database, they flag it as originating from an external source. These tools can usually also identify the specific source the code came from and any known risks associated with it.
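As a rough illustration of the scan-and-compare approach described above, here is a minimal Python sketch. It assumes the code being scanned is Python and that the "database" is a small in-memory dictionary. The component names, origins and advisory IDs in it are made up for illustration, and real SCA tools rely on far larger databases and more sophisticated matching.

```python
import ast

# Hypothetical database of known external components (illustrative data only).
KNOWN_COMPONENTS = {
    "fastjsonlib": {
        "origin": "hypothetical open source project",
        "license": "MIT",
        "advisories": [],
    },
    "oldcrypto": {
        "origin": "hypothetical open source project",
        "license": "GPL-3.0-only",
        "advisories": ["EXAMPLE-0001"],
    },
}


def find_imported_modules(source_code: str) -> set[str]:
    """Return the top-level names of modules imported by Python source code."""
    modules = set()
    for node in ast.walk(ast.parse(source_code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which point at in-house code.
            modules.add(node.module.split(".")[0])
    return modules


def scan(source_code: str) -> dict[str, dict]:
    """Map each imported module that appears in the database to its record."""
    found = find_imported_modules(source_code)
    return {name: KNOWN_COMPONENTS[name] for name in sorted(found) if name in KNOWN_COMPONENTS}


if __name__ == "__main__":
    sample = "import os\nimport oldcrypto.hashes\nfrom fastjsonlib import loads\n"
    for name, record in scan(sample).items():
        print(name, record["license"], record["advisories"])
```

This sketch only catches explicit imports. Code that was copied and pasted into the codebase would need a different kind of matching.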

Which Problems Does SCA Solve?

The main purpose of SCA is to help developers find risks that external components introduce into their applications.

SCA is important because developers frequently incorporate third-party code or dependencies into the applications they write. They do this because reusing existing code is easier and faster than writing everything from scratch. And, in many cases, it’s perfectly legal and ethical to add open source code to an application you are building, even if the application itself is otherwise based on proprietary, closed-source code that you write yourself.

The problem, though, is that incorporating third-party code into applications can subject companies to two main types of risks:

Security vulnerabilities: If you add an insecure library, module or other component to an application, your application could potentially be breached or exploited through the vulnerable code.
Licensing risks: When you use third-party code within your own application, you may need to follow certain licensing rules (such as making your source code available under the same license when you distribute the application, as the GPL requires, or crediting the original authors of the code). Failure to adhere to these requirements could lead to software licensing lawsuits.
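To make the two risk types concrete, here is a minimal Python sketch of how a check might report them for a single component. The Component class, the assess function and the set of licenses flagged for review are all assumptions made for this example. Which licenses warrant review is a policy decision that varies by organization.

```python
from dataclasses import dataclass, field

# Licenses this hypothetical policy flags for review (an assumption for illustration).
REVIEW_LICENSES = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}


@dataclass
class Component:
    name: str
    license: str
    advisories: list[str] = field(default_factory=list)


def assess(component: Component) -> list[str]:
    """Return security findings first, then any licensing finding."""
    findings = []
    for advisory in component.advisories:
        findings.append(f"security: {component.name} is affected by {advisory}")
    if component.license in REVIEW_LICENSES:
        findings.append(
            f"licensing: {component.name} uses {component.license}; "
            "review distribution obligations"
        )
    return findings


if __name__ == "__main__":
    # Hypothetical component and advisory ID.
    for finding in assess(Component("examplelib", "GPL-3.0-only", ["EXAMPLE-ADVISORY-1"])):
        print(finding)
```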

These risks are compounded by the fact that developers don’t always do a great job of keeping track of which third-party components they introduce into a codebase. A programmer may import a library, or even just copy and paste some code, into an application without noting where the code came from. As a result, developers may end up with security or licensing issues within their software that they’re not even aware of.

Matters become even more complicated in cases where developers borrow code from one source, but the code actually originated somewhere else. You may pull code from a GitHub repository thinking that the owner of the repository is the original author of the code, for instance. But, in reality, the repo owner could have copied the code from a different repository without noting that fact, leaving you blind to the code’s true origin.

Source composition analysis tools solve these problems by automatically detecting third-party components within a codebase and tracking them to their original source. Most SCA tools can also alert developers to any known security problems or strict licensing rules associated with external code.
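One way a tool could spot copied-and-pasted code and trace it to a source is to fingerprint small windows of normalized lines and look them up in an index of known code. The Python sketch below shows the idea under simplifying assumptions: the window size is arbitrary, the normalization only handles whitespace and "#" comment lines, and the upstream project name is hypothetical. Commercial tools likely use more robust matching than exact hashes.

```python
import hashlib

WINDOW = 3  # consecutive normalized lines per fingerprint (arbitrary choice)


def normalize(code: str) -> list[str]:
    """Collapse whitespace and drop blank lines and '#' comment lines."""
    lines = []
    for raw in code.splitlines():
        line = " ".join(raw.split())
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def fingerprints(code: str, window: int = WINDOW) -> set[str]:
    """Hash every run of `window` consecutive normalized lines."""
    lines = normalize(code)
    return {
        hashlib.sha256("\n".join(lines[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - window + 1)
    }


def build_index(known_sources: dict[str, str]) -> dict[str, str]:
    """Map each fingerprint to the first known source it was seen in."""
    index = {}
    for origin, code in known_sources.items():
        for fp in fingerprints(code):
            index.setdefault(fp, origin)
    return index


def find_origins(code: str, index: dict[str, str]) -> set[str]:
    """Return the known sources that share at least one fingerprint with `code`."""
    return {index[fp] for fp in fingerprints(code) if fp in index}


if __name__ == "__main__":
    upstream = "def add(a, b):\n    total = a + b\n    return total\n"
    index = build_index({"hypothetical-upstream-project": upstream})
    local = "x = 1\ndef add(a, b):\n        total = a + b\n        return total\n"
    print(find_origins(local, index))
```

Because the index records only the sources it was built from, a match points to the earliest known copy in the database, which is not necessarily the true original author.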

How New Is Source Composition Analysis?

SCA is not new. Vendors such as Black Duck Software (founded in 2002 and now owned by Synopsys) and WhiteSource (founded in 2011) have offered SCA tools for years. But SCA has gained greater attention in recent years for two main reasons.

One is the widespread reuse of open source code. Whereas it was historically rare to incorporate open source code into proprietary applications, 72% of companies now use open source code internally, according to the Linux Foundation. With so much open source floating around within corporate codebases, keeping track of security and licensing issues associated with that code has become a priority for companies.

The second factor is growing awareness of the risks associated with software supply chain vulnerabilities in the wake of attacks like the one involving SolarWinds. Although the SolarWinds incident didn’t involve open source code, it was nonetheless a reminder of how vulnerabilities within third-party software can turn into security issues for any organization that uses that software.

The Limitations of Source Composition Analysis

While SCA has become increasingly important, it’s hardly a comprehensive software security solution.

SCA can only identify risks associated with software components that can be traced to a known third-party source. SCA tools won’t detect vulnerabilities within original code that your developers write themselves. Nor can they discover vulnerable code from a proprietary source whose developers don’t publicly disclose known security flaws. (For that reason, SCA wouldn’t have helped victims of the SolarWinds supply chain breach, for instance, because SolarWinds’s source code is proprietary.)

SCA tools also won’t catch vulnerabilities that arise from the way software is configured or deployed. Insecure network configurations, IAM policies and so on are beyond the purview of SCA.

Who Needs SCA?

Whether your organization should incorporate SCA into its software delivery pipeline depends mostly on how extensively you use code from third-party sources.

Given that the majority of businesses use open source code internally today--some without fully knowing how or where they are doing so--most companies will benefit from SCA as a means of keeping track of potential security or licensing issues that arise from using third-party open source code.

However, companies that write all of their software from scratch, or that have explicit (and well enforced) policies against the use of open source code, don’t stand to gain much from SCA.

Neither do companies that have simple software supply chains. If all or most of the software that you use originates in-house, you don’t need SCA as much as a company that depends extensively on third-party vendors or open source projects.