How to Defend Against Software Supply Chain Attacks
A software supply chain attack occurs when an attacker infiltrates your systems by getting malicious code into your applications, usually through third-party software you depend on, without your knowledge. So how do you defend against an attack you didn't know was taking place?
The Commoditisation of Software
To appreciate how easy it is for rogue software to get into your systems, it is necessary to understand that applications are built as many "layers", where an application developer only focuses on a few thin slices. For the rest, developers leverage the decades of work of other organisations to get a complete system.
For example, an e-commerce web application might integrate a payment gateway like Stripe, an HTTP server such as Tomcat from the Apache Software Foundation, and database management libraries from the Spring Framework. Writing every part of the software yourself would mean spending years on application development rather than weeks or months - and the result wouldn't necessarily be any better or more secure than what is readily available.
Even if you decided to write all of the code from scratch, you would still be using a coding language put together by a different company - such as Java by the Oracle Corporation - and this is never guaranteed to be completely free from vulnerabilities. You can carry on with this type of analysis all the way down to your computer's operating system, the bootloader, the internet service provider, and so on.
It is impossible to develop a complete system without standing on the shoulders of giants, one way or another.
The need to leverage existing work and technology in any new application has been understood for many years, and the software development community has created systems and tools to make this possible. Most notably, we develop small, reusable and targeted libraries of code that can be imported and used as dependencies in other projects, so there is no need to reinvent the wheel each time.
Popular libraries will be well-architected to perform a specific function very well and nothing more. Developers do not want to add any unwanted bloat to their projects as it makes the overall system larger, with more code to review. The most famous libraries will be downloaded and incorporated into millions of projects each month, often with a corporate sponsor taking responsibility for the design and security of the distributable package.
One result of packaging software into many small, reusable, single-purpose libraries is the commoditisation of most of the software stack - from communication protocols to database management to analysis functions and everything else. All that is left is for a development team to gather the packages they need and write the limited amount of business logic bespoke to their own project.
Although the advantages in speed and ease of development are obvious, they come with a risk: malicious code injected into any one of these reusable libraries ends up in every project that uses it.
The Distribution of Software
To see how malicious code can become part of an application, it helps to understand the process by which the aforementioned reusable libraries are downloaded and added to a project.
Essentially, vast, centralised repositories have been set up, maintained and made available, typically one per programming language or ecosystem. So, for example, there is the NPM centralised repository for JavaScript projects, Maven Central for Java projects, pub.dev for Dart projects, and so on. These repositories rely on individuals and organisations continuing to publish code for the benefit of the wider development community. To incentivise users to publish code, they make registering as straightforward as possible and generally apply few, if any, checks on your user account or coding ability.
This lack of assessment of what is being published is a key criticism of the centralisation of code distribution. Yet, it is also one of the core strengths of open-source development - you get huge swathes of valuable code for free, but you get it at your own risk.
The act of adding third-party code to your projects is usually as simple as adding a single line of text to a requirements file, such as a package.json file for JavaScript:
{
  "name": "my-amazing-system",
  "version": "1.0.0",
  "dependencies": {
    "third-party-dependency-1": "1.0.0",
    "third-party-dependency-2": "2.1.0"
  }
}
This file will set up a JavaScript project to pull down both third-party-dependency-1 (version 1.0.0) and third-party-dependency-2 (version 2.1.0) from the NPM central repository and add the two codebases to the project. Of course, these external dependencies could contain any arbitrary code, good or bad, helpful or malicious.
Three Supply Chain Attack Vectors
We now know that development teams need third-party code to create a system and that it is effortless to pull down and incorporate this third-party code into a project with just a single line of text. Taken together, these facts open up three ways for malicious code to be unintentionally merged into a codebase:
- Using a library that has had security issues since its creation
- Not downloading the library that was intended
- Updating a library that was safe but now contains unsafe code
The easiest way to incorporate dodgy code is to specify a dependency with security issues from the outset. For example, specifying a dependency in your project named unsafe-html-rendering should trigger an internal alarm bell: a library that renders HTML in an unsafe way should not be used for production purposes and may only be appropriate for private, personal, hacky projects. A less contrived example would be using a library like express.js, the popular Node.js web application framework, which contained various vulnerabilities early in its life, triggered by relatively technical and unexpected use cases when accepting specific HTTP requests. These vulnerabilities went unknown for many years until they were later patched, highlighting how even projects used by millions of developers can have issues that few people are aware of.
Another popular method of injecting malicious code comes from more active, nefarious actors: typosquatting. Just as domain name typosquatters purchase domain names with misspellings like "facebok.com" or "gogle.com" and divert users to unintended websites, attackers can do the same with the distribution of software packages. Researchers in 2018 discovered 12 malicious Python libraries uploaded to PyPI - the official Python distribution source - named "diango", "djago", "dajngo", and other misspellings of the official and hugely popular django framework. These variant packages were duplicates of the real django framework with the unfortunate addition of code that allowed hackers to remotely access and control any server where the compromised codebase was ultimately deployed.
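One way to catch this kind of mistake before installation is to compare each requested package name against a list of names the team actually trusts and flag near misses. The following is a minimal sketch of that idea using Levenshtein edit distance; the function names, the allowlist contents and the distance threshold of 2 are all assumptions chosen for illustration, not part of any package manager.

```javascript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions or substitutions needed to turn `a` into `b`.
function editDistance(a, b) {
  // prev[j] is the distance between the processed prefix of `a`
  // and the first j characters of `b`.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical allowlist; a real team would maintain its own.
const TRUSTED_PACKAGES = ["django", "requests", "numpy"];

// Returns the trusted name that `name` closely resembles, or null when
// `name` is itself trusted or is not close to anything on the list.
function possibleTyposquatOf(name, trusted = TRUSTED_PACKAGES, maxDistance = 2) {
  if (trusted.includes(name)) return null;
  for (const candidate of trusted) {
    if (editDistance(name, candidate) <= maxDistance) return candidate;
  }
  return null;
}

for (const name of ["django", "diango", "djago", "dajngo"]) {
  console.log(name, "->", possibleTyposquatOf(name));
}
```

A check like this is only a heuristic: it will miss lookalike names that are further away than the threshold and may flag legitimate packages whose names happen to be similar, so a flagged name is a prompt for a human to look, not a verdict.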
Finally, it is never beyond imagination that a third-party library that was historically always safe becomes unsafe due to an update. The update could include an unforeseen security issue, or it could contain malicious code planted deliberately, either by an author who has gone rogue or by an attacker who has hijacked the author's publishing account. Because dependencies are often declared as a range of acceptable versions rather than one exact version, a patch update to a dependency may automatically be pulled down as part of a team's normal software development lifecycle, and the team would not necessarily be aware of the potential risk of suddenly including malicious code.
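To make the automatic-update risk concrete, the sketch below scans the dependencies of a package.json-style object and reports every entry that is not pinned to a single exact version. The function name and the regular expression are assumptions for illustration, and the check is deliberately conservative: anything that is not a plain x.y.z version (optionally with a pre-release or build suffix) is treated as floating.

```javascript
// Matches one exact version such as "1.0.0" or "2.1.0-beta.1".
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Returns the dependencies whose version specifier can resolve to more
// than one published version (ranges, "^" and "~" prefixes, "*", tags).
function findFloatingDependencies(manifest) {
  const deps = manifest.dependencies ?? {};
  return Object.entries(deps)
    .filter(([, spec]) => !EXACT_VERSION.test(spec.trim()))
    .map(([name, spec]) => ({ name, spec }));
}

const manifest = {
  name: "my-amazing-system",
  version: "1.0.0",
  dependencies: {
    "third-party-dependency-1": "1.0.0",  // exact: only 1.0.0 is acceptable
    "third-party-dependency-2": "^2.1.0", // caret: any 2.x.y at or above 2.1.0
  },
};

// Reports third-party-dependency-2, because "^2.1.0" would also accept
// a later 2.x release published after this file was written.
console.log(findFloatingDependencies(manifest));
```

Pinning direct dependencies does not pin the dependencies of those dependencies. In the npm ecosystem, the package-lock.json file records the exact resolved version of every package in the tree, and installing with npm ci uses that file rather than re-resolving ranges.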
Defending Against Attack
The Cybersecurity and Infrastructure Security Agency documents how fully dealing with a supply chain attack after an incident has occurred is often challenging and requires considerable time and expense. Therefore, it is prudent to take appropriate mitigation and security steps before any possible attack occurs.
In the most general sense, the best defence is to know your suppliers by maintaining an inventory of them, to apply good policies when interacting with them, and to have robust and secure software development lifecycle processes. Suppliers include all individuals and organisations that have published open-source code freely available in the repositories above, as well as any third-party outsourcing companies and private vendors.
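For open-source suppliers, a first version of such an inventory can be generated from the project's own manifest. The sketch below assumes a package.json-style object and a hypothetical helper name, buildInventory; it lists only direct dependencies, so the packages those dependencies pull in themselves would need to come from a lockfile or a dedicated tool.

```javascript
// Builds a sorted list of direct third-party packages declared in a
// package.json-style object, noting which section each came from.
function buildInventory(manifest) {
  const sections = ["dependencies", "devDependencies"];
  const inventory = [];
  for (const section of sections) {
    for (const [name, spec] of Object.entries(manifest[section] ?? {})) {
      inventory.push({ name, spec, section });
    }
  }
  return inventory.sort((a, b) => a.name.localeCompare(b.name));
}

const inventory = buildInventory({
  name: "my-amazing-system",
  version: "1.0.0",
  dependencies: {
    "third-party-dependency-2": "2.1.0",
    "third-party-dependency-1": "1.0.0",
  },
  devDependencies: {
    "some-test-runner": "3.0.0", // hypothetical development-only package
  },
});

for (const item of inventory) {
  console.log(`${item.name}@${item.spec} (${item.section})`);
}
```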
More specifically, a comprehensive, secure software development process will incorporate security concerns from the outset and throughout every phase of development - not at the end as a review process. The complete secure development process will include vulnerability scans, static and dynamic code analysis, testing, code reviews and penetration testing - with the option to use tooling and automation for most of this.
Importantly, any practices and policies you have for secure development should also be applied by your suppliers, where possible, based on your contractual agreements.
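As an illustration of what an automated vulnerability scan does at its core, the sketch below compares installed versions against a list of advisories and reports packages that are older than the version in which an issue was fixed. Everything here is hypothetical: the advisory entries, the field names and the helper functions are invented for the example, and the version comparison handles only plain x.y.z versions. In practice, existing tooling such as npm audit would perform this check against a maintained advisory database.

```javascript
// Compares two plain "x.y.z" versions numerically.
// Returns -1 if a < b, 1 if a > b, and 0 if they are equal.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// Returns one finding per advisory that applies to an installed package
// whose version is older than the advisory's fixed version.
function findVulnerable(installed, advisories) {
  return advisories
    .filter(
      (adv) =>
        Object.hasOwn(installed, adv.name) &&
        compareVersions(installed[adv.name], adv.fixedIn) < 0
    )
    .map((adv) => ({
      name: adv.name,
      installed: installed[adv.name],
      fixedIn: adv.fixedIn,
      summary: adv.summary,
    }));
}

// Hypothetical data for illustration only.
const installed = {
  "third-party-dependency-1": "1.0.0",
  "third-party-dependency-2": "2.1.0",
};
const advisories = [
  { name: "third-party-dependency-1", fixedIn: "1.0.3", summary: "example issue" },
  { name: "third-party-dependency-2", fixedIn: "2.0.5", summary: "example issue" },
];

// Reports third-party-dependency-1 only: 1.0.0 predates the 1.0.3 fix,
// while 2.1.0 is already newer than 2.0.5.
console.log(findVulnerable(installed, advisories));
```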
To mitigate the damage of a supply chain attack, as well as many other attacks, the proper use of configuration and change management will help you audit the current state of systems and apply risk assessments to any changes. In addition, applying principles of least privilege and adopting a zero-trust attitude to any deployments, user accounts, and networks will minimise the damage a bad actor can do.
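For dependencies, one small piece of change management is making every addition, removal or version change visible so it can be risk-assessed before release. The sketch below compares the dependency maps from two revisions of a project; the function name diffDependencies and the shape of the result are assumptions for illustration.

```javascript
// Compares two name -> version maps and lists what was added,
// removed or updated between them, sorted by package name.
function diffDependencies(before, after) {
  const changes = [];
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const name of [...names].sort()) {
    const from = before[name];
    const to = after[name];
    if (from === undefined) {
      changes.push({ name, change: "added", to });
    } else if (to === undefined) {
      changes.push({ name, change: "removed", from });
    } else if (from !== to) {
      changes.push({ name, change: "updated", from, to });
    }
  }
  return changes;
}

const previousRelease = {
  "third-party-dependency-1": "1.0.0",
  "third-party-dependency-2": "2.1.0",
};
const candidateRelease = {
  "third-party-dependency-1": "1.0.1",
  "third-party-dependency-3": "0.4.0", // hypothetical new dependency
};

for (const c of diffDependencies(previousRelease, candidateRelease)) {
  console.log(c.change, c.name, c.from ?? "", c.to ?? "");
}
```

Running a comparison like this as part of code review or a release checklist gives reviewers a short list of exactly which third-party code changed, which is where a risk assessment can then be focused.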
Conclusion
It may appear reckless for an organisation to include seemingly random or arbitrary code snippets from various authors in their production systems, but this is the reality of modern software development. The trade-off between development time and effort on one side and security on the other is managed primarily by using trusted, popular third-party code with corporate sponsors like Google, Microsoft and Spring. With this approach, it might be reasonable to assume any downloaded code is useful and secure. However, it is always prudent to incorporate secure development practices into your software development lifecycle and apply strong security practices to your production environments to mitigate any possible security incident.