Ken Thompson, the creator of the Unix operating system and the C programming language, gave his now-famous Turing Award lecture in 1984 on the topic of “Reflections on Trusting Trust”. In the lecture he said: “To what extent should one trust a statement that a program is free of Trojan horses? Perhaps it is more important to trust the people who wrote the software”. This holds even more true today than it did back then. There is no way to be certain that any piece of software we use today is free of trojan horses. We see almost daily occurrences of attackers compromising the software supply chain by publishing malicious Python or Node.js packages. The only way we can be certain a piece of software is not intentionally malicious is if we can identify its authors and hold them accountable when we see intentionally malicious behavior from it.
Today’s software ecosystem is much more complex than it was back in 1984. There are many more ways an attacker can inject malicious trojan horses into the software supply chain. Some of these are:
- Compromise a developer’s development environment (e.g., IDE plugins).
- Compromise the code repository and inject malicious code.
- Compromise a third-party source code package imported into the code.
- Compromise the build pipeline and inject malicious code.
- Compromise a third-party library linked during the build process.
- Compromise the artifact repository and inject a malicious library.
- Compromise the deployment pipeline and inject malicious code.
- Compromise the runtime environment and link/load a malicious library into a running binary.
In this post we will discuss how to protect each of these stages and how to maintain the integrity of the software we run. The guiding principle is that we can attribute every line of code to the person who wrote it, trace every build and deployment step back to the person who initiated it, and make these records tamper-proof. We will use the same signature pattern we outlined in the earlier blog post to achieve this.
Protecting the development environment
A lot of the development process happens on developers’ devices, and unfortunately endpoint compromise will continue to be the weak link in security. Apple has started enforcing that all binaries running on Mac devices be notarized by Apple (and signed by the publisher). This is the right step forward. Microsoft is a little behind here, but with Windows 11 it has also started preferring signed binaries. We will need to enforce these requirements more strictly, and extend them to installable plugins such as browser plugins and IDE plugins.
But ultimately we cannot be completely sure that an endpoint device is not compromised. We need additional controls to make sure the code that gets into code repositories is trusted code.
Protecting the code repositories
Protecting access to the code repositories is relatively easy: it just requires putting the code repository behind the IDWall so that only trusted users can access it. Making sure only trusted users are making changes to the repository requires signed commits, as described below. Administrative changes to the repository, such as access control changes or external webhooks, will also need to be tracked so that we can attribute those actions to the right person.
Git has become the most widely used version control system, and it lets committers sign their commits with a private key. We can use the same TPM-based keys to sign commits, so that we know a commit was signed on a trusted device. If we can link code signing with some kind of user-presence detection, for example by asking the user to touch a fingerprint scanner, we can be more certain that the code was submitted by the real person and not by a compromised endpoint.
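The sign-on-device, verify-on-server pattern can be sketched as follows. This is an illustrative toy, not Git’s actual signature format: an HMAC secret stands in for the TPM-resident asymmetric key, and the payload is a simplified stand-in for a real commit object.

```python
import hashlib
import hmac

# Placeholder for a TPM-resident private key. In practice the key never
# leaves the device; HMAC stands in here for an asymmetric signature.
DEVICE_KEY = b"tpm-resident-key-placeholder"

def sign_commit(commit_payload: bytes) -> str:
    """Hash the commit payload and sign the digest with the device key."""
    digest = hashlib.sha256(commit_payload).digest()
    return hmac.new(DEVICE_KEY, digest, hashlib.sha256).hexdigest()

def verify_commit(commit_payload: bytes, signature: str) -> bool:
    """Server-side check: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_commit(commit_payload), signature)

payload = b"tree abc123\nparent def456\nauthor alice\n\nFix login bug"
sig = sign_commit(payload)
assert verify_commit(payload, sig)             # untampered commit verifies
assert not verify_commit(payload + b"!", sig)  # any tampering breaks it
```

The key property is that the signature binds the commit contents to a key that only exists on a trusted device, so the server can reject any commit whose contents were altered after signing.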
Making sure that all software packages used in our code come from trusted sources is quite a hard problem to solve. For packages produced inside a company, we can put a reasonable review and artifact-signing process in place to make sure each package comes from a trusted source and has not been tampered with. But doing this for third-party source code, especially open source code, is hard and often controversial.
The open source ecosystem has flourished over the last few decades. Modern programming languages and frameworks such as Go, Python, Node.js, and Ruby have publicly hosted package repositories, and importing open source packages has become part of the routine developer workflow. This has dramatically improved developer productivity and cannot be taken away. But attackers routinely get malicious code into widely used software libraries, sometimes through simple techniques like typosquatting, sometimes by misleading developers into using a malicious library. Today we have no way of identifying these malicious actors and punishing them. Note that developers writing insecure code is perfectly fine, as long as the code is not intentionally malicious and does not contain hidden trojan horses. The same legal framework as in the real world should apply here: as long as the intent was not malicious, authors should not be liable for simple coding mistakes. We should set the bar for intentionally malicious behavior high enough that developers are not discouraged from contributing to open source software.
The only thing we need from the open source ecosystem to ensure the security of our software is the ability to identify the authors of the source code. We need to be able to associate the authors and publishers of open source code with their real-world identities, so that the authors of intentionally malicious code can be held accountable. In addition to open source libraries publishing their software bill of materials (SBOM), we need these SBOMs to be signed by the authors. This could be a two-step process: individual contributors sign the commits that went into a release, and the maintainers of the open source package certify the list of commits that went into it. The released package can then contain a signed manifest of the library’s contents.
The next step in ensuring the integrity of the software is to make sure the image we build contains only trusted code and libraries, and to make it tamper-proof by signing the image itself. We can follow a process very similar to the one outlined above: the build process certifies the contents of the image and the code commits that went into it, and publishes this as a signed image manifest.
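One way a build step could enforce this is to refuse to certify an image unless every commit in it is already trusted. The trusted-commit set and manifest fields below are hypothetical; in a real pipeline the set would be derived from verified commit signatures, and the resulting manifest would itself be signed before publishing.

```python
import hashlib

# Hypothetical set of commit ids whose signatures have already been verified.
TRUSTED_COMMITS = {"a1b2c3", "d4e5f6"}

def certify_build(commits, layers):
    """Refuse to certify an image built from any untrusted commit."""
    untrusted = [c for c in commits if c not in TRUSTED_COMMITS]
    if untrusted:
        raise ValueError(f"untrusted commits in build: {untrusted}")
    # The returned manifest would itself be signed before publishing.
    return {
        "commits": list(commits),
        "layer_digests": [hashlib.sha256(layer).hexdigest() for layer in layers],
    }

manifest = certify_build(["a1b2c3"], [b"layer-bytes"])
assert manifest["commits"] == ["a1b2c3"]
try:
    certify_build(["a1b2c3", "badc0de"], [b"layer-bytes"])
    assert False, "untrusted commit should have been rejected"
except ValueError:
    pass
```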
The final piece of the puzzle is to make sure that software deployed to a cloud service carries a signed manifest stating who deployed it, which image was used, and which person or team is responsible for it. Again, this needs to be a traceable chain of trust. For example, if the deployment happened from a CI/CD job, the deployment manifest needs to record which CI/CD workload deployed it, who triggered the job, and who authorized the deployment.
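Such a chain of trust can be sketched as a hash chain: each link names an actor and commits to the digest of the link before it, so rewriting any earlier step invalidates everything after it. The actor and job names below are hypothetical, and a production system would sign each link rather than just hash it.

```python
import hashlib
import json

def _digest(link):
    """Canonical digest of one link in the chain."""
    return hashlib.sha256(json.dumps(link, sort_keys=True).encode()).hexdigest()

def add_link(chain, actor, action):
    """Append a link that names the actor and commits to the previous link."""
    prev = _digest(chain[-1]) if chain else None
    chain.append({"actor": actor, "action": action, "prev": prev})
    return chain

def verify_chain(chain):
    """Each link must reference the digest of the link before it."""
    return all(chain[i]["prev"] == _digest(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
add_link(chain, "alice", "authorized deployment")        # hypothetical actors
add_link(chain, "ci-job-42", "triggered by alice")
add_link(chain, "deployer-workload", "deployed image")
assert verify_chain(chain)
chain[1]["actor"] = "mallory"    # rewriting history breaks the chain
assert not verify_chain(chain)
```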
The future of cybersecurity is verifiable: verifiable identities, verifiable software, verifiable data, verifiable chains of trust, and verifiable communication. This is an ongoing transition, and we are happy to play our part in it.
In this blog post, we have outlined how the protection aspect of cybersecurity is changing with the security 2.0 stack. That does not mean the detection and recovery aspects will become less important. Quite the contrary: they will be as important as ever. In fact, verifiable software and identities may help accelerate the detection and recovery processes.
Do you have thoughts or comments about how to evolve the cybersecurity stack? Please get in touch with us by emailing me at sukhesh at procyon dot ai.
Here are the links to other posts in this series: