This is something near and dear to my heart! The great thing about container images is that software distribution is based on static assets. This enables scanners to give teams actionable data without being on every host. That is a net new capability, and I think it enables better security in organizations that adopt containers. And unlike "VM sprawl", container systems are generally introspectable via a cluster-level API like Kubernetes, so scanning doesn't require active agents on every node. Two things that have happened recently in this space:
- Quay.io[0] offers scanning as a standard feature on all accounts, including free open source accounts. This also includes notifications to external services like Slack. This is what it looks like when you scan an image[1].
- The Kubernetes community has started automating scans of all the containers it maintains, to ensure they are patched and bumped to the latest versions. A recent example[2].
The cool thing is that both of these systems use the open source Clair[3] project to gather data sources from all of the various distribution projects. This all comes back to why we feel automated updates of distributed systems are so critical, and why CoreOS continues to push these concepts forward in CoreOS Tectonic[4].
If you hit any issues with Clair feel free to file an issue; we have a lot of folks who have been helping maintain the project.
Some quick feedback on the post and your website: it says there is a quick command to scan my container images, but I couldn't find the command. I also signed up, but the confirmation URL was a 404, and the email came from "team" with the subject "Confirmation email".
In any case really happy to see more people digging into these problems and coming up with new solutions.
This is great research, but I think an important point is missed. It may come across as though these images are vulnerable because of some intrinsic property of using Docker; however, this is not the case. It is also important to point out that by adopting Docker this analysis actually becomes easier to do across an organization, and mitigation becomes easier as well.
I think another aspect that is missed is that just because you use a vulnerable image doesn't necessarily mean you are at risk of being compromised, regardless of what other security layers you employ. This gets into the practical realities of security operations.
Absolutely agree. I did see some bad practices in the Docker community that I expect to see elsewhere as well. Specifically: reliance on deprecated images and not updating images during build. Thoughts?
I didn't address the implications of software vulnerabilities with respect to other mitigation techniques, though, as it's far outside the scope of the article. I probably should at least add a second addendum. I'll work on this soon. Thanks!
Analysis is easier because you already have a running agent that you can remotely query about deployed software. With a regular OS you need to provide such an agent.
I don't know why it's easier to mitigate risks, though. Maybe just because it's easier to run the analysis.
> Analysis is easier because you already have a running agent that you can remotely query about deployed software.
Not sure I buy this. Sure, I can query the Docker daemon for what images are running, but that's not enough to tell me which images are vulnerable. I still need to build something to actually scan the images.
Also, on any Linux host, I don't need a daemon to tell me about deployed software: the package manager can do just that, and the tool used for scanning in this article appears to just query the package manager, which would work just as well on any Linux host outside of Docker.
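To make that concrete, here's a minimal sketch of that inventory step on a plain Linux host (the commands below are just examples for Debian/Ubuntu and RHEL/CentOS, not anything from the article's tooling):

    # List installed packages and versions straight from the package manager,
    # no Docker daemon involved.
    dpkg-query -W -f '${Package} ${Version}\n'        # Debian/Ubuntu
    rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n'    # RHEL/CentOS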
That's correct: vuls queries the package manager for installed packages, versions, and changelogs. It then compares the CVEs found in the changelogs against the NVD.
There are certainly flaws in this approach; it's one of the reasons we intend to support multiple scanners. We started with vuls because Clair wasn't released yet and we wanted to support more than containers.
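As a rough sketch of the general idea (this is not vuls' actual implementation; the package name is just an example, and it assumes a Debian/Ubuntu host with network access to fetch changelogs):

    # Pull the changelog for an installed package and extract CVE identifiers;
    # those IDs can then be looked up against the NVD.
    apt-get changelog openssl | grep -oE 'CVE-[0-9]{4}-[0-9]+' | sort -u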
> Sure, I can query the docker daemon for what images are running, but that's not enough to tell me which images are vulnerable.
If you can query what images are running, you can tie that to the list of deployed software. Then you can compare that list with a database of known vulnerabilities; obviously, you'd do the same if you were assessing the host OS without Docker. What's easier is that you already have an API that can be called remotely.
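A minimal sketch of what that looks like, assuming the daemon's API is reachable at tcp://$NODE:2375 ($NODE and the port are placeholders, you'd want TLS in practice, and this only handles Debian/Ubuntu-based images without a custom entrypoint):

    # Enumerate the images behind running containers, then list the distro
    # packages baked into each one.
    docker -H "tcp://$NODE:2375" ps --format '{{.Image}}' | sort -u |
    while read -r image; do
      echo "== $image =="
      docker -H "tcp://$NODE:2375" run --rm "$image" \
        dpkg-query -W -f '${Package} ${Version}\n'
    done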
> Also, on any linux host, I don't need a daemon to tell me about deployed software - the package manager can do just that
But you need to get to each of these hosts somehow and get the data out of the package manager so a report can be prepared. This is the part that makes it easier to assess what you have in the case of Docker. Then there is also the software that was not installed with the OS-supplied package system, because programmers somehow dislike those and work around them with virtualenv or npm-du-jour.
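For that last category the inventory has to come from the language-level tools instead; a couple of commonly used examples (assuming Python and Node projects on the host):

    pip freeze          # packages in the active Python environment/virtualenv
    npm ls --depth=0    # top-level packages of the Node project in the current directory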
> [...] the tool used for scanning in this article appears to just query the package manager, which would work just as well on any linux host outside of docker.
I haven't read the article, but most probably you're right.
The conversion of package version numbers to vulnerabilities is perilous and incredibly complicated. That's one of the most significant challenges that we want to solve, which is even more pressing considering how badly the CVE ecosystem is breaking down.
Note that an image containing vulnerable binaries is not the same thing as an exploitable container. A container derived from a full OS like Ubuntu will have many binaries to provide a standard environment, but most of them will never be touched by the running program. That year-old image might have a vulnerable Perl version, but nothing in the container even runs Perl, so it's a non-issue.
This is why many people can get away with a minimal base image like Alpine: a tiny busybox shell provides enough features to run the application while still supporting some manual debugging with docker exec. It also avoids false positives like these, letting you more quickly find precisely what you need to upgrade when a new OpenSSL vulnerability is announced.
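As a rough illustration of the difference in package surface (the tags are just examples and the exact counts will vary, but the gap is the point):

    docker run --rm alpine:3.4 apk info | wc -l        # installed packages in Alpine
    docker run --rm ubuntu:16.04 dpkg-query -W | wc -l # installed packages in Ubuntu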
(Disclaimer: I work on Google Container Engine / Kubernetes).
The only exception is when people have access to the underlying container, whether intended or not. Then those vulnerable binaries can lead to a vulnerable container.
This is also why the subjectivity in CVE rating is such a significant problem.
One thing that does get a mention, but only right at the bottom of the post, is using smaller base images (e.g. Alpine).
If you can, I'd recommend this as a good practice to reduce these kinds of problems. The fundamental fact is that if you don't have a library installed, you can't be affected by a vulnerability in it. So the smaller your image, the fewer possible avenues for shipping vulnerable libs you'll have, and the less time you'll have to spend rebuilding images with updated packages.
I don't know the answer to this, but I do know that Alpine has some really awesome stuff around vulnerabilities, and I would presume that they react to vulnerabilities more quickly.
However, I intend to validate this presumption in a future project.
I'm looking for a base image and this article helped me a lot. It seems the Debian base image is a good choice so far. Alpine is quite popular lately, but I'm afraid the musl library may cause some headaches in the future. Is Debian the way to go for production use? What about other alternatives like CentOS?
CentOS/RHEL have a very small footprint in the open source community, it seems. I was pretty surprised by this because they have such significant corporate backing, a lot of enterprise software is RHEL-only, and they may be the only Linux distribution that currently supports SCAP (required by FISMA for federal agencies).
In order, I would opt for: a binary image, Alpine, then Debian. There are other choices like CoreOS, FreeBSD, etc. if you are comfortable moving away from Linux.
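If it helps, a quick way to compare the footprint of candidate base images yourself (tags are only examples, and sizes change over time):

    docker pull alpine:3.4
    docker pull debian:jessie
    docker images --format '{{.Repository}}:{{.Tag}}  {{.Size}}'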
CoreOS, the distribution, is a Linux distribution. They also have a lot of container oriented tools. The distro itself is optimized for use in container management. It is also relatively lightweight for use as a container (but not as light as Alpine).
I will absolutely release some data. I intend to fully automate this research so that it is current whenever viewed as well.
Not sure about the state of CI/CD in the image-building process; I assume it varies wildly. Two of the major points I'll address in my next posts are deprecation in Docker repositories and the lines of a Dockerfile that matter for minimizing vulnerabilities.
To be clear, one of those lines relates to making sure you pull in upstream updates during image building. This is super important, as people seem to assume their base image will be current, and that is not always the case.
So that's another factor to look at to see if it's a pattern:
Do the images without problems run apt-get update && apt-get upgrade?
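In other words, the kind of line you'd hope to see in the image's Dockerfile, run via a RUN instruction (Debian/Ubuntu shown; the cleanup at the end is just a common convention):

    apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*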
And maybe there's an opportunity for a Chrome browser extension that can overlay an indicator when choosing a Docker image, to help pick one that uses best practices like that.
That's exactly what I'm trying to highlight. After you take that 'latest' image, if you're not applying updates regularly, you are vulnerable from almost day 1.
This also applies to most of the AWS, DigitalOcean, etc. images I have seen as well. I'll be writing more about how to mitigate this in the next article.
Is there a way for teams with production Docker deployments to easily experiment with this kind of scanning on their own infra to understand their own situation? Maybe worth writing up a quick description of how operators can do something like that.
Absolutely. Docker and Quay.io both offer scanning for repositories they host; there are open source options like vuls and Clair that are a bit more work to set up; and we have a free plan for up to 5 hosts and for open source projects and schools.
How do people currently scan their infrastructure to look for vulnerabilities? Do you have a dedicated team that handles this, or is security "everyone's job"?
I'd say that's very organization-specific. Personally I'd see the maturity curve being:
no one's doing it --> specialists are doing it --> everyone's doing it.
With the speed of modern development, ideally everyone should have a good handle on basic security practices, with a specialist team available for more niche requirements.
I did some market research before I started working on Federacy (which began with frustrations I encountered at MoPub/Twitter). It seems that very few companies below hundreds of employees and thousands of servers have a security specialist, and almost no one is running vulnerability analysis in a real way.
So really big companies will/do have teams of VA people; also, where they're in a regulated industry and subject to things like PCI, I'd expect to see that as well.
However, as you say, in the small-company space it's very hit or miss how much effort can be put into this kind of work.
The thing I'd say about services that do package vulnerability scanning is that they can be useful, but it's easy to get seduced by absolute-sounding numbers (e.g. "a CVSS 10, oh, that must be much worse than a 4").
All that's not to say there's no value in that kind of work; it's definitely a piece of the programme, but it's important to put it in the appropriate context :)
To add a bit of detail here, one of the most surprising things I found, and one I'm saving for my next post, is that 24% of recent vulnerabilities in the NVD have no rating, and that doesn't even include the ones that were never posted to the NVD at all.
On top of this, the rating systems used by the different vendors/sources of vulnerabilities are quite different, and, like you mentioned, there's the implicit subjectivity... it's a mess. But a solvable one! That's what I'm working on.
[0] https://blog.quay.io/quay-secscanner-clair1/
[1] https://quay.io/repository/philips/host-info?tag=latest&tab=...
[2] https://github.com/kubernetes/kubernetes/pull/42933
[3] https://github.com/coreos/clair
[4] https://coreos.com/tectonic