I’ve spent a few posts talking about the ecosystem required to keep a container secure: hands-off automation, code provenance, and the like.
But a number of people have asked me about the technology. Mostly they talk about “docker” and the security concerns. I’ve been loath to talk about technology specifically because it changes. Yesterday the docker daemon ran as root; tomorrow it may not. Yesterday the kernel exposed a problem; tomorrow it won’t. I think that by focusing on the technology implementation you miss the bigger picture: the management of the ecosystem is more important than the technology itself.
But with this in mind, here are a few thoughts.
Containers share a kernel. The container technology tries to isolate each instance from the others. This means that all the interfaces the kernel exposes to the code form part of the attack surface. There are over 300 system calls. Each of them needs to ensure the isolation is maintained. And any new syscall added needs to maintain the isolation.
Containers make use of various kernel technologies to try and minimise the risk. Obvious techniques include having a separate network interface that’s only visible inside the container (let’s hope there’s no data leak between packets on different interfaces), having a separate view of the filesystem (chroot on steroids), a separate view of the process space and so on.
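To make those "separate views" a little more concrete, here is a minimal sketch that lists the namespaces a process belongs to. It assumes a Linux host with `/proc` mounted; the function name `namespace_ids` is my own invention for illustration. Two processes that show the same identifier for, say, `net` share a network namespace; a containerised process would normally show different identifiers from the host.

```python
import os

# Namespace types commonly used by container runtimes.
NAMESPACES = ("mnt", "net", "pid", "uts", "ipc", "user")

def namespace_ids(pid="self"):
    """Return {namespace: identifier} for a process, read from /proc.

    Each /proc/<pid>/ns/<name> entry is a symlink whose target looks
    like 'net:[4026531992]'. A value of None means the entry could not
    be read (not Linux, no permission, or namespace type unsupported).
    """
    ids = {}
    for name in NAMESPACES:
        try:
            ids[name] = os.readlink(f"/proc/{pid}/ns/{name}")
        except OSError:
            ids[name] = None
    return ids

if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(name, ident)
```

Running it once on the host and once inside a container, then comparing the output, is a quick way to see which views are actually separated.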
In addition, with later kernels, user and group ids can be mapped inside containers. User zero inside a container need not match user zero outside of it.
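As a rough illustration of that mapping: the kernel exposes it in `/proc/<pid>/uid_map`, where each line holds three numbers (first id inside the namespace, first id outside, length of the range). The sketch below translates an id as seen inside the container into the id the host sees. The helper names and the sample mapping (`0 100000 65536`) are hypothetical values I picked for the example, not something any particular runtime guarantees.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        ranges.append((inside, outside, length))
    return ranges

def to_host_id(container_id, ranges):
    """Translate an id seen inside the container to the id on the host.

    Returns None if the id falls outside every mapped range.
    """
    for inside, outside, length in ranges:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None

# Hypothetical mapping: container ids 0..65535 map to host ids 100000..165535.
sample = "         0     100000      65536\n"
ranges = parse_id_map(sample)
print(to_host_id(0, ranges))      # container root as seen by the host
print(to_host_id(70000, ranges))  # not mapped
```

With a mapping like this, "root" in the container is just an unprivileged user as far as the host is concerned.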
Then we can start to look at AppArmor and SELinux to add to the security: Mandatory Access Controls (MAC). If you are rolling your own solution then you might also want to look at something like Computer Associates’ “Control Minder” (previously known as eTrust Access Control, or even SEOS; it’s changed names a lot). This is a highly flexible MAC system; a lot more flexible than SELinux. But not cheap :-)
Now a clever one: seccomp. This lets you remove the ability to use specific syscalls. If code can’t call them then it can’t exploit any isolation bugs in them; basically it reduces the attack surface.
Use of capabilities inside the container may allow some actions to be done without needing full root (e.g. “ping”). Ensure you drop unwanted capabilities before starting your app.
Of course these are all kernel-oriented, and so themselves form part of the attack surface. Older Android machines have had kernel-level exploits due to SELinux bugs!
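Here is a small sketch of what "allow only these syscalls" can look like in practice. It builds a deny-by-default profile in the JSON layout that Docker accepts for seccomp profiles, as far as I know; the exact field names have varied between Docker releases (older ones used a singular `name` per entry), so treat the layout as an assumption and check it against your version. The function name and the tiny allow list are purely illustrative; a real application needs far more syscalls than this.

```python
import json

def build_seccomp_profile(allowed_syscalls):
    """Build a deny-by-default seccomp profile as a plain dict.

    Any syscall not named in the allow list fails with an error
    (SCMP_ACT_ERRNO) instead of reaching the kernel code behind it.
    """
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }

# Illustrative only: far too small for a real program.
profile = build_seccomp_profile(["read", "write", "exit_group", "read"])
print(json.dumps(profile, indent=2))
```

The resulting JSON would be saved to a file and handed to the runtime, for example via Docker's `--security-opt seccomp=...` option.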
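To check what a process has actually been left with, you can decode the `CapEff` bitmask from `/proc/<pid>/status`. The sketch below does that for the first 32 capability bits, using the numbering from the Linux capability headers; newer capabilities above bit 31 are left out to keep it short, and the function names are my own. It's a way to verify your drops took effect, not a replacement for a proper tool such as `capsh`.

```python
# Capability names indexed by bit number (bits 0..31 only).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
]

def decode_caps(mask):
    """Return the names of the capabilities set in a bitmask."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]

def effective_caps(status_text):
    """Extract and decode the CapEff line from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_caps(int(line.split()[1], 16))
    return []

# A process holding only the capability that "ping" needs for raw sockets.
print(effective_caps("Name:\tping\nCapEff:\t0000000000002000\n"))
```

If the list that comes back is longer than what your app needs, that's the cue to drop more before starting it.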
You can see that there’s a lot to juggle and keep in mind. It’s a full time job. Unless you have a specific requirement to build your solution, I’d avoid it and use something more “off the shelf”. After all, how many people build their own OS?
By using something common (Docker, Rocket, Cloud Foundry, etc) you’ll have the advantage of a lot of developers constantly improving the product. Docker 1.10 makes use of many of the tools listed above; it works with seccomp, SELinux, AppArmor. It deals with namespaces…
Don’t get too hung up on the technology implementation. Be aware of what you’re using (e.g. docker daemon runs as root) and the best practices around it. There are hundreds of pages on the web about that!
Pick a technology stack that meets your use case (Docker? Rocket? Mesosphere? Cloud Foundry? Google Function? Amazon Lambda?) and maintain it like you’d maintain any vendor product.
You’ll gain a bigger win by focusing on your processes and automation. Which you need to do, anyway, to ensure the content is secure. And the nice part of that is by fixing this part you can scale your solution from micro-service containers all the way up to VMs or even physical server builds. Hands-off automation all the way!