Part of any good backup strategy is to ensure a copy of your backup is stored in a secondary location, so that if there is a major outage (datacenter failure, office burns down, whatever) there is a copy of your data stored elsewhere. After all, what use is a backup if it gets destroyed at the same time as the original? A large enterprise may do cross-datacenter backups, or stream them to a “bunker”; a smaller business may physically transfer media to a storage location (in my first job mumble years ago, the finance director would take the weekly full-backup tapes to her house so we had at most one week of data loss).
A typical cloud engagement has a dual responsibility model. There’s stuff that can be considered “below the line” and is the responsibility of the cloud service provider (CSP), and there’s stuff “above the line”, which is the responsibility of the customer. Amazon have a good example for their IaaS. Where the line lives will depend on the type of engagement; the higher up the abstraction tree (IaaS->PaaS->SaaS), the more the CSP has responsibility.
In a previous blog entry I described some of the controls that are needed if you want to use a container as a VM. Essentially, if you want to use it as a VM then you must treat it as a VM. This means that all your containers should have the same baseline as your VM OS, the same configuration, the same security policies. Fortunately we can take a VM and convert it into a container.
In a lot of this blog I have been pushing for the use of containers as an “application execution environment”. You only put the minimal necessary stuff inside the container, treat them as immutable images, never log in to them… the sort of thing that’s perfect for 12 factor applications. However, there are other ways of using containers. The other main version is to treat a container as a light-weight VM. Sometimes this is called an “OS container” because you’ve got a complete OS (except the kernel) here, and you treat it as if it were an OS.
A phrase you might hear around cloud computing is “lift and shift”. In this model you effectively take your existing application and move it, wholesale, into a cloud environment such as Amazon EC2. There’s no re-architecting of the application; there’s no application redesign. This makes it very quick and very easy to move into the cloud. It’s not much different to the earlier p2v (physical to virtual) activity that companies performed when migrating to virtual servers (e.g. VMware ESX).
In this glorious new world I’ve been writing about, applications are non persistent. They spin up and are destroyed at will. They have no state in them. They can be rebuilt, scaled out, migrated, replaced and your application shouldn’t notice… if written properly! But applications are pointless if they don’t have data to work on. In traditional compute an app is associated with a machine (or set of machines). These machines have filesystems.
The core problem with a public cloud is “untrusted infrastructure”. We could get a VM from Amazon; that’s easy. What now? The hypervisor isn’t trusted (non-company staff access it and could use this to bypass OS controls). The storage isn’t trusted (non-company staff could access it). The network isn’t trusted (non-company…). So could we store Personally Identifiable Information in the cloud? Could a bank store your account data in a public cloud?