As an operator of a private cloud solution:
The operational procedures associated with managing a private cloud should include the following security functionality in relation to the resource pooling attribute of the private cloud:
The following sections describe in more detail how to provide this functionality in the private cloud.
In the private cloud, the monitoring that supports the event management and incident management processes must include monitoring for attacks on the infrastructure launched from the virtual machines hosted in the platform layer. Such attacks may be launched in an attempt to cause damage to the infrastructure or in an attempt to gain access to other services that may be hosted on the same physical device. An automated incident management response can then shut down the virtual machine that originated the attack and notify operators so that they can investigate the problem. An alternative approach would be to reallocate resources temporarily to keep the service running while attempting to fix the problem.
Note: This document is part of a collection of documents that comprise the Reference Architecture for Private Cloud document set. The Solution for Private Cloud is a community collaboration project.
Your operational procedures should ensure that host operating systems and software have security patches and updates applied in a timely manner to help mitigate the threat of any attack on the infrastructure from the hosted virtual environments (and elsewhere). These security updates should also include updates to computer BIOS, switch firmware, and virtualization environments.
In the private cloud, tenant applications and services can be hosted on any physical host device in the cloud, and load balancing mechanisms can dynamically move applications to other servers. Any operational controls over physical access to the data center for operators must assume that any host server could hold the most sensitive or business critical data or application.
You should ensure that you have operational procedures in place for the secure disposal of all physical hardware that may still contain data. In a private cloud, it will be difficult to track what data may have been stored on a device, so all hardware must be subject to the same rigorous disposal procedures. Although virtual machine images and virtual hard disks may reside in SAN storage, the servers that run the virtual machines may still cache information on local storage, or in some other form of persistent or volatile memory.
Operational processes and procedures (such as those that relate to service continuity, availability management, and incident management) must preserve and respect the authentication and authorization rules defined to control access to virtual cloud resources and to hosted applications and services.
For example, if access to an application's data in a virtual machine is not permitted for operators, then operators should not be able to access archived data from the same application. Similarly, access controls must be preserved if a tenant application moves to a different host or even to a different host in a different data center.
Although the infrastructure elements of the private cloud (such as the hypervisor) should ensure that virtual environments are isolated from each other, resource pooling opens up additional threats to the confidentiality, integrity and availability of data in virtual environments in the private cloud.
For example, an attacker with access to an application owned and managed by a tenant could exploit a weakness in the isolation provided by the hypervisor or virtual network infrastructure to gain access to another virtual machine on the same host. Therefore, the virtual machines in the private cloud must take steps to protect themselves, and operational procedures should ensure that the protection continues to be effective:
The allocation of pooled resources to tenants is handled automatically by the private cloud infrastructure and may be adjusted dynamically as part of the private cloud's load balancing function. Problem management often requires detailed information about the state of the environment in order to carry out root cause analysis and understand the consequences. Your logging solution must enable you to track which tenant applications were deployed on which physical servers at a particular time. This information will help you to track and contain any cross-virtual-machine attack or host-based attack that might have compromised a tenant's data.
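As an illustration of the tracking requirement above, the following minimal sketch assumes that the logging solution records each placement as an interval. The `Placement` record and the two query functions are hypothetical names chosen for this example and do not belong to any particular product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Placement:
    """One interval during which a tenant VM ran on a physical host (hypothetical schema)."""
    tenant: str
    vm: str
    host: str
    start: datetime
    end: Optional[datetime]  # None means the VM is still on this host


def covers(p: Placement, at: datetime) -> bool:
    """True if the placement interval contains the instant `at`."""
    return p.start <= at and (p.end is None or at < p.end)


def tenants_on_host(log: List[Placement], host: str, at: datetime) -> List[str]:
    """Tenants whose VMs were on `host` at time `at`, sorted and de-duplicated."""
    return sorted({p.tenant for p in log if p.host == host and covers(p, at)})


def hosts_for_vm(log: List[Placement], vm: str) -> List[str]:
    """Every host a VM has run on, in chronological order."""
    return [p.host for p in sorted(log, key=lambda p: p.start) if p.vm == vm]
```

With records of this kind, an investigator who knows that a host was compromised at a given time can list the tenants that shared it, and can follow a suspect virtual machine across the hosts it was moved between.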
One of the specific goals of monitoring in the private cloud should be to identify attempts to gain unauthorized access to a tenant's data. The attack could be:
Alternatively, the attack might be a denial-of-service attack that attempts to over-allocate new resources and empty the shared pool of resources.
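One way to watch for the over-allocation attack described above is to track how much of the shared pool each tenant has requested within a sliding time window. The sketch below is an assumption-laden example: the `AllocationMonitor` class, the abstract resource "units", and the share threshold are all hypothetical and would need to be mapped onto the metrics your platform actually exposes.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class AllocationMonitor:
    """Flags tenants whose recent allocations exceed a share of the pool (hypothetical)."""

    def __init__(self, pool_size: int, window_seconds: float, max_share: float):
        self.pool_size = pool_size
        self.window = window_seconds
        self.max_share = max_share
        self._events: Dict[str, Deque[Tuple[float, int]]] = {}

    def record(self, tenant: str, units: int, now: float) -> bool:
        """Record an allocation request; return True if it looks abusive."""
        events = self._events.setdefault(tenant, deque())
        events.append((now, units))
        # Drop requests that have aged out of the sliding window.
        while events and events[0][0] <= now - self.window:
            events.popleft()
        recent = sum(u for _, u in events)
        return recent > self.max_share * self.pool_size
```

A `True` result would be raised as an event for incident management rather than acted on blindly, because a legitimate burst in demand can look the same as an attack.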
Monitoring should attempt to detect such attacks before they succeed. Automated incident management processes should trigger a response to contain the attack and to notify the appropriate operations staff and the relevant tenants.
For example, if an attack is detected that originates from another virtual machine in the private cloud, the automated response should shut down that virtual machine, notify an operator, and notify the owner of that virtual machine. You should also ensure that sufficient log information is collected to understand what data might have been compromised if such an attack is detected only after it has succeeded. Identifying what constitutes sufficient logging information is a major design process in itself, as you do not want the attack to compromise the cloud environment, yet you also need enough information to help counter any repeated attacks.
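The automated response described above (shut down the originating virtual machine, notify an operator, notify the owner, and keep a record) might be sketched as follows. The `shut_down` and `notify` callables stand in for whatever hypervisor and messaging interfaces your environment provides. They are assumptions for this example, as is the `ops-on-call` recipient name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    source_vm: str
    description: str


@dataclass
class Responder:
    """Contains an attack that originates from a hosted virtual machine."""
    shut_down: Callable[[str], None]    # hypothetical hypervisor hook
    notify: Callable[[str, str], None]  # hypothetical (recipient, message) hook
    owners: Dict[str, str]              # VM name -> owner contact
    operator: str = "ops-on-call"       # hypothetical operator contact
    audit_log: List[str] = field(default_factory=list)

    def handle(self, alert: Alert) -> None:
        # Record the alert first, so evidence survives if a later step fails.
        self.audit_log.append(f"alert vm={alert.source_vm}: {alert.description}")
        self.shut_down(alert.source_vm)
        self.audit_log.append(f"shut down vm={alert.source_vm}")
        message = f"VM {alert.source_vm} was shut down: {alert.description}"
        self.notify(self.operator, message)
        owner = self.owners.get(alert.source_vm)
        if owner is not None:
            self.notify(owner, message)
```

Passing the hooks in as parameters keeps the response logic testable without a live hypervisor, which matters when you rehearse incident procedures.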
In certain cases, you might leave a virtual machine running as a honeypot to attempt to identify what the attacker is doing. Your cloud environment may also include dedicated honeypot virtual machines specifically to trap and track attacks.
Rapid, automated incident responses are necessary in case an attack manages to spread to multiple virtualized environments in the private cloud, compromising the isolation of other services and applications. However, you should be aware that a false-positive detection of an attack, in combination with an automated incident response, could shut down a number of tenant services unnecessarily.
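One possible way to limit the damage from a false positive, offered here as an assumption rather than as a recommendation from this document set, is to cap the number of shutdowns that may happen automatically within a time window and escalate anything beyond the cap to an operator. The `ShutdownBudget` class and its parameters are hypothetical.

```python
from collections import deque
from typing import Deque


class ShutdownBudget:
    """Allows at most `limit` automated shutdowns per `window_seconds` (hypothetical)."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._times: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        """Return True to act automatically, False to escalate to an operator."""
        # Forget shutdowns that happened before the current window.
        while self._times and self._times[0] <= now - self.window:
            self._times.popleft()
        if len(self._times) >= self.limit:
            return False
        self._times.append(now)
        return True
```

The trade-off is deliberate: a cap slows the response to a genuinely fast-spreading attack, so the limit and window need to be chosen with both risks in mind.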
All administrative access to the platform or guest operating system from operations staff and the owner of the virtual resource should be fully logged, auditable, and subject to role-based access controls.
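A minimal sketch of the requirement above: every administrative request is checked against a role and recorded whether or not it is permitted. The role names, the action names, and the `authorize` function are hypothetical placeholders for this example and are not taken from any specific product.

```python
from datetime import datetime, timezone
from typing import Dict, List, Set, Tuple

# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "operator": {"restart_vm", "apply_patch"},
    "vm_owner": {"restart_vm", "read_tenant_data"},
}

AuditEntry = Tuple[str, str, str, str, bool]  # (timestamp, user, role, action, allowed)


def authorize(user: str, role: str, action: str, audit_log: List[AuditEntry]) -> bool:
    """Check a request against the role's permissions and log the decision."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    timestamp = datetime.now(timezone.utc).isoformat()
    # Denied attempts are logged too, since they may indicate probing.
    audit_log.append((timestamp, user, role, action, allowed))
    return allowed
```

Logging the denied requests alongside the permitted ones is what makes the record auditable, because a reviewer can see what was attempted as well as what was done.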
In some scenarios, the operators and automated processes may not have access to the virtual environment, in which case responsibility for the security of the virtual machine lies entirely with the business unit owner. In this scenario, the SLA should specify what the owner must do to maintain the security of the environment, for example:
Tenant applications and services hosted in the cloud may take steps to ensure the confidentiality, integrity, and trustworthiness of their own data by using encryption technologies. Operational procedures must ensure that data encrypted by a tenant's application remains available and usable, in addition to maintaining its confidentiality, integrity, and trustworthiness. For example:
Although the design and management of software running in virtualized environments is not typically the responsibility of the cloud service provider in the IaaS and PaaS models, there may be recommended or mandated processes and procedures for the tenant to follow in a private cloud. You should audit and verify that tenants are complying with any mandatory processes in order to ensure the overall security of the cloud environment.
For example, tenants may be mandated to change their storage access keys on a regular basis. Because the software is owned and managed by the tenant, this process may not be easily automated. In this scenario, you should regularly audit the tenant to ensure that they are changing the keys.
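The key rotation audit described above could be as simple as comparing each tenant's last rotation date with the mandated maximum age. This sketch assumes that you can obtain a last-rotated timestamp per tenant. The `overdue_tenants` function and its inputs are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def overdue_tenants(
    last_rotated: Dict[str, datetime],
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """Return tenants whose storage access key is older than `max_age`."""
    return sorted(
        tenant
        for tenant, rotated in last_rotated.items()
        if now - rotated > max_age
    )
```

For instance, with a 90-day policy, a tenant whose key was last rotated in January would appear in a report run in June, while a tenant who rotated in May would not.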
Operational procedures such as those related to incident management and IT service continuity must also ensure that data continues to be protected from unauthorized access. For example:
All management operations, whether performed by the CSP or by the tenant, must be logged and auditable.
Operational procedures such as backups, planning for IT service continuity, and collection of monitoring data for problem analysis must all comply with any legal requirements that affect data storage and data privacy. One consequence of the resource pooling behavior in private clouds is the difficulty of identifying what data is stored in what location at what time. As a result, pooling may make it more difficult to verify compliance.
You should ensure that you implement a regular review of current industry and governmental regulations that affect your private or hybrid cloud environment. Changes to laws and regulatory requirements at state, country, or supranational level (such as the European Union) can all affect the standard operating procedures of your cloud service and may require adjustments to your SLA.