The latest Grizzly offering represents the seventh major release of OpenStack, and it just goes to show the power of Open Source and what can be achieved when we work together. Grizzly is a significant milestone for the project for many reasons: it’s being seen as a stable enterprise platform, and adoption within the industry is growing rapidly. This blog post aims to detail some of the latest additions that the Grizzly release brought to OpenStack, and I’ll attempt to explain why they are important.
Firstly, let’s look at Nova. Nova provides the compute resources for an OpenStack-based cloud. It emulates a lot of the functionality provided by Amazon’s EC2 and is responsible for scheduling and managing the lifecycle of running instances. An important new feature is the ability to provision physical resources, and I don’t just mean Linux containers; I’m talking about entire physical machines, removing the requirement for hypervisor integration. There are some limitations with this approach, predominantly around networking, but it’s still early days. Being able to scale to massive quantities is a common goal across the OpenStack project, and Nova is one of the first components to start tackling the big scalability problems that the largest OpenStack clouds are starting to hit.
One technology that was previewed in the Folsom release and is now more comprehensive in Grizzly is the concept of zones (think AWS Availability Zones). Hosts can be grouped into these zones, and end-users are permitted to list the zones and select a specific one to deploy their instances into. A typical use case is availability: an OpenStack cloud may span multiple datacenters, and availability zones allow users to deploy their infrastructure across them, thereby introducing fault tolerance. Another concept, which sounds very similar at face value, is host aggregates. Like availability zones, they allow you to group a set of hosts, but instead of grouping for availability we group based on a common feature. For example, all hosts within an aggregate might have solid state disks (remember that ephemeral instance storage runs on local disk). A flavour (or ‘instance size/type’) can then be created that references this aggregate, so that end-users can be sure that they are exploiting the common feature.
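To make the difference between the two groupings concrete, here is a minimal Python sketch. It is not Nova's scheduler code; the class names, the `ssd` metadata key and the host names are all made up for illustration. The user picks the zone, while the flavour silently narrows the choice to hosts in a matching aggregate.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    zone: str                      # availability zone the host sits in


@dataclass
class Aggregate:
    name: str
    hosts: set                     # names of the hosts in this aggregate
    metadata: dict = field(default_factory=dict)


@dataclass
class Flavour:
    name: str
    required: dict = field(default_factory=dict)  # metadata an aggregate must advertise


def candidate_hosts(hosts, aggregates, flavour, zone=None):
    """Return names of hosts matching the user's zone and the flavour's requirements."""
    result = []
    for host in hosts:
        if zone is not None and host.zone != zone:
            continue
        # Merge the metadata of every aggregate this host belongs to.
        advertised = {}
        for agg in aggregates:
            if host.name in agg.hosts:
                advertised.update(agg.metadata)
        if all(advertised.get(k) == v for k, v in flavour.required.items()):
            result.append(host.name)
    return result


# Hypothetical layout: two datacenters, SSDs in node2 and node3 only.
hosts = [Host("node1", "dc-east"), Host("node2", "dc-east"), Host("node3", "dc-west")]
aggregates = [Aggregate("fast-disk", {"node2", "node3"}, {"ssd": "true"})]
ssd_flavour = Flavour("m1.ssd", {"ssd": "true"})

print(candidate_hosts(hosts, aggregates, ssd_flavour))             # SSD hosts in any zone
print(candidate_hosts(hosts, aggregates, ssd_flavour, "dc-east"))  # SSD hosts in dc-east
```

Note that the end-user never sees the aggregate itself in this model; they only choose a flavour and, optionally, a zone.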
In addition, brand new to Grizzly is the concept of Nova Cells. One of the biggest limitations to scale is the set of dependencies within Nova. For example, within a cluster there’s a shared database, message queue and set of schedulers; whilst we can load balance and scale these out, there are technical and physical limits inherent to this design. Nova Cells attempts to create multiple smaller ‘clouds’ within a larger OpenStack environment, each providing its own database, message queue and scheduler set, but all ‘reporting’ to one global API in a tree-like structure. The problem here is that only Nova supports the Cells implementation; we’re going to need to solve these scaling limitations for the rest of the components too.
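The tree-like structure is easier to picture with a toy model. The sketch below is my own simplification, not the Cells implementation: each `Cell` keeps its own instance records (standing in for its private database), leaf cells report free capacity upwards, and the top-level cell routes each boot request down to the emptiest child. The capacity-based routing rule is an assumption chosen to keep the example short.

```python
class Cell:
    """One cell: its own instance records plus optional child cells."""

    def __init__(self, name, capacity=0, children=None):
        self.name = name
        self.capacity = capacity          # free slots; only meaningful on leaf cells
        self.children = children or []
        self.instances = {}               # per-cell stand-in for a database

    def free_capacity(self):
        """Capacity reported upwards: a leaf's own, or the sum of its children."""
        if not self.children:
            return self.capacity
        return sum(child.free_capacity() for child in self.children)

    def boot(self, instance_id):
        """Route a boot request down the tree to the child with the most free capacity."""
        if not self.children:
            if self.capacity <= 0:
                raise RuntimeError(f"cell {self.name} is full")
            self.capacity -= 1
            self.instances[instance_id] = "ACTIVE"
            return self.name
        best = max(self.children, key=lambda c: c.free_capacity())
        return best.boot(instance_id)


# Hypothetical deployment: one API cell fronting two child cells.
api = Cell("api", children=[
    Cell("cell-a", capacity=2),
    Cell("cell-b", capacity=3),
])

placements = [api.boot(f"vm-{i}") for i in range(4)]
print(placements)
print(api.free_capacity())
```

The point to notice is that the API cell holds no instance records of its own; the state lives in the children, which is what lets each one keep a smaller database and message queue.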
Early releases of OpenStack relied on the compute nodes themselves having direct database access to update instance information. This posed a security threat, as there was concern that a compromised hypervisor would have wider access to the OpenStack environment. A new service known as nova-conductor provides a method of isolating database access from the rest of the services. Not only does this alleviate the security concerns, it also helps address the scalability problems: rather than having the database accessed by hundreds (or thousands!) of nodes, a smaller number of database workers can be utilised and load-balanced across.
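As a rough illustration of the pattern (not nova-conductor's actual code), the sketch below gives only the conductor a database handle, while compute nodes can do nothing but put requests on a queue. The class names are hypothetical, and the allow-list and host check are an illustrative policy of my own, added to show how a broker could limit what a compromised node is able to change.

```python
import queue


class Database:
    """Stand-in for the shared Nova database."""

    def __init__(self):
        self.instances = {"vm-1": {"host": "node1", "state": "BUILD"}}


class Conductor:
    """The only component holding a database handle."""

    ALLOWED_FIELDS = {"state"}   # illustrative allow-list, an assumption

    def __init__(self, db, requests):
        self._db = db
        self._requests = requests

    def process_pending(self):
        """Apply queued updates, skipping any that fail the checks; return how many applied."""
        applied = 0
        while not self._requests.empty():
            host, instance_id, updates = self._requests.get()
            record = self._db.instances.get(instance_id)
            if record is None or record["host"] != host:
                continue                      # unknown instance, or not this node's instance
            if not set(updates) <= self.ALLOWED_FIELDS:
                continue                      # tries to touch a field outside the allow-list
            record.update(updates)
            applied += 1
        return applied


class ComputeNode:
    """Has no database access; it can only send requests over the queue."""

    def __init__(self, host, requests):
        self.host = host
        self._requests = requests

    def report(self, instance_id, **updates):
        self._requests.put((self.host, instance_id, updates))


db = Database()
bus = queue.Queue()
conductor = Conductor(db, bus)

ComputeNode("node1", bus).report("vm-1", state="ACTIVE")
ComputeNode("node2", bus).report("vm-1", state="DELETED")   # wrong host, ignored
applied = conductor.process_pending()
print(applied, db.instances["vm-1"]["state"])
```

Scaling then becomes a matter of running more conductor workers against the queue, rather than opening a database connection from every hypervisor.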
Whilst we’re on the subject of databases, as you can imagine the database can grow very quickly as an OpenStack cloud gets bigger. Database Archiving in Grizzly attempts to address this by flushing old instance data into shadow tables, so there’s no need for the table space to continue to spiral out of control with garbage records. The final point I wanted to make around Nova is the addition of the evacuate method. This allows administrators to ‘evacuate’ all instances from a particular host, e.g. because it needs to be upgraded or it is failing, allowing mass migration off said node for maintenance. Previously this would have been a lot more difficult to achieve.
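The archiving idea can be sketched with an in-memory SQLite database. This is a simplified illustration rather than Nova's schema: the two-table layout, the `deleted` flag used to decide which rows count as ‘old’, and the `archive_deleted_rows` helper are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT, deleted INTEGER DEFAULT 0);
    CREATE TABLE shadow_instances (id INTEGER PRIMARY KEY, name TEXT, deleted INTEGER);
    INSERT INTO instances (name, deleted)
        VALUES ('web-1', 0), ('web-2', 1), ('db-1', 1), ('db-2', 0);
""")


def archive_deleted_rows(conn, max_rows):
    """Move up to max_rows soft-deleted rows into the shadow table in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM instances WHERE deleted = 1 ORDER BY id LIMIT ?",
            (max_rows,),
        )]
        for row_id in ids:
            conn.execute(
                "INSERT INTO shadow_instances "
                "SELECT id, name, deleted FROM instances WHERE id = ?",
                (row_id,),
            )
            conn.execute("DELETE FROM instances WHERE id = ?", (row_id,))
    return len(ids)


archived = archive_deleted_rows(conn, max_rows=10)
live = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
shadow = conn.execute("SELECT COUNT(*) FROM shadow_instances").fetchone()[0]
print(archived, live, shadow)
```

Capping the batch with `max_rows` keeps each archiving run short, which matters when the live table is being hit by a busy cloud at the same time.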
Moving onto networking, Quantum (note: now being called OpenStack Networking) provides software-defined networking, or networking as a service, to OpenStack clouds. It has become widely adopted and is typically the default networking mechanism in Grizzly. The prior implementation of networking utilised ‘nova-network’, which provided network access via L2 bridges, with basic L3 and security provided by iptables. It had limited multi-tenancy options (using VLANs) and didn’t scale well enough for cloud environments. Quantum is the evolution of this; it allows complete network abstraction by virtualising network segments. Quantum has received a lot of interest in the community, with many vendors providing their own plugins: Quantum provides an abstract API but relies on plugins to implement the networks.
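The split between an abstract API and vendor plugins is a familiar pattern, and a tiny sketch may help. None of the names below come from Quantum; `NetworkPlugin`, `VlanPlugin`, `TunnelPlugin` and `NetworkAPI` are hypothetical classes showing how the same API call can be backed by different segment implementations.

```python
from abc import ABC, abstractmethod


class NetworkPlugin(ABC):
    """What every backend must implement; the API layer knows nothing else about it."""

    @abstractmethod
    def create_network(self, tenant, name):
        ...


class VlanPlugin(NetworkPlugin):
    """Hypothetical backend that isolates tenants with VLAN IDs."""

    def __init__(self, first_vlan=100):
        self._next_vlan = first_vlan

    def create_network(self, tenant, name):
        vlan = self._next_vlan
        self._next_vlan += 1
        return {"tenant": tenant, "name": name, "segment": f"vlan:{vlan}"}


class TunnelPlugin(NetworkPlugin):
    """Hypothetical backend that isolates tenants with tunnel IDs."""

    def __init__(self):
        self._next_id = 1

    def create_network(self, tenant, name):
        tunnel_id = self._next_id
        self._next_id += 1
        return {"tenant": tenant, "name": name, "segment": f"tunnel:{tunnel_id}"}


class NetworkAPI:
    """The tenant-facing API: identical calls whichever plugin is loaded."""

    def __init__(self, plugin):
        self._plugin = plugin

    def create_network(self, tenant, name):
        return self._plugin.create_network(tenant, name)


vlan_net = NetworkAPI(VlanPlugin()).create_network("acme", "web")
tunnel_net = NetworkAPI(TunnelPlugin()).create_network("acme", "web")
print(vlan_net["segment"], tunnel_net["segment"])
```

From the tenant's point of view the two calls are indistinguishable, which is exactly what lets vendors compete on the implementation underneath.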
There have been many interesting developments in Quantum during the Grizzly release cycle. One of the initial problems was the single point-of-failure architecture, for example a single L3 agent or a single DHCP agent; obviously, losing that node would mean a loss of external routing or DHCP for the instances. Grizzly has implemented multiple agent support for these, thereby reducing these bottlenecks and single points of failure. The concept of Security Groups is not new to OpenStack; in nova-network they allowed us to set inbound firewall rules for our instances (or groups of instances). Quantum is fully backwards compatible, yet vastly enhances and extends the security group capabilities, allowing outbound as well as inbound rules. Most importantly, rules can now be configured on a per-port basis, i.e. for every network adapter attached to an instance, whereas previously this was on a per-instance basis. All of the configuration is exposed within Horizon, giving end-users the ability to create and control networks and their topology. Additionally, Quantum is now able to support some higher-layer features such as load balancers (LBaaS) and VPNs, but much of this is still a work in progress.
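Here is a small model of per-port security groups with both directions, purely to illustrate the idea. The classes are invented for this post, and the strict default-deny behaviour is a simplifying assumption of mine rather than a statement about Quantum's defaults.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    direction: str      # "ingress" or "egress"
    protocol: str       # e.g. "tcp"
    port: int


@dataclass
class SecurityGroup:
    name: str
    rules: list = field(default_factory=list)


@dataclass
class Port:
    """One network adapter on an instance, carrying its own security groups."""
    name: str
    groups: list = field(default_factory=list)

    def allows(self, direction, protocol, port):
        # Assumed default deny: traffic passes only if a rule in some group matches.
        wanted = Rule(direction, protocol, port)
        return any(wanted in group.rules for group in self.groups)


web = SecurityGroup("web", [Rule("ingress", "tcp", 80), Rule("egress", "tcp", 443)])
db = SecurityGroup("db", [Rule("ingress", "tcp", 3306)])

# A single instance with two adapters, each filtered differently.
eth0 = Port("eth0", [web])   # public-facing adapter
eth1 = Port("eth1", [db])    # back-end adapter

print(eth0.allows("ingress", "tcp", 80))     # web traffic on the public adapter
print(eth0.allows("ingress", "tcp", 3306))   # database traffic on the public adapter
print(eth1.allows("ingress", "tcp", 3306))   # database traffic on the back-end adapter
```

With per-instance groups, both adapters would have had to share one rule set, so the database port would be either open on both or closed on both.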
For those of you unfamiliar with Keystone, it provides an authentication and authorisation store for OpenStack, i.e. who’s who and whether they can do what they’re trying to do. Keystone makes use of tokens, which are issued to a user after they authenticate with a username and password combination. This saves passwords from being passed around the cluster and provides an easy way of revoking compromised sessions. One of the limitations of previous versions was the lack of support for multi-factor authentication, e.g. a password string plus a token code. This feature has now been implemented in Keystone for Grizzly, vastly enhancing the security of OpenStack. There’s a brand-new API version (v3.0) available for Keystone too, providing additional features such as groups (not to be confused with tenants or projects), which allow selected users to be grouped for purposes such as role-based access control.
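The token flow described above can be sketched in a few lines. This is not Keystone's API or token format; `TokenService` and its methods are hypothetical, and the example only shows the general shape: the password is checked once, a random token is handed back, and revoking the token ends the session without touching the password.

```python
import hashlib
import secrets


class TokenService:
    """Hypothetical identity service: passwords in, opaque tokens out."""

    def __init__(self):
        self._password_hashes = {}   # username -> (salt, derived key)
        self._tokens = {}            # token -> username

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def add_user(self, username, password):
        salt = secrets.token_bytes(16)
        self._password_hashes[username] = (salt, self._hash(password, salt))

    def authenticate(self, username, password):
        """Exchange a username and password for a token; return None on failure."""
        entry = self._password_hashes.get(username)
        if entry is None:
            return None
        salt, expected = entry
        if not secrets.compare_digest(self._hash(password, salt), expected):
            return None
        token = secrets.token_hex(16)
        self._tokens[token] = username
        return token

    def validate(self, token):
        """Other services call this with the token; they never see the password."""
        return self._tokens.get(token)

    def revoke(self, token):
        self._tokens.pop(token, None)


identity = TokenService()
identity.add_user("alice", "s3cret")

token = identity.authenticate("alice", "s3cret")
print(identity.validate(token))      # the token resolves to the user
identity.revoke(token)
print(identity.validate(token))      # after revocation it resolves to nothing
```

A second factor would slot into `authenticate` as an extra check before the token is issued; everything downstream of the token stays the same.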
Finally, the block-storage element, Cinder, has evolved considerably. Cinder provides block storage to instances; typical use cases are data persistence or tiered storage, e.g. ephemeral storage sitting on the hypervisor’s local disk but more performant persistent storage sitting on a SAN. Cinder started off by providing block device support over iSCSI to hypervisor hosts (which in turn presented the volumes as local SCSI disks). To bridge the gap between the current implementation and what enterprises want out of OpenStack, a lot of work has gone into providing Fibre Channel and FCoE storage support. The project has welcomed many new contributions from hardware and software vendors in the form of drivers for storage backends, and the list of supported storage platforms is significantly bolstered in the Grizzly release. Previous versions of Cinder only permitted a single backend to be used; Grizzly supports multiple drivers simultaneously, allowing multiple tiers of storage to be offered to end-users. An example could be iSCSI-based storage for low-priority workloads and fully multipathed FC storage for higher-priority workloads, all of which is completely abstracted.
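To show what tiering over multiple backends means for the end-user, here is a minimal sketch. It is not Cinder's driver interface; the `Backend` and `VolumeService` classes, the `bronze` and `gold` type names and the capacities are all hypothetical. The user only names a volume type, and the service maps that to whichever backend sits behind it.

```python
class Backend:
    """Hypothetical storage backend with a transport and some free capacity."""

    def __init__(self, name, transport, free_gb):
        self.name = name
        self.transport = transport
        self.free_gb = free_gb


class VolumeService:
    """Maps user-visible volume types onto backends."""

    def __init__(self, backends, volume_types):
        self._backends = {b.name: b for b in backends}
        self._volume_types = volume_types     # type name -> backend name

    def create_volume(self, size_gb, volume_type):
        backend = self._backends[self._volume_types[volume_type]]
        if backend.free_gb < size_gb:
            raise RuntimeError(f"backend {backend.name} has insufficient capacity")
        backend.free_gb -= size_gb
        return {"size_gb": size_gb, "type": volume_type, "backend": backend.name}


service = VolumeService(
    backends=[
        Backend("iscsi-array", "iSCSI", free_gb=1000),
        Backend("fc-array", "Fibre Channel", free_gb=200),
    ],
    volume_types={"bronze": "iscsi-array", "gold": "fc-array"},
)

scratch = service.create_volume(50, "bronze")
database = service.create_volume(100, "gold")
print(scratch["backend"], database["backend"])
```

The user asking for a `gold` volume never needs to know that it lands on Fibre Channel; that is the abstraction doing its job.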
Up until Grizzly, backing up block devices was typically handed off to the storage platform and was not a concern of Cinder. With the latest code it’s now possible to back up volumes straight into OpenStack Swift (the completely scale-out, fault-tolerant object storage project). This vastly enhances the disaster recovery options for OpenStack: they’re true volume backups and are implementation agnostic.
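One way to picture a backend-agnostic backup is as a volume cut into chunks and written out as objects. The sketch below is my own illustration under that assumption, not the Cinder backup code: a plain dictionary stands in for the object store, and the object naming, chunk size and manifest layout are invented for the example.

```python
import hashlib


def backup_volume(volume_bytes, object_store, backup_id, chunk_size=4):
    """Write the volume as numbered chunk objects plus a small manifest object."""
    names = []
    for index, offset in enumerate(range(0, len(volume_bytes), chunk_size)):
        name = f"{backup_id}/chunk-{index:05d}"
        object_store[name] = volume_bytes[offset:offset + chunk_size]
        names.append(name)
    object_store[f"{backup_id}/manifest"] = {
        "chunks": names,
        "sha256": hashlib.sha256(volume_bytes).hexdigest(),
    }


def restore_volume(object_store, backup_id):
    """Reassemble the chunks in manifest order and verify the checksum."""
    manifest = object_store[f"{backup_id}/manifest"]
    data = b"".join(object_store[name] for name in manifest["chunks"])
    if hashlib.sha256(data).hexdigest() != manifest["sha256"]:
        raise ValueError("backup is corrupt")
    return data


store = {}                       # stand-in for an object storage container
volume = b"0123456789"           # stand-in for the raw contents of a volume

backup_volume(volume, store, "backup-1")
restored = restore_volume(store, "backup-1")
print(len(store), restored == volume)
```

Because the backup only ever reads bytes from the volume, it does not matter whether those bytes originally lived on iSCSI, Fibre Channel or anything else.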
This write-up wouldn’t be complete without mentioning the two latest project additions to OpenStack that became incubating components in Grizzly. The first is Heat (https://wiki.openstack.org/wiki/Heat), which provides an orchestration layer based around compatibility with AWS CloudFormation templates. It implements basic high availability as well as automatic scaling of applications. The second, Ceilometer (https://wiki.openstack.org/wiki/Ceilometer), provides a billing and metering framework, allowing monitoring of instances and the resources they are consuming. I’ve not done either of these projects justice in the previous sentences, but I will write individual articles explaining how relevant and important they will become to the success of OpenStack.
Let me know if you’ve got any questions.