What’s new in OpenStack Grizzly?

The latest Grizzly offering represents the seventh major release of OpenStack, and it goes to show the power of Open Source and what can be achieved when we work together. Grizzly is, for many reasons, a significant milestone for the project: it’s being seen as a stable enterprise platform, and adoption within the industry is increasing rapidly. This blog post details some of the additions that the Grizzly release brought to OpenStack, and I’ll attempt to explain why they are important.

Firstly, let’s look at Nova. Nova provides the compute resources for an OpenStack-based cloud and emulates much of the functionality provided by Amazon’s EC2; it’s responsible for scheduling and managing the lifecycle of running instances. An important new feature is the ability to provision physical resources; I don’t just mean Linux containers, I’m talking about entire physical instances, removing the requirement for hypervisor integration. There are some limitations with this approach, predominantly around networking, but it’s still early days. Being able to scale to massive quantities is a common goal across the OpenStack project, and Nova is one of the first components to tackle the big scalability problems that the largest OpenStack clouds are starting to hit.

One technology that was previewed in the Folsom release and is now more comprehensive in Grizzly is the concept of zones (think AWS Availability Zones). Hosts can be grouped into these zones, and end-users are permitted to list and select a specific zone to deploy their instances into. A typical use case for this is availability: an OpenStack cloud may span multiple datacenters, and availability zones allow users to deploy their infrastructure across them, thereby introducing fault tolerance. Another concept, which sounds very similar at face value, is host aggregates; like availability zones, aggregates allow you to group a set of hosts, but instead of grouping for availability we group based on a common feature. For example, all hosts within an aggregate group might have solid state disks (remember that ephemeral instance storage runs on local disk); a flavour, or ‘instance size/type’, is then created that references this aggregate group so that end-users can be sure they are exploiting the common feature.
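To make the aggregate idea concrete, here’s a rough sketch using the Grizzly-era nova CLI; the aggregate ID, host name, flavour name and the `ssd` metadata key are all hypothetical, and the scheduler needs the AggregateInstanceExtraSpecsFilter enabled for the extra spec to be honoured:

```
$ nova aggregate-create ssd-hosts                 # create the aggregate
$ nova aggregate-add-host 1 compute-01            # add an SSD-backed compute node
$ nova aggregate-set-metadata 1 ssd=true          # tag the aggregate with its common feature
$ nova flavor-key m1.ssd set ssd=true             # matching extra spec on the flavour
```

Instances booted with the `m1.ssd` flavour will then only be scheduled onto hosts within that aggregate.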

In addition, brand new to Grizzly is the concept of Nova Cells. One of the biggest limitations to scale is the set of shared dependencies within Nova; for example, within a cluster there’s a shared database, message queue and set of schedulers, and whilst we can load balance and scale out, there are technical and physical limits inherent to this design. Nova Cells attempts to create multiple smaller ‘clouds’ within a larger OpenStack environment, each providing its own database, message queue and scheduler set, but all ‘reporting’ to one global API in a tree-like structure. The catch is that only Nova supports the Cells implementation; we’re going to need to solve these scaling problems for the rest of the components too.
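As a very rough illustration (the cell and queue details below are hypothetical, and the exact options depend on your deployment), enabling cells involves a `[cells]` section in nova.conf on each cell, plus registering neighbouring cells with nova-manage:

```
# nova.conf on a child (compute) cell
[cells]
enable=true
name=cell1

# register the parent cell so this cell can report upwards
$ nova-manage cell create --name=api --cell_type=parent \
    --username=guest --password=guest --hostname=api-rabbit \
    --port=5672 --virtual_host=/
```

Each cell keeps its own database and message queue; only the top-level API cell is exposed to end-users.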

Early releases of OpenStack relied on the compute nodes themselves having direct database access to update instance information, which posed a security threat: there was concern that compromised hypervisors would have wider access to the OpenStack environment. A new service known as nova-conductor provides a method of isolating database access from the rest of the services. Not only does this alleviate the security concerns, it also helps address the scalability problems; rather than having the database accessed by hundreds (or thousands!) of nodes, a smaller quantity of database workers can be utilised and load-balanced across.

Whilst we’re on the subject of databases, as you can imagine the database can grow enormously as an OpenStack cloud gets bigger. Database Archiving in Grizzly attempts to address this by flushing old instance data into shadow tables, so there’s no need for the table space to continue to spiral out of control with garbage records. The final point I wanted to make around Nova is the addition of the evacuate method, which allows administrators to ‘evacuate’ all instances from a particular host, e.g. when it needs to be upgraded or is failing, allowing mass migration off said node for maintenance. Previously this would have been a lot more difficult to achieve.
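As a quick sketch (the instance and host names here are made up), evacuating a single instance from a failed host onto a healthy one looks like this; with shared instance storage the flag preserves the instance’s disk:

```
$ nova evacuate --on-shared-storage web-01 compute-02
```

Without shared storage the instance is instead rebuilt from its base image on the target host.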

Moving onto networking, Quantum (note: now being called OpenStack Networking) provides software-defined networking, or networking as a service, to OpenStack clouds. It has become widely adopted and is typically the default networking mechanism in Grizzly. The prior implementation, ‘nova-network’, provided network access via L2 bridges with basic L3 and security provided by iptables; it had limited multi-tenancy options (using VLANs) and didn’t scale well enough for cloud environments. Quantum is the evolution of this and allows complete network abstraction by virtualising network segments. Quantum has received a lot of interest in the community, with many vendors providing their own plugins; i.e. Quantum provides an abstract API but relies on plugins to implement the networks.

There have been many interesting developments in Quantum for the Grizzly release cycle. One of the initial problems was the single point-of-failure architecture, an example being a single L3 agent or a single DHCP agent, where losing that node would mean a loss of external routing or DHCP for the instances. Grizzly has implemented multiple-agent support for these, therefore reducing these bottlenecks. The concept of Security Groups is not new to OpenStack; in previous versions, nova-network allowed us to set inbound firewall rules for our instances (or groups of instances). Quantum is fully backwards compatible yet vastly enhances and extends the security group capabilities, allowing outbound as well as inbound rules. Most importantly, it is now able to configure rules on a per-port basis, i.e. for every network adapter attached to an instance, whereas previously it was on a per-instance basis. All of the configuration is exposed within Horizon, giving end-users the ability to create and control networks and their topology. Additionally, Quantum is now able to support some higher-layer features such as load balancers (LBaaS) and VPNs, but much of this is still a work in progress.
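For illustration, the new rules look roughly like this with the Grizzly quantum CLI (the group name, ports and placeholder IDs are hypothetical, and the exact flag spelling may vary by client version):

```
$ quantum security-group-create web
$ quantum security-group-rule-create --direction ingress \
    --protocol tcp --port-range-min 80 --port-range-max 80 web
$ quantum security-group-rule-create --direction egress \
    --protocol tcp --port-range-min 443 --port-range-max 443 web
$ quantum port-update <port-id> --security_groups list=true <web-group-id>
```

Note the egress rule: that direction simply wasn’t expressible with the old nova-network security groups.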

For those of you unfamiliar with Keystone, it provides an authentication and authorisation store for OpenStack, i.e. who’s who, and can they do what they’re trying to do? Keystone makes use of tokens, which are provided to a user after authenticating with a username and password combination. This saves passwords from being passed around the cluster and provides an easy way of revoking compromised sessions. One of the limitations of previous versions was the lack of multi-factor authentication, e.g. a password plus a token code. This feature has now been implemented in Keystone for Grizzly, vastly enhancing the security of OpenStack. There’s a brand-new API version (v3.0) available for Keystone too, providing additional features such as groups (not to be confused with tenants or projects), which allow select users to be grouped for purposes such as role-based access control.
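As a sketch of the new v3 API (the endpoint, user and password here are made up), requesting a token is a single POST, and the token comes back in the X-Subject-Token response header:

```
$ curl -si http://keystone.example.com:5000/v3/auth/tokens \
    -H 'Content-Type: application/json' \
    -d '{"auth": {"identity": {"methods": ["password"],
          "password": {"user": {"name": "demo",
            "domain": {"name": "Default"},
            "password": "secret"}}}}}'
```

Subsequent API calls then carry that token rather than the password.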

Finally, the block-storage element, Cinder, has evolved considerably. Cinder provides block storage to instances; typical use cases would be data persistence or tiered storage, e.g. ephemeral storage sitting on the hypervisor’s local disk but more performant persistent storage sitting on a SAN. Cinder started off by providing block device support over iSCSI to hypervisor hosts (which in turn presented the volumes as local SCSI disks); to bridge the gap between that implementation and what enterprises want out of OpenStack, a lot of work has gone into providing Fibre Channel and FCoE storage support. The project has welcomed many new contributions from hardware and software vendors in the form of drivers that provide storage resource backends, and the list of supported storage platforms is significantly bolstered with the latest Grizzly release. Previous versions of Cinder only permitted a single backend device to be used; Grizzly supports multiple drivers simultaneously, therefore allowing multiple tiers of storage for end-users. An example could be iSCSI-based storage for low-priority workloads and fully-multipathed FC storage for higher-priority workloads, all of it completely abstracted.
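To give a feel for the multi-backend support, here’s an illustrative cinder.conf fragment (the backend names and tiers are hypothetical):

```
# /etc/cinder/cinder.conf
enabled_backends=lvm-gold,lvm-silver

[lvm-gold]
volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name=GOLD

[lvm-silver]
volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name=SILVER
```

Volume types then map end-user choices onto backends, along the lines of `cinder type-create gold` followed by `cinder type-key gold set volume_backend_name=GOLD`.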

Up until Grizzly, backing up block devices was typically handed off to the storage platform and was not of concern to Cinder. With the latest code it’s now possible to back up volumes straight into OpenStack Swift (the completely scale-out, fault-tolerant object storage project). This vastly enhances the disaster recovery options for OpenStack; these are true volume backups and are implementation agnostic.
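In CLI terms (the name and placeholder IDs below are hypothetical), a backup round-trip looks something like:

```
$ cinder backup-create --display-name web-data-backup <volume-id>
$ cinder backup-list
$ cinder backup-restore <backup-id>
```

The backup is stored as objects in a Swift container, so it benefits from Swift’s replication for free.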

This write-up wouldn’t be complete without mentioning the two latest project additions to OpenStack that became incubating components in Grizzly. The first is Heat (https://wiki.openstack.org/wiki/Heat), which provides an orchestration layer based around compatibility with AWS CloudFormation templates; it implements basic high availability as well as automatic scaling of applications. The second, Ceilometer (https://wiki.openstack.org/wiki/Ceilometer), provides a billing and metering framework, allowing monitoring of instances and the resources they are consuming. I’ve not done either of these projects justice in the previous sentences, but I will write individual articles explaining how relevant and important they will become to the success of OpenStack.
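To give a flavour of Heat, here’s a minimal CloudFormation-compatible template; the image, flavour and key names are hypothetical:

```
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal illustrative single-instance stack",
  "Resources": {
    "WebServer": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "fedora-18-x86_64",
        "InstanceType": "m1.small",
        "KeyName": "demo-key"
      }
    }
  }
}
```

A stack is then launched from the template with the heat client, something along the lines of `heat stack-create mystack --template-file=webserver.json`.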

Let me know if you’ve got any questions.

OpenStack Summit 2013 Report

NOTE: The views expressed below are my own and do not, in any way, represent the views of my employer.

The week before last I attended the OpenStack Summit in Portland, Oregon. It was a chance for me to get the latest information about one of the most exciting open-source projects of the past few years. I’ve been involved with OpenStack for quite a few months now, mainly deploying environments and writing documentation to aid my own understanding as well as providing it for others to consume; one of the problems with OpenStack is that it’s very difficult to get started. Whilst I knew that OpenStack had enormous market hype, with the vast majority of ISVs and hardware vendors jumping on board, nothing could prepare me for the overwhelming turnout at the summit; the event had actually sold out, which for an Open Source conference is an achievement. The majority of the sessions were overflowing; if you didn’t turn up 20-30 minutes beforehand, you had little chance of getting somewhere to sit. It just goes to show the level of interest in this project. People from all over the world and of all career paths attended, either actively involved with OpenStack in some way or knowing they had to learn more; the event had a very refreshing buzz about it, people wanted to be there, were passionate about the technology and could really see it going somewhere.

The conference, with the exception of the daily keynotes, was split up into multiple tracks; the active contributors/developers had the design summit, but those of us who wanted to gain an insight into the latest and greatest found ourselves sitting through presentations covering the widest variety of topics. They also provided hands-on workshops all week, something that I found extremely valuable; for example, as I’m not a networking guy I sometimes find myself confused by complex networking and the concept of virtual networks, and the hands-on Quantum lab enabled me to gain a good understanding of it. Even beginners were catered for: there were 101 sessions most days, allowing people with very little experience of virtualisation and cloud computing to come away with an understanding of the OpenStack project and where it is headed. As a Red Hat employee myself, putting faces to the names of the colleagues I work with daily was a great opportunity, especially given our recent OpenStack distribution announcement: RDO, our community-supported offering (http://openstack.redhat.com). The ability to network with other OpenStack users (and potential future users) was extremely valuable, receiving feedback about what they wanted to use it for and what features they really wanted to see. In fact, the attendee list goes to show the variety of people there; it seemed to be a mix between the stereotypical LinuxCon attendees and VMworld attendees, a very dynamic environment.

The keynote sessions were extremely useful; the Rackspace keynote, the headline for many, was, as expected, really good. The statistic they keep using is that whilst they’re not reducing the amount of contributions they make, their overall commit percentage continues to decrease, clearly demonstrating the growth of the wider community. It was also good to see that Red Hat is out on top as a contributor with the latest Grizzly release, and they’re not trying to keep that quiet! Rackspace is using OpenStack in production (no surprises there!), but the way in which they use it is very interesting: they deploy OpenStack on OpenStack to provide “private clouds” on top of the public cloud, eating their own dog food at every level. They’ve done this with an extremely high level of reliability and API uptime, so when people say that OpenStack isn’t ready for production, I really do beg to differ. Canonical’s keynote by Mark Shuttleworth was very good too. I think they’re at a crossroads, moving from the traditional ‘desktop environment’ to a more strategic cloud play; I’m just not sure they’ve got the resources to actually fulfil what they promise, despite the fact that OpenStack has been predominantly written to run on Ubuntu (historically, anyway). Aside from the OpenStack “vendors”, organisations such as Best Buy, Bloomberg, Comcast and others presented what they use OpenStack for, how they’ve implemented it at scale and what they’ve learned (and contributed back!). It further goes to show that there’s real interest in many different areas if the use case is right, i.e. scale-out, fault-tolerant applications.

What interests me is that OpenStack is clearly viewed by many as a threat. Take VMware, for example: they are actively contributing to the project to enable their ESXi hypervisor as a compute resource for OpenStack Nova, because they’re concerned that when people see the benefits of OpenStack, and the fact that it abstracts much of the underlying compute resource, the requirement for ESX will drop. VMware is at risk of becoming irrelevant in the long term, where applications are written in different ways, i.e. to be more fault tolerant and not requiring hypervisor-oriented technologies such as HA, so they have to try and retain a piece of the pie and leverage the existing investments that organisations have made in their technology. In addition, the next step of the virtualisation piece is virtual networking; Quantum provides a virtual network abstraction service with multiple plugins for software- and hardware-based networking stacks. VMware is active in this space also, providing their Nicira-based NVP plugin for Quantum; this is an emerging technology and will likely be the later piece of the puzzle that gets adopted. VMware and Canonical have just gone to market with a fully-supported offering; this joint venture provides a complete OpenStack environment for customers currently using VMware. As VMware didn’t previously have an operating system, Ubuntu is able to step in and plug this gap, a potentially strategic opportunity for both organisations.

It’s not just VMware either; the likes of HP, Dell and IBM are all jumping on the OpenStack bandwagon, all with sets of developers contributing upstream as well as to their own product offerings, seeing OpenStack as a way of making money. Many vendors are providing additional components that allow OpenStack to integrate directly into existing environments, bridging the gap between upstream vanilla OpenStack and the modern-day datacenter. It’s too early to tell whether the VMware/Canonical offering will be a success, but in my opinion they are doing the right thing. I am a firm believer that any vendor looking to provide an OpenStack offering should be open about the partnerships they forge and the additional fringe components they choose to support; because of the vast support model in the OpenStack trunk, it will attract a wide variety of customers with a disparate set of requirements, and picking and choosing technologies to support will be extremely important.

OpenStack has opened the doors to a wide variety of new startups, all providing additional layers or extensions to help integrate the product and ease its adoption within organisations. Examples include Mirantis and SwiftStack. Mirantis is a company based out of Russia, and they, more than anyone, impressed me at the summit. They’ve written a tool called Fuel which aims to provide a ground-up management platform for OpenStack; implementing OpenStack currently requires quite a lot of time, knowledge and experience, and Fuel is able to configure an array of the underlying bare-metal technologies, including networking and storage, as well as provision and manage entire OpenStack environments from a web interface. If I were in a position to acquire a company, Mirantis would be at the top of my list right now! SwiftStack provide management tools and support for deploying a Swift-based object-storage cloud; their presentations were fascinating, and for anyone interested in how Swift works, they’ve written a free book (http://www.swiftstack.com/book/) which I highly recommend. The exhibition room was full of vendors promoting their distributions, their consultancy/architecture services or their additional add-on components, which is very rare for an open-source project… even LinuxCon is usually full of the normal open-source companies plus hardware guys; this represented a huge mix.

What last week made me realise is that there’s an enormous opportunity for OpenStack. A lot of the community work has been done for the vendors wishing to pursue an OpenStack strategy, some vendors being in better positions than others to make it a successful venture; integration with existing enterprise technology will be extremely important for vendors to get right. There’s a lot of overlap with the capabilities of some existing products in the industry; however, OpenStack, in my opinion, is addressing the problems of next-generation architectures. There are lots of offerings/flavours/distributions out there, and the most important thing about OpenStack-based clouds is interoperability… they must continue to be open, i.e. open APIs and open standards, to allow portability between clouds. Whilst I fear that some organisations (especially proprietary vendors) are getting involved in OpenStack because of the hype, it represents a turning point in the way that big corporations think; it’s yet again proving the power of Open Source and what can be achieved when we work together.

Long term, there are many areas in which we can improve OpenStack. There aren’t many organisations out there that are ready to implement an OpenStack environment unless they go greenfield with it; integration is key, but the switch from traditional datacenters to a fully software-defined environment is a big step to take, and it will take years to fully embrace. It doesn’t mean that traditional enterprise virtualisation will go away either; there will always be a requirement for legacy applications, but the convergence of these technologies will be an intriguing concept to watch. Over time I think that OpenStack will become a lot more than just a set of tools to build cloud environments; we see this with the latest tool sets such as the Heat API and nova-baremetal, tools that implement features that have traditionally been excluded. It’s an exciting time to be involved with OpenStack, and I look forward to seeing what the future can bring for the platform and to making it a success.

As I attended about 10 sessions per day (usually 30-40 minutes each), I won’t comment on them all, but here are some personal favourites I’d recommend people look into if they’re interested in learning more about what’s coming up in OpenStack:

Orchestration of Fibre Channel in Cinder – The default out-of-the-box block storage configuration in OpenStack is typically iSCSI, and whilst there are plenty of additional options now, Fibre Channel support has taken quite some time to make it into the code. The problems are typically around zoning; as far as Nova is concerned this is almost irrelevant, as all it has to do is attach the underlying disk to the instance in the same fashion as iSCSI. This technology brings OpenStack closer to enterprise adoption. (http://www.openstack.org/summit/portland-2013/session-videos/presentation/orchestration-of-fibre-channel-technologies-for-private-cloud-deployments)

Ceilometer: Metering -> Metrics – Ceilometer is a new project within OpenStack; it was introduced in Folsom as an incubated project but has made it into Grizzly as a full component. It enables organisations to implement a flexible chargeback model on pretty much anything they want to plug it into, vastly extending the very basic quotas and utilisation reporting that Folsom used to provide. There are still some limitations, but it’s extremely powerful. (http://www.openstack.org/summit/portland-2013/session-videos/presentation/ceilometer-from-metering-to-metrics)

OpenStack HA with Mirantis – This talk actually turned into a product demonstration/pitch, but it was the one I was most impressed by. It demonstrates Mirantis’ Fuel implementation for managing OpenStack environments from bare metal. Many organisations keep asking “how can we deploy OpenStack at scale?” or “how can we make OpenStack highly available?”. Mirantis attempts to solve the deployment and high-availability problems with their tools, and amusingly they say it’s so easy, even a goat can do it! (http://www.openstack.org/summit/portland-2013/session-videos/presentation/standup-ha-openstack-with-open-puppet-manifests-in-under-20-minutes-for-goat)

Deploying and Managing OpenStack with Heat – This talk discusses how the “triple-o” or “OpenStack-on-OpenStack” project uses the Heat API (and nova-baremetal) to deploy entire OpenStack environments automatically, blurring the differences between the cloud layer and the physical world. (http://www.openstack.org/summit/portland-2013/session-videos/presentation/deploying-and-managing-openstack-with-heat)

Software Defined Networking (Scaling in the Cloud) – One of the things I mentioned earlier was not being a networking guy; these sorts of presentations helped me understand how things fit in, where things are going and how virtual networking is solving real-world problems with scale. Gone are the days when we provide L2 bridges (+ VLAN tagging) to virtual machines; in the world of software-defined networking we can remove a lot of underlying complexity and control it all in software. This is an area that will become extremely important in the future. (http://www.openstack.org/summit/portland-2013/session-videos/presentation/scaling-in-the-cloud-the-hype-and-happenings-of-software-defined-networking)

Note: all of the Summit videos are freely available online at http://www.openstack.org/summit/portland-2013/session-videos/