Earlier this month at DevOpsDays here in Austin the Dell Crowbar crew hosted a session and gave a demo. If you’re not familiar with it, Crowbar is an open source software framework written at Dell. I grabbed some time with Crowbar architect Rob Hirschfeld and got him to recap how far we’ve come in less than a year and where he sees us going over the next year.
Last week DevOpsDays was held here in Austin. It sold out about a day after it was announced and had a big waiting list. The two-day event, which was held at National Instruments (who did an awesome job as host), featured talks and panels in the mornings and “open space” discussions in the afternoons.
The panel on the first day, moderated by John Willis, was entitled: Provisioning Panel – Meet Juju, Crowbar, Puppet, Chef, Pallet + discussion. After the panel I caught up with each of the members for a follow-up chat. Here they are:
Juju – Mark Mims of Canonical
Crowbar – Rob Hirschfeld of Dell
Puppet – Dan Bode of Puppet Labs
Chef – Matt Ray of Opscode
Pallet – Antoni Batchelli of Pallet Ops
Stay tuned for more DevOpsDays goodness in the days to come!
Last summer at OSCON Dell announced the availability of our OpenStack solution in the US and Canada. Today at World Hosting Days in Rust, Germany, we are announcing that our OpenStack-Powered Cloud Solution is available in Europe and Asia.
If you’re not familiar with it, OpenStack is an open source cloud project built on a foundation of code initially donated by NASA and Rackspace. The project kicked off a little over a year and a half ago here in Austin and it has gained amazing traction since then.
Dell’s offering
Dell’s OpenStack cloud offering is an open source, on-premises cloud solution based on the OpenStack platform running on Ubuntu. It’s composed of:
The OpenStack cloud operating system
PowerEdge C servers: C6100, C6105, C2100 and, coming soon, Dell’s new C6220 and R720
The Crowbar deployment and management software framework – developed and coded by Dell
Dell’s OpenStack reference architecture
Dell Services
Crowbar software framework
To give a little more background on the Crowbar software framework, it’s an open source project developed initially at Dell, and you can grab it off GitHub. The framework, which is under the Apache 2.0 license, manages the OpenStack deployment from the initial server boot to the configuration of the primary OpenStack components, allowing users to complete bare metal deployment of multi-node OpenStack clouds in hours, as opposed to days.
Once the initial deployment is complete, you can use Crowbar to maintain, expand, and architect the complete solution, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting. Beyond Dell, companies like VMware, DreamHost and Zenoss have built “barclamps” that allow them to utilize Crowbar’s modular design. Additionally, customers who buy the Dell OpenStack-Powered Cloud Solution get training, deployment, and support on Crowbar.
So as of today, customers in the UK, Germany and China can purchase the Dell OpenStack-Powered Cloud Solution. As customer demand grows in other regions we will be adding more countries, so stay tuned. If the first 18 months of the project are any indication of the pace to come, we are all going to be in for a lot more excitement.
With O’Reilly’s big data conference Strata coming up in just a couple of weeks, I thought I might as well get around to finally writing up my notes from Hadoop World. The event, which was put on by Cloudera, was held last November 8-9 in New York City. There were over 1,400 attendees from 580 companies and 27 countries, with two thirds of the audience being technical.
Growing beyond geek fest
The event itself has picked up significant momentum over the last three years, going from 500 attendees, to 900 the second year, to over 1,400 this past year. The tone has gone from geek-fest to an event also focused on business problems; for example, one of the keynotes was by Larry Feinsmith, managing director of the office of the CIO at JP Morgan Chase. Besides Dell, other large companies like HP, Oracle and Cisco also participated.
As a platinum sponsor, Dell had both a booth and a technical presentation. At the event we announced that we would be open sourcing the Crowbar barclamp for Hadoop, and at our booth we showed off the Dell | Hadoop Big Data Solution, which is based on Cloudera Enterprise.
Cutting’s observations
Doug Cutting, the father of Hadoop, Cloudera employee and chairman of the Apache Software Foundation, gave a much-anticipated keynote. Here are some of the key things I caught:
Still young: While Cutting felt that Hadoop had made tremendous progress he saw it as still young with lots of missing parts and niches to be filled.
Bigtop: He talked about the Apache “Bigtop” project, which is an open source program to pull together the various pieces of the Hadoop ecosystem. He explained that Bigtop is intended to serve as the basis for the Cloudera Distribution of Hadoop (CDH), much the same way Fedora is the basis for RHEL (Red Hat Enterprise Linux).
“Hadoop” as “Linux”: Cutting also talked about how Hadoop has become the kernel of the distributed OS for big data. He explained that, much the same way that “Linux” is technically only the kernel of the GNU/Linux operating system, people are using the word Hadoop to mean the entire Hadoop ecosystem, including utilities.
Interviews from the event
To get more of the flavor of the event here is a series of interviews I conducted at the show, plus one where I got the camera turned on me:
Hadoop: An open source platform, developed at Yahoo, that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. It is particularly suited to large volumes of unstructured data such as Facebook comments and Twitter tweets, email and instant messages, and security and application logs.
MapReduce: A software framework for easily writing applications that process vast amounts of data (multi-terabyte data sets) in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. Hadoop acts as a platform for executing MapReduce. MapReduce came out of Google.
HDFS: The Hadoop Distributed File System allows large application workloads to be broken into smaller data blocks that are replicated and distributed across a cluster of commodity hardware for faster processing.
Major Hadoop utilities:
HBase: The Hadoop database that supports structured data storage for large tables. It provides real time read/write access to your big data.
Hive: A data warehousing solution built on top of Hadoop. An Apache project.
Pig: A platform for analyzing large data that leverages parallel computation. An Apache project.
ZooKeeper: Allows Hadoop administrators to track and coordinate distributed applications. An Apache project.
Oozie: A workflow engine for Hadoop.
Flume: A service designed to collect data and put it into your Hadoop environment.
Whirr: A set of libraries for running cloud services. It’s ideal for running temporary Hadoop clusters to carry out a proof of concept, or to run a few one-time jobs.
Sqoop: A tool designed to transfer data between Hadoop and relational databases. An Apache project.
Hue: A browser-based desktop interface for interacting with Hadoop.
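To make the MapReduce entry above a bit more concrete, here is a small Python sketch of the map, shuffle, and reduce steps, using word counting as the job. It is only an illustration under my own assumptions: it runs in a single process, it does not use Hadoop’s APIs, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key, the step that sits between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all the values collected for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in order over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for a much larger data set.
    logs = ["error disk full", "warn disk slow", "error network down"]
    print(word_count(logs))
```

In a real cluster the map and reduce functions would run in parallel on many machines, with the framework handling the grouping step and any failures; the sketch only shows how the three steps fit together.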
This afternoon Matt Ray, Technical Evangelist for Opscode, stopped by Dell’s Round Rock HQ to brief a gaggle of folks on what they are up to. Cote arranged the visit as well as one last month with Puppet Labs, which I unfortunately wasn’t able to make.
After Matt, with some help from teammates on the phone, briefed the Dell gang I grabbed some time with him to get the 5 minute Reader’s Digest version. Here is the result.
Some of the ground Matt covers:
What are Opscode and Chef?
How did they come to be?
The hosted version of Chef (moving from EC2 to Rackspace)
Besides interviewing a bunch of people at Hadoop World, I also got a chance to sit on the other side of the camera. On the first day of the conference I got a slot on SiliconANGLE’s theCUBE and was interviewed by Dave Vellante, co-founder of Wikibon, and John Furrier, founder of SiliconANGLE.
As I mentioned in my previous entry, the code for the Hadoop barclamps is now available at our GitHub repo.
To help you through the process, Crowbar lead architect Rob Hirschfeld has put together the two videos below. The first, Crowbar Build (on cloud server), shows you how to use a cloud server to create a Crowbar ISO using the standard build process. The second, Advanced Crowbar Build (local), shows how to build a Crowbar v1.2 ISO using advanced techniques on a local desktop using a virtual machine.
In the previous entry I mentioned that we have developed and will be open sourcing “barclamps” (modules that sit on top of Crowbar) for: Cloudera CDH/Enterprise, ZooKeeper, Pig, HBase, Flume and Sqoop. All these modules will speed and ease the deployment, configuration and operation of Hadoop clusters.
If you would like to get involved, check out this 1 min video from Rob Hirschfeld talking about how:
It wouldn’t be surprising if you were surprised to learn that Dell is developing software. To say that this is an area we haven’t been known for in the past would be an understatement. While we may not pose a direct threat to Microsoft any time soon, we have been coding in a few focused areas. One of those areas is cloud installation and management, and it is represented by our project Crowbar. While Crowbar began life simply as a way to install OpenStack on Dell hardware, it has expanded from there.
Today’s news is that we have developed and will be open sourcing “barclamps” (modules that sit on top of Crowbar) for: Cloudera CDH/Enterprise, ZooKeeper, Pig, HBase, Flume and Sqoop. All these modules will speed and ease the deployment, configuration and operation of Hadoop clusters. But don’t take my word for it. Take a listen to Crowbar’s architect Rob Hirschfeld as he explains Crowbar and today’s announcement:
Rob Hirschfeld, aka “Commander Crowbar,” recently posted a blog entry looking back at how Crowbar came to be, how it’s grown and where he hopes it will go from here.
What’s a Crowbar?
If you’re not familiar with Crowbar, it’s an open source software framework that began life as an installation tool to speed installation of OpenStack on Dell hardware. The project incorporates the Opscode Chef Server tool and was originally created here at Dell by Rob and Greg Althaus. Just four short months ago at OSCON 2011 the project took a big step forward when, along with the announcement of our OpenStack solution, we announced that we were open sourcing it.
DevOps-ilicious
As Rob points out in his blog, as we were delivering Crowbar as an installer a collective light bulb went off and we realized the role that Chef and tools like it play in a larger movement taking place in many Web shops today: the movement of DevOps.
The DevOps approach to deployment builds up systems in a layered model rather than using packaged images…Crowbar’s use of a DevOps layered deployment model provides flexibility for BOTH modularized and integrated cloud deployments.
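To make the layered idea a little more concrete, here is a minimal Python sketch. It is my own illustration, not Crowbar or Chef code. It assumes a node can be modeled as a plain dictionary and that each layer is a function that adds its own settings; the layer names (`base_os`, `network`, `hadoop_service`, `openstack_service`) and the `converge` helper are all hypothetical.

```python
from typing import Callable, Dict, List

NodeState = Dict[str, str]
Layer = Callable[[NodeState], NodeState]


def base_os(state: NodeState) -> NodeState:
    """Bottom layer: record which operating system the node runs."""
    return {**state, "os": "ubuntu"}


def network(state: NodeState) -> NodeState:
    """Middle layer: mark the network as configured."""
    return {**state, "network": "configured"}


def hadoop_service(state: NodeState) -> NodeState:
    """Top layer: the workload that sits on the shared lower layers."""
    return {**state, "service": "hadoop"}


def openstack_service(state: NodeState) -> NodeState:
    """An alternative top layer that reuses the same lower layers."""
    return {**state, "service": "openstack"}


def converge(state: NodeState, layers: List[Layer]) -> NodeState:
    """Apply each layer in order; applying the same layers again changes nothing."""
    for layer in layers:
        state = layer(state)
    return state


if __name__ == "__main__":
    print(converge({}, [base_os, network, hadoop_service]))
    # Swapping only the top layer gives a different deployment on the same base.
    print(converge({}, [base_os, network, openstack_service]))
```

The point of the sketch is the contrast with a packaged image: changing one layer means swapping one function in the list rather than rebuilding the whole thing.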
On beyond installation and OpenStack
As the team began working more with Crowbar, it occurred to them that its use could be expanded in two ways: it could be used to do more than installation and it could be expanded to work with projects beyond OpenStack.
As for functionality, Crowbar now not only installs and configures but once the initial deployment is complete, Crowbar can be used to maintain, expand, and architect the instance, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting.
The first project beyond OpenStack that we used Crowbar on was Hadoop. In order to expand Crowbar’s usage we created the concept of “barclamps” which are in essence modules that sit on top of the basic Crowbar functionality. After we created the Hadoop barclamp, others picked up the charge and VMware created a Cloud Foundry barclamp and DreamHost created a Ceph barclamp.
It takes a community
Crowbar development has recently been moved out into the open. As Rob explains,
If you’re planning on attending the OpenStack Design Summit and conference next week in Beantown you’ll have to check us out. I’m bummed that I will be missing the summit for the first time (I have a big internal presentation next week), but the rest of the Dell OpenStack crew will be there in force. Dell is a sponsor at the event and we will have a keynote, speaking sessions and demos.
What have we got in the works?
Besides checking out Crowbar and our OpenStack solution, which we launched back at OSCON, you can visit our whisper suite, where we will be showing our latest and greatest stuff that is currently in the works. If you’d like to see what we have up our sleeve, email us at OpenStack@Dell.com and we can schedule a time slot for you to come and see for yourself.
Updated: For more details on what we’ll be doing at the summit, check out Rob’s blog.
Dell has been working for the last four plus years outfitting the biggest of the big web superstars like Facebook and Microsoft Azure with infrastructure. More recently we have been layering software such as Hadoop, OpenStack and Crowbar on top of that infrastructure. This has not gone unnoticed by web pub GigaOm:
Want to become the next Amazon Web Services or Facebook? Dell could have sold you the hardware all along, but now it has the software to make those servers and storage systems really hum.
They also made the following observation:
Because [Dell] doesn’t have a legacy [software] business to defend, it can blaze a completely new trail that has its trailhead where Oracle, IBM and HP leave off.
Letting customers focus on what matters most
It’s a pretty exciting time to be at Dell as we continue to move up the stack outfitting web players big and small. The idea is to get these players established and growing in an agile and elastic way so they can concentrate on serving customers rather than building out their underpinning software and systems.
In case you’re not familiar with Cloud Foundry, it’s an open source Platform as a Service project initiated at VMware. More specifically, it provides a platform for building, deploying, and running cloud apps using Spring for Java developers; Rails and Sinatra for Ruby developers; Node.js; and other JVM frameworks, including Grails.
The project began two years ago when VMware’s CEO Paul Maritz recruited Derek Collison and Mark Lucovsky out of Google and set them to working on Cloud Foundry. Collison and Lucovsky, who built and maintained Google’s API services, were brought in to leverage their experience of working with hugely scaled out architectures.
The Cloud Foundry project has only been public for a matter of months, and one question that I’m sure has popped into your mind is: what if I want to pilot Cloud Foundry in my own environment? Won’t installation and configuration be a total pain?
Enter the Crowbar
Crowbar is an open source software framework developed at Dell to speed up the installation and configuration of open source cloud software onto bare metal systems. By automating the process, Crowbar can reduce the time needed for installation from days to hours.
The software is modular in design, so while the basic functionality is in Crowbar itself, “barclamps” sit on top of it to allow it to work with a variety of projects. The first use for Crowbar was for OpenStack, and the barclamp for that has been donated to the community. Next came the Dell | Cloudera solution for Apache Hadoop and, just recently, DreamHost announced that they are currently working on a Ceph barclamp. And now…
Two great tastes that taste great together
Today’s big news is that VMware is working with Dell to release and maintain a Crowbar barclamp that, in conjunction with Crowbar, will install and configure Cloud Foundry. This capability, which will include multi-node configs over time, will give organizations and service providers the ability to quickly and easily get pilots of Cloud Foundry up and running.
Once the initial deployment is complete, Crowbar can be used to maintain, expand, and architect the instance, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting.
Data continues to grow at an exponential rate, and nowhere is this more obvious than in the Web space. Not only is the amount exploding but so are the forms data is taking, whether that’s transactional, documents, IT/OT, images, audio, text, video, etc. Additionally, much of this new data is unstructured or semi-structured, which traditional relational databases were not built to deal with.
Enter Hadoop, an Apache open source project which, when combined with MapReduce, allows the analysis of entire data sets, rather than samples, of structured and unstructured data types. Hadoop lets you chomp through mountains of data faster and get to insights that drive business advantage quicker. It can provide near “real-time” data analytics for click-stream data, location data, logs, rich data, marketing analytics, image processing, social media association, text processing, etc. More specifically, Hadoop is particularly suited for applications such as:
Search Quality – search attempts vs. structured data analysis; pattern recognition
Recommendation engine – batch processing; filtering and prediction (i.e., use information to predict what similar users like)
Ad-targeting – batch processing; linear scalability
Threat analysis for spam fighting and detecting click fraud – batch processing of huge datasets; pattern recognition
Data “sandbox” – “dump” all data in Hadoop; batch processing (i.e., analysis, filtering, aggregations, etc.); pattern recognition
The Dell | Cloudera solution
Although Hadoop is a very powerful tool, it can be a bit daunting to implement and use. This fact wasn’t lost on the founders of Cloudera, who set up the company to make Hadoop easier to use by packaging it and offering support. Dell has joined with this Hadoop pioneer to provide the industry’s first complete Hadoop Solution (aptly named “the Dell | Cloudera solution for Apache Hadoop”).
The solution comprises Cloudera’s distribution of Hadoop running on optimized Dell PowerEdge C2100 servers with Dell PowerConnect 6248 switches, delivered with joint service and support. Dell offers two flavors of this big data solution: one with Cloudera’s free-download distribution of Hadoop, and one with Cloudera’s enterprise version of Hadoop, which comes with a charge.
It comes with its own “crowbar” and DIY option
The Dell | Cloudera solution for Apache Hadoop also comes with Crowbar, the recently open-sourced Dell-developed software, which provides the necessary tools and automation to manage the complete lifecycle of Hadoop environments. Crowbar manages the Hadoop deployment from the initial server boot to the configuration of the main Hadoop components allowing users to complete bare metal deployment of multi-node Hadoop environments in a matter of hours, as opposed to days. Once the initial deployment is complete, Crowbar can be used to maintain, expand, and architect a complete data analytics solution, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting.
The solution also comes with a reference architecture and deployment guide, so you can assemble it yourself, or Dell can build and deploy the solution for you, including rack and stack, delivery and implementation.
Dell has been a part of the OpenStack community since day one a little over a year ago and today’s news represents the first available cloud solution based on the OpenStack platform. This Infrastructure-as-a-service solution includes a reference architecture based on Dell PowerEdge C servers, OpenStack open source software, the Dell-developed Crowbar software and services from Dell and Rackspace Cloud Builders.
Crowbar, keeping things short and sweet
Bringing up a cloud can be no mean feat. As a result, a couple of our guys began working on a software framework that could be used to quickly (typically before coffee break!) bring up a multi-node OpenStack cloud on bare metal. That framework became Crowbar. What Crowbar does is manage the OpenStack deployment from the initial server boot to the configuration of the primary OpenStack components, allowing users to complete bare metal deployment of multi-node OpenStack clouds in a matter of hours (or even minutes) instead of days.
Once the initial deployment is complete, Crowbar can be used to maintain, expand, and architect the complete solution, including BIOS configuration, network discovery, status monitoring, performance data gathering, and alerting.
Code to the Community
As mentioned above, today Dell has released Crowbar to the community as open source code (you can get access to it at the project’s GitHub site). The idea is to allow users to build functionality to address their specific system needs. Additionally, we are working with the community to submit Crowbar as a core project in the OpenStack initiative.
Included in the Crowbar code contribution are the barclamp list, UI and remote APIs, automated testing scripts, build scripts, switch discovery, and the open source Chef server. We are currently working with our legal team to determine how to release the BIOS and RAID pieces, which leverage third-party components. In the meantime, since it is free (as in beer) software, although Dell cannot distribute it, users can go directly to the vendors and download the components for free to get that functionality.
More Crowbar detail
For those who want some more detail, here are some bullets I’ve grabbed from Rob “Mr. Crowbar” Hirschfeld’s blog:
Important notes:
Crowbar uses Chef as its database and relies on cookbooks for node deployments.
Crowbar has a modular architecture so individual components can be removed, extended, and added. These components are known individually as “barclamps.”
Each barclamp has its own Chef configuration, UI subcomponent, deployment configuration, and documentation.
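As a rough picture of what that modularity could look like, here is a minimal Python sketch. It is an assumption-laden illustration, not Crowbar’s actual code or data model: the `Barclamp` and `Framework` classes, their field names, and the sample settings are all hypothetical, chosen only to mirror the four parts listed above and the idea that components can be added, extended, and removed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Barclamp:
    """One pluggable component; the field names are illustrative only."""
    name: str
    chef_cookbooks: List[str] = field(default_factory=list)
    ui_component: str = ""
    deployment_config: Dict[str, str] = field(default_factory=dict)
    documentation: str = ""


class Framework:
    """Toy registry showing components being added, extended, and removed."""

    def __init__(self) -> None:
        self._barclamps: Dict[str, Barclamp] = {}

    def add(self, barclamp: Barclamp) -> None:
        self._barclamps[barclamp.name] = barclamp

    def extend(self, name: str, **settings: str) -> None:
        # Merge extra deployment settings into an existing component.
        self._barclamps[name].deployment_config.update(settings)

    def remove(self, name: str) -> None:
        del self._barclamps[name]

    def get(self, name: str) -> Barclamp:
        return self._barclamps[name]

    def names(self) -> List[str]:
        return sorted(self._barclamps)


if __name__ == "__main__":
    framework = Framework()
    framework.add(Barclamp(name="openstack", chef_cookbooks=["nova"]))
    framework.add(Barclamp(name="hadoop", chef_cookbooks=["hadoop"]))
    framework.extend("hadoop", data_nodes="3")  # hypothetical setting
    framework.remove("openstack")
    print(framework.names())
```

The design point is that the framework itself knows nothing about OpenStack or Hadoop; everything project-specific lives inside a component that can be swapped without touching the core.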
On the roadmap:
Hadoop support
Additional operating system support
Barclamp version repository
Network configuration
We’d like suggestions! Please comment on Rob’s blog!
Last week on Day two of Structure the morning sessions ended with an interesting discussion moderated by James Urquhart. The session was entitled “DevOps – Reinventing the Developers Role in the Cloud Age” and featured Luke Kanies – CEO, Puppet Labs and Jesse Robbins – Co-Founder and CEO, Opscode.
After lunch I ran into Jesse and got him to sit down with me and provide some more insight into DevOps as well as explain what Opscode was doing with project Crowbar.
Some of the ground Jesse covers
(0:21) What is DevOps
(1:00) The shift that happens between developers and operations. Writing code and getting it into production faster and how it shifts responsibilities between the two groups.
(2:52) Who are the prime targets for DevOps and how has this changed over time.
How DevOps began in web shops who needed to do things differently than legacy-bound enterprises.
How enterprises faced with greenfield opportunities are now embracing DevOps.
(5:36) The Crowbar installer, which employs Opscode’s Chef and allows the rapid provisioning of an OpenStack cloud.
Today at Citrix Synergy, Citrix announced “Project Olympus,” their upcoming OpenStack distribution. In case you’re not familiar with it, OpenStack is an open source cloud platform based on the code from NASA’s Nebula cloud as well as Rackspace’s storage code. The OpenStack project kicked off last summer and has already gathered support from over 60 commercial hardware and software vendors.
Mt. Olympus and the Cloud
Citrix’s OpenStack Distro
Citrix’s Project Olympus will produce a commercial distribution of the OpenStack infrastructure-as-a-service platform. This “Olympian” distribution will be made up of two main components: a Citrix-certified version of OpenStack and a cloud-optimized version of XenServer. While Citrix will lead with their Xen technology, thanks to OpenStack the distro will support all leading hypervisors.
Project Olympus is targeted at both public cloud providers and enterprise customers looking to build out private clouds. The distribution will be available later this year.
But I want it now — The Citrix/Rackspace/Dell Early Access Program
For those who don’t want to wait until the official distribution is ready, don’t fret: you can get started today through the Early Access Program (EAP). The EAP is designed to help customers kick off pilots and proof-of-concept deployments. The program provides access to a beta version of the Citrix distro plus Dell hardware and deployment software, as well as deployment services, training and ongoing customer support for customer clouds via Rackspace’s Cloud Builders program.
Dell’s above-mentioned deployment software, aka “Crowbar,” was a big hit at the last OpenStack Design Summit. The software, which leverages Opscode’s Chef, allows folks to get an OpenStack cloud up and running in less than four hours (instead of days). In addition to the deployment software and systems, Dell will also be providing reference architectures to support the Project Olympus EAP, so keep your eyes peeled for those.
If you have any questions about what Dell is doing with OpenStack or want to get started, email us at OpenStack@Dell.com.