By Charles King, Pund-IT, Inc. July 30, 2014
Pivotal and Hortonworks announced that they will collaborate on the Apache Ambari project to help strengthen Hadoop as an enterprise offering and to further advance the Apache Hadoop ecosystem.
For its part, Pivotal said it would expand its open source investment by dedicating company engineers to contribute installation, configuration and management capabilities to Apache Ambari. It will collaborate closely with Hortonworks, the Apache Software Foundation (ASF) and the broader Apache Hadoop community in this effort, and work with its own customers to ensure that they benefit from these enhancements.
Hortonworks noted that the collaboration will further innovation in Hadoop operations via Apache Ambari, which it believes is critical to enabling Hadoop as an enterprise offering. More broadly, it sees the effort with Pivotal as critical to helping its customers and partners fully capture Hadoop’s capabilities and benefits.
Pivotal and Hortonworks aim to make Hadoop enterprise-ready by enhancing Apache Ambari.
There is no shortage of partnerships in the technology industry, and the concept of “collaboration” has become such a messaging mainstay that it often verges on cliché. So does Pivotal and Hortonworks’ effort qualify as anything special?
In fact, it does.
While Hadoop is a core technology in many or most modern big data analytics solutions and strategies, it has struggled to deliver the practical features enterprises require for dependable deployment and performance. Despite Hadoop’s numerous attractions – and there are many – the absence or delay of those features has created stumbling blocks to wider adoption.
This challenge is not unique to Hadoop. In fact, the IT industry witnessed similarly painful paths to maturity in Linux and some other open source efforts. While open source collaboration can encourage marvelous innovations, it often struggles in areas that benefit from more linear development strategies.
In his commentary on the collaboration with Pivotal, Hortonworks’ Shaun Connolly described what he called a broader blueprint that specifies how Apache projects can span “five distinct pillars to form a complete enterprise data platform: data access, data management, security, operations and governance.” Per the partners’ statements, this new effort aims to leverage both companies’ considerable skills and open source experience to improve Apache Ambari’s operations features and make it the standard management tool for Hadoop clusters.
So what exactly is Apache Ambari? Basically, a framework for provisioning, managing and monitoring Hadoop clusters. Using Ambari, administrators can:
- Easily provision Hadoop clusters of virtually any size
- Simplify Hadoop cluster management tasks, including controlling service and component lifecycles, modifying configurations and managing growth
- Efficiently monitor Hadoop clusters by preconfiguring alerts and visualizing operational data, and
- Effectively integrate Hadoop (via a RESTful API) with existing data center tools, like Microsoft System Center and Teradata Viewpoint, and operational processes
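To make that last point concrete, here is a minimal sketch of how an external tool might use Ambari’s RESTful API to check the state of a Hadoop service. The host name, cluster name and port below are illustrative placeholders, not values from the announcement:

```python
import urllib.request

def service_state_request(host, cluster, service):
    """Build an (unsent) Ambari REST request asking for a service's state.

    Ambari exposes clusters under /api/v1/clusters/<cluster>; the
    X-Requested-By header is required by Ambari's CSRF protection.
    The host and cluster names here are hypothetical examples.
    """
    url = (f"http://{host}:8080/api/v1/clusters/{cluster}"
           f"/services/{service}?fields=ServiceInfo/state")
    # HTTP Basic auth credentials would be attached before sending
    # (e.g. via urllib's HTTPBasicAuthHandler); omitted here so the
    # sketch stays self-contained and does not invent credentials.
    return urllib.request.Request(url, headers={"X-Requested-By": "ambari"})

# Example: ask a hypothetical Ambari server about the HDFS service.
req = service_state_request("ambari.example.com", "mycluster", "HDFS")
print(req.full_url)
```

The same endpoint family underpins integrations like the Microsoft System Center and Teradata Viewpoint hooks mentioned above: management consoles simply issue authenticated REST calls against the Ambari server rather than speaking to Hadoop daemons directly.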
The “wood behind the arrowhead” in this collaboration will be the Pivotal engineers who fold parts of the company’s installation and configuration manager (ICM) technologies into Ambari to expand its core capabilities.
Some may consider these features simplistic or even mundane, but without them, it seems unlikely that Hadoop could evolve far enough or quickly enough to fully achieve what its proponents believe is possible. In many or even most cases, sublime achievement requires a mundane foundation. Without the foundational operations capabilities that enterprises require, and that Pivotal and Hortonworks plan to deliver together, Hadoop’s big data future could shrink considerably.
© 2014 Pund-IT, Inc. All rights reserved.