25 Mar 2013
Software giant finally releases public preview of HDInsight, its cloud-based big data service based on the Hortonworks distribution of Apache Hadoop.
Microsoft has unveiled its first public preview of HDInsight, a new service that combines the open-source Hadoop framework for big data processing with Microsoft’s own Windows Azure cloud platform.
This will enable customers to get started doing Hadoop-based big data analytics in the cloud, paying only for the storage and compute they use through their regular Windows Azure subscription, say company executives.
That could play well with companies that are keen to explore big data approaches but terrified of the infrastructure costs and scalability demands involved. The Apache Hadoop framework, after all, is designed to run across massive clusters of commodity servers, but relatively few firms have the time, money or skills to implement and manage that kind of hardware environment on their own premises.
According to Eron Kelly, Microsoft’s general manager for SQL Server product management, setting up Hadoop clusters on Windows Azure takes mere minutes with HDInsight, instead of hours or days. Plus, during the preview period, the company is offering a 50 percent discount on Windows Azure HDInsight Service.
With HDInsight, a Hadoop-based cluster needs one ‘head node’ (sometimes called a master node: a server that tracks the progress of analytic jobs and tasks) and one or more ‘compute nodes’ (the servers where those jobs and tasks actually run).
During the preview period, head nodes are available as Azure Extra Large instances at $0.48 per hour, per cluster. Compute nodes are available as Large instances at $0.24 per hour, per instance. Microsoft has published further pricing details.
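To put those rates in context, here is a rough back-of-envelope sketch of what a preview cluster might cost in compute charges. The function name and the example cluster size are illustrative assumptions, not anything Microsoft has published. The sketch also leaves out storage, which is billed separately, and does not try to model the 50 percent preview discount, since it is not stated whether the quoted rates already include it.

```python
# Preview rates quoted in the article (USD per hour).
HEAD_NODE_RATE = 0.48     # one Extra Large head node per cluster
COMPUTE_NODE_RATE = 0.24  # each Large compute node instance


def cluster_compute_cost(compute_nodes: int, hours: float) -> float:
    """Estimate compute charges for one cluster (hypothetical helper).

    Covers only the head node and compute nodes; storage and any
    discount are deliberately not modelled.
    """
    if compute_nodes < 1:
        raise ValueError("a cluster needs at least one compute node")
    if hours < 0:
        raise ValueError("hours must not be negative")
    hourly_rate = HEAD_NODE_RATE + COMPUTE_NODE_RATE * compute_nodes
    return round(hourly_rate * hours, 2)


if __name__ == "__main__":
    # Example: a four-compute-node cluster kept running for ten hours.
    print(cluster_compute_cost(compute_nodes=4, hours=10))
```

On these figures, each extra compute node adds $0.24 to the hourly bill, while the head node is a fixed $0.48 per hour for as long as the cluster is running.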
Meanwhile, Microsoft is also attempting to bust a second myth about big data: that users need considerable skills and experience to make it work for them. HDInsight clusters “integrate with simple, web-based tools and APIs to ensure customers can easily deploy, monitor and shut down their cloud-based cluster,” Kelly writes. “In addition, Windows Azure HDInsight Service integrates with our business intelligence tools including Excel, PowerPivot and Power View, allowing customers to easily analyse and interpret their data to garner valuable insights for their organisation.”
HDInsight is based on the Hortonworks distribution of Apache Hadoop. It’s a big coup for that company, which is battling rival Cloudera and, to a lesser extent, MapR, to become the enterprise Hadoop distribution of choice.