Monday, May 4, 2015

Microsoft Intros Azure Data Lake

Microsoft introduced Azure Data Lake, an enterprise-wide repository that collects every type of data in a single place before any formal definition of requirements or schema.

Highlights:

  • HDFS for the Cloud: Azure Data Lake is a Hadoop file system compatible with HDFS, so Microsoft offerings such as Azure HDInsight and Revolution-R Enterprise, as well as industry Hadoop distributions like Hortonworks and Cloudera, can all connect to it.
  • Petabyte files, massive throughput: The goal of the data lake is to run Hadoop and advanced analytics on all data. Azure Data Lake has no fixed limit on how much data can be stored in a single account, and it can store very large files with no fixed limit on size. The design can handle high volumes of small writes at low latency, making it well suited to near real-time scenarios such as website analytics, Internet of Things (IoT), analytics from sensors, and others.
  • Enterprise ready: Azure Data Lake leverages Azure Active Directory and provides data replication to ensure high durability and availability.

http://azure.microsoft.com/blog/2015/04/29/introducing-azure-data-lake/