Introduction to Azure Data Lake

We talk to a lot of customers about their data strategies, and specifically about their cloud data strategies. One tool that comes up often is Azure Data Lake. In this post I’d like to introduce it and walk through some of the benefits it offers.

Azure Data Lake is Microsoft’s Platform as a Service (PaaS) big data solution running on Azure. It lets you handle large volumes of data, including semi-structured and unstructured data such as CSV, flat or log files, all of which can be processed through the Azure Data Lake service.


Azure Data Lake consists of two different resources within Azure:

  • Azure Data Lake Store – This is where the data resides. It’s an HDFS-compatible file system that you can spin up and run on its own. One benefit is that it’s integrated with Azure Active Directory, so we can secure our data, and the folder hierarchies we set up in the store, using Azure Active Directory identities.
  • Azure Data Lake Analytics – This is the compute piece of the big data solution. It follows a common theme in Azure, the separation of storage and compute, so you can use each independently. This is where we run the jobs that process and transform our data: we create our views here, and we run scripts that pull data into new files and move it around.
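To make the "secure our hierarchies" point a little more concrete, here is a toy model in Python of how folder-level permissions can gate access to a file. It is only a sketch and does not call any Azure API. It assumes the POSIX-style rule that reading a file needs execute (traverse) permission on every parent folder plus read permission on the file itself. The `Node` class, the `can_read` function, the folder layout and the group names `analysts` and `finance` are all made up for illustration.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Node:
    """Permissions on one folder or file: principal -> set of 'r', 'w', 'x'."""
    acl: dict[str, set[str]] = field(default_factory=dict)


def can_read(tree: dict[str, Node], principal: str, file_path: str) -> bool:
    """Return True if principal may read file_path in this toy model."""
    path = PurePosixPath(file_path)
    # Walk the ancestor folders from the root down; each must grant execute.
    for folder in list(path.parents)[::-1]:
        node = tree.get(str(folder))
        if node is None or "x" not in node.acl.get(principal, set()):
            return False
    # The file itself must grant read.
    target = tree.get(str(path))
    return target is not None and "r" in target.acl.get(principal, set())


# Hypothetical store layout; the principals stand in for directory groups.
store = {
    "/": Node({"analysts": {"x"}, "finance": {"x"}}),
    "/sales": Node({"analysts": {"r", "x"}}),
    "/sales/2017.csv": Node({"analysts": {"r"}}),
    "/payroll": Node({"finance": {"r", "x"}}),
    "/payroll/salaries.csv": Node({"finance": {"r"}}),
}

print(can_read(store, "analysts", "/sales/2017.csv"))
print(can_read(store, "analysts", "/payroll/salaries.csv"))
```

In this model the `analysts` group can reach the sales file but is stopped at the `/payroll` folder, which is the kind of hierarchy-level control the Store bullet describes.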

A benefit of running Azure Data Lake Analytics over some of the other big data platforms is that it uses a language called U-SQL, which is proprietary to Microsoft. The language is based on T-SQL (I call it a mash-up of T-SQL and C#): we use many of the functions and much of the syntax we know from C#, but in the context of a T-SQL-style statement.
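If you haven’t seen one of these jobs, the usual shape is: extract rows from a file using a schema, select from them with expressions, and output the result to a new file. The sketch below mimics that three-step shape in plain Python, only because it is easy to run locally. It is an analogy, not U-SQL, and the sample data, column names and function names are all made up.

```python
import csv
import io

# Hypothetical log data standing in for a file in the store.
RAW = (
    "user,region,duration_ms\n"
    "alice,en-gb,1200\n"
    "bob,en-us,300\n"
    "carol,en-gb,4500\n"
)


def extract(text):
    """Extract: apply a schema (column names and types) as the file is read."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "user": row["user"],
            "region": row["region"],
            "duration_ms": int(row["duration_ms"]),
        }


def transform(rows):
    """Select: filter and project, with ordinary method calls in the expressions."""
    for row in rows:
        if row["region"] == "en-gb":
            yield {"user": row["user"].upper(), "seconds": row["duration_ms"] / 1000}


def output(rows):
    """Output: write the resulting rows as a new CSV document."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["user", "seconds"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


result = output(transform(extract(RAW)))
print(result)
```

The part to notice is the `transform` step: a query-style filter and projection where each expression is an ordinary programming-language expression, such as a string method call. That is the mash-up idea in miniature.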

The benefit is that we don’t have to learn some of the languages and tools common to open source data platforms, such as Pig, Hive, Spark or Python. We can take advantage of big data capabilities using skill sets we already have in-house.

Need help with your data strategy? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or [email protected].

Author

  • 3Cloud is a “born in the cloud” Gold-certified Microsoft Azure technology consulting firm and Azure Expert Managed Services Provider that provides cloud strategy, design, implementation, and managed services to clients across multiple industries. Founded by former Microsoft technology leaders, 3Cloud combines a team of highly experienced cloud architects and technologists with a strong network of Microsoft sales and engineering relationships to deliver the ultimate Azure experience for clients. 3Cloud is headquartered in Chicago, Illinois with offices in Dallas, Texas and Pittsburgh, Pennsylvania and supports clients throughout North America and Europe. To learn more, visit www.3cloudsolutions.com.