It’s no secret that Big Data is big news. Every day there are new articles about how organizations are making use of previously untapped streams of data to gain new insights and make sweeping changes in the way they do business.
Technologies like Hadoop are at the center of this wave of change. Big data technologies have already changed the way we think about data management and analytics, and they have the potential to change it further. The downside is that these technologies can be hard to learn and manage. An organization considering a big data program faces the challenge of acquiring new infrastructure and skill sets to support it, which can extend the timeline and cost or stall the project altogether. If the organization doesn’t think it has enough data, or doesn’t have a vision of where the big data project will lead, the project may never become a reality.
Cloud-based implementations have some advantages in this regard. By using a cloud services provider, you can shift your infrastructure investment from capital expenditure (capex) to operating expenditure (opex). You can start small and scale as needed, and you may be able to take advantage of a managed big data solution to reduce the need to acquire new skill sets. For example, Microsoft’s HDInsight service is a full Hadoop implementation offered as a service. However, you still need some Hadoop knowledge to use it, and it does require some management.
If your goal is simply to focus on managing and analyzing lots of data, you want to get going quickly, and you don’t want to build a Hadoop skill set (at least not right away), Azure Data Lake might be for you! Azure Data Lake is a new set of services recently made available in public preview. They are designed for massive scale and high-performance parallel processing, and they are offered purely as services. This means you pay only for what you use, and the services scale dynamically to meet your requirements. For more details about Azure Data Lake, see “6 Key Features from Microsoft’s Azure Data Lake”.
In this short video, I’ll walk through setting up an Azure Data Lake account, loading some sample data, performing some data manipulation to analyze multiple files in different formats, and finally visualizing the results with Microsoft Power BI.
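To give a rough sense of what “analyzing multiple files in different formats” involves, here is a minimal local sketch in Python. It is not the approach used in the video and it does not call any Azure Data Lake API; it only illustrates the general idea of reading records from two formats into one common shape and aggregating them. The sample data, the field names (`region`, `amount`), and the helper functions are all hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample data standing in for two files in different formats.
CSV_TEXT = "region,amount\nEast,100\nWest,250\n"
JSON_LINES_TEXT = (
    '{"region": "East", "amount": 50}\n'
    '{"region": "North", "amount": 75}\n'
)


def read_csv_rows(text):
    """Yield (region, amount) pairs from CSV text with a header row."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["region"], float(row["amount"])


def read_json_lines_rows(text):
    """Yield (region, amount) pairs from newline-delimited JSON text."""
    for line in text.splitlines():
        if line.strip():
            record = json.loads(line)
            yield record["region"], float(record["amount"])


def total_by_region(*row_sources):
    """Sum amounts per region across any number of row sources."""
    totals = defaultdict(float)
    for source in row_sources:
        for region, amount in source:
            totals[region] += amount
    return dict(totals)


# Combine both formats into a single aggregate.
totals = total_by_region(
    read_csv_rows(CSV_TEXT),
    read_json_lines_rows(JSON_LINES_TEXT),
)
print(totals)
```

Each reader converts its format into the same (region, amount) pairs, so the aggregation step does not need to know where a record came from. A managed service applies the same idea at far larger scale, without you having to provision the machines that do the work.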
If you are considering a big data project, we can help! Please contact us today for a more in-depth discussion about Data Lakes and what they mean for your data and analytics strategy.