If you’re a data scientist or an AI aficionado, you already know the struggles of using your laptop or workstation to train complex models. From model parameter tuning and cross-validation to handling large amounts of data, completing your machine learning work can be a slow-moving process without the proper equipment. This is where Azure can help. By using various services on the cloud platform, your data and AI team can offload the work to a highly scalable cloud environment to complete the job faster than ever before.
Scalability is Key
Gone are the days when businesses had to purchase millions of dollars' worth of equipment in order to work with and innovate on their data. The cloud has transformed the way we access amazing computing power by eliminating the need to be in the same building as the machines.
Sure, most of us can get by working on small amounts of data on our nice laptops. But what happens when you want to ramp up your analyses? Azure allows us to scale out to the cloud and take advantage of the immense computing power available there.
As we all know, popular machine learning/AI languages such as R and Python typically hold your data in memory, which limits the amount you can handle at any given time. Plus, most CPUs on business-grade laptops top out at around 8 cores. That's good for some amount of parallel processing, but we can do better!
Even though machine learning, AI, and advanced analytics aren’t new concepts for most organizations, many companies simply don’t have the on-premises setup to handle AI-based problems. They are traditionally set up to do data processing and data storage, but not anything more advanced.
Azure allows users to deploy their code in the cloud in a managed environment that is pre-configured for their AI workload. Simply specify how much compute power you need and let Azure do the heavy lifting.
| Reasoning | Understanding | Interacting |
| --- | --- | --- |
| Find new, unexpected ways to adapt and innovate based on your unique data | Interpret business and customer data in real time and at scale, including text, documents, images, video, and voice | Remove technology barriers for your customers and expand your employees' capabilities |
GPU Goodness
The surge of cryptocurrency mining has lately caused a boom in graphics card prices, and GPUs are often out of stock for the same reason. Training certain types of machine learning models (namely, neural networks) is dramatically faster on a GPU than on a CPU, and the rise in popularity of deep learning on NVIDIA GPUs is largely due to the CUDA parallel computing platform.
In Azure, compute resources are available with pre-configured NVIDIA graphics cards. These resources let you train your complex models in the cloud while paying only for the time you use them. No more investing in expensive graphics cards that will be out of date in a few years.
In addition to Azure having GPUs at the ready, you can spin up specialized virtual machines (VMs) known as Deep Learning Virtual Machines (DLVMs). These VMs come ready with all sorts of AI-related tools pre-installed. From all the popular deep-learning frameworks, to RStudio and Anaconda, to SQL Server and Power BI, the DLVMs are great for any data scientist or AI pro! DLVMs run on Azure GPU NC-series VM instances. These GPUs use discrete device assignment, resulting in performance close to bare metal, which is perfect for solving deep-learning problems.
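Once a DLVM is up, it only takes a few lines to confirm that the pre-installed frameworks can actually see the GPU. Here's a minimal sketch assuming PyTorch is among the bundled toolkits; any of the other deep-learning frameworks offers a similar check.

```python
# Quick sanity check that the DLVM's NVIDIA GPU is visible to a deep-learning
# framework. Assumes PyTorch is one of the pre-installed toolkits.
import torch

if torch.cuda.is_available():
    # Report the CUDA device the VM exposes (e.g., the GPU on an NC-series node).
    print("GPU found:", torch.cuda.get_device_name(0))
else:
    print("No GPU visible; check the VM size and NVIDIA drivers.")
```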
Azure Batch AI
Great hardware is only half of the experience you get with Azure. With Azure Batch AI, you can let Azure handle the provisioning and management of clusters of VMs while you focus on running your AI experiments. Azure Batch AI is designed to let you run large AI training and testing workloads in the cloud without having to manage the underlying infrastructure. It supports popular training toolkits such as TensorFlow, the Microsoft Cognitive Toolkit (formerly known as CNTK), and Chainer, and you can also deploy and scale your own software stacks.
To begin, you simply describe the requirements of your job, where the inputs and outputs are located, and then Batch AI will handle the rest.
From the Azure Portal, search for "Batch AI Service" and click Create. Then you can specify the number and type of nodes that you want. Plus, you can configure the cluster to scale automatically based on the workload it's given.
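If you prefer code over the portal, the same cluster can be provisioned from Python. The sketch below uses the azure-mgmt-batchai SDK; class names and call signatures vary between SDK versions, and every credential, name, and size shown here is a placeholder, so treat it as an outline rather than a copy-paste recipe.

```python
# Sketch: provision an auto-scaling Batch AI cluster of NC6 GPU nodes.
# All credentials, names, and IDs below are placeholders.
from azure.common.credentials import ServicePrincipalCredentials
import azure.mgmt.batchai as batchai
import azure.mgmt.batchai.models as models

credentials = ServicePrincipalCredentials(
    client_id="<app-id>", secret="<app-secret>", tenant="<tenant-id>")
client = batchai.BatchAIManagementClient(
    credentials=credentials, subscription_id="<subscription-id>")

cluster_params = models.ClusterCreateParameters(
    location="eastus",           # some SDK versions take location from a workspace instead
    vm_size="STANDARD_NC6",      # one NVIDIA GPU per node
    user_account_settings=models.UserAccountSettings(
        admin_user_name="demo", admin_user_password="<password>"),
    # Let Batch AI grow and shrink the cluster with the workload.
    scale_settings=models.ScaleSettings(
        auto_scale=models.AutoScaleSettings(
            minimum_node_count=0, maximum_node_count=4)))

client.clusters.create("my-resource-group", "nc6-cluster", cluster_params).result()
```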
Once the cluster is up and running, you can submit jobs through the command-line interface or through your own Python script. Batch AI makes it easy to work in parallel across multiple CPUs or GPUs and can scale up to a large cluster of connected VMs.
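Submitting a job from your own Python script might look like the sketch below, which continues the example above. Again, the parameter names follow the azure-mgmt-batchai models of the time and may differ by version, and the command line and output path are placeholders.

```python
# Sketch: submit a single-node training job to the cluster created above.
# The command line and output path are placeholders.
cluster = client.clusters.get("my-resource-group", "nc6-cluster")

job_params = models.JobCreateParameters(
    cluster=models.ResourceId(id=cluster.id),
    node_count=1,
    # Where Batch AI should write the job's stdout/stderr.
    std_out_err_path_prefix="$AZ_BATCHAI_MOUNT_ROOT/afs",
    # Run an arbitrary command; framework-specific settings
    # (e.g., TensorFlow or CNTK settings) are also available.
    custom_toolkit_settings=models.CustomToolkitSettings(
        command_line="python train.py"))

client.jobs.create("my-resource-group", "train-job", job_params).result()
```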
Azure Databricks
You may have heard of Microsoft's recent partnership with Databricks, which brings its cloud-based Apache Spark platform for big data processing to Azure as Azure Databricks. This collaborative analytics platform allows for scalable analysis of your data in the cloud.
To create your Azure Databricks service, search for "Databricks" in the Azure Portal. Click Create, then fill out the information on the next blade to give your Databricks service a name and assign it to a subscription and resource group.
Once your Databricks service is successfully deployed, creating a cluster is easy. Simply click Clusters on the side menu and then Create Cluster. Then, specify the number and type of workers you need. Notice that some of the available virtual machines are quite large (up to 256 GB of memory and 64 cores!). After you create your initial cluster, you can scale it up or down depending on the workload it needs to handle.
Azure Databricks has its advantages: a highly scalable, massively parallel Spark back end, and the flexibility to run large machine learning workloads in Python (PySpark), R (SparkR), or Scala. Not to mention, the Azure Databricks notebook interface is highly conducive to collaboration for your data science/AI team.
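For a flavor of what that looks like in practice, here is a minimal PySpark sketch you might run in a Databricks notebook, where the `spark` session is already provided. The mounted file path and column names are made up for illustration.

```python
# Minimal distributed ML sketch for a Databricks notebook (`spark` is predefined).
# The mounted path and column names below are placeholders.
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

# Load a (hypothetical) CSV file from mounted storage across the cluster.
df = spark.read.csv("/mnt/data/transactions.csv", header=True, inferSchema=True)

# MLlib expects the features gathered into a single vector column.
assembler = VectorAssembler(inputCols=["amount", "age"], outputCol="features")
train = assembler.transform(df).select("features", "label")

# Fit a logistic regression; Spark distributes the work over the workers you chose.
model = LogisticRegression(labelCol="label", featuresCol="features").fit(train)
print(model.coefficients)
```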
But Wait, There’s More…
The services I’ve outlined here are only a small part of the AI platform on Azure. Azure offers a comprehensive set of flexible AI services for any scenario and enterprise-grade AI infrastructure that runs AI workloads anywhere at scale. Plus, you can equip your team with modern AI tools designed to help you create AI solutions easily, with maximum productivity.
Training complex AI models on large amounts of data is often a computationally intense task that requires more advanced computing resources to complete. Traditionally, on-premises solutions for handling data don't scale well for AI workloads. So, why not let Azure do the heavy lifting?
To learn more about how your organization can take advantage of Azure Databricks for your AI workloads, contact us!