Rowland Gosling

Cloud Data Architect at 3Cloud. Cloud data and AI geek in learn-it-all mode.

Serverless Pools in Synapse Analytics Workspace

Hopefully, you’ve already heard about Azure Synapse. One of the newest Azure offerings, Synapse is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. I’d like to tell you about the “launchpad” for all things Synapse – Synapse workspaces.

With Synapse workspaces you can do many things including development, ETL, ELT, DevOps, Azure ML, and Power BI. When you create a new workspace, you automatically get a serverless pool.

  • When you go into your workspace, you’ll see Synapse Studio, which is divided into activity hubs. These hubs organize the tasks needed for building analytics solutions. There are currently 6 hubs:
    • Overview and data – where we can explore all our structured/unstructured data
    • Develop – the development hub where you can use notebooks, SQL scripts, etc.
    • Orchestrate, monitor, and manage – these resemble Azure Data Factory: they have the same look and feel and do the same jobs.
  • Pools are comparable to databases in Synapse Analytics. There are 3 kinds of pools: serverless, dedicated, and Spark.
  • Each workspace has a serverless pool by default.
    • In my serverless pool cheat sheet (shown in the included video), you’ll see that Spark, Cosmos DB, and Azure Data Lake Storage are the 3 data sources you can use.
    • The input types are Parquet, CSV, and JSON. Parquet is the better choice because it’s compressed, so the engine can read a large piece of the data into memory at once, in compressed form. In other words, it doesn’t have to go back to the well as often.
    • Also, think about partitioning your data in some logical way when you begin working with this. That way, if the data a query needs isn’t in a certain partition, the engine won’t go looking for it there.
    • Also, think about different landing zones for different data (refer to the flow chart on my cheat sheet).
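
To make the partitioning advice above a bit more concrete, here is a minimal Python sketch of partition pruning. It assumes a hypothetical folder layout (`year=.../month=...`) and made-up file names; the helpers `partition_values` and `prune` are illustrative only and do not represent how the serverless pool is implemented.

```python
from typing import Dict, List

# Hypothetical file listing for a data lake folder partitioned by year and month.
FILES: List[str] = [
    "sales/year=2019/month=12/part-0.parquet",
    "sales/year=2020/month=01/part-0.parquet",
    "sales/year=2020/month=01/part-1.parquet",
    "sales/year=2020/month=02/part-0.parquet",
]


def partition_values(path: str) -> Dict[str, str]:
    """Extract key=value folder segments from a path, e.g. {'year': '2020'}."""
    values: Dict[str, str] = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            values[key] = value
    return values


def prune(files: List[str], **wanted: str) -> List[str]:
    """Keep only files whose partition folders match every requested value."""
    return [
        f for f in files
        if all(partition_values(f).get(k) == v for k, v in wanted.items())
    ]


if __name__ == "__main__":
    # Only files under year=2020/month=01 are selected; the others are never opened.
    for path in prune(FILES, year="2020", month="01"):
        print(path)
```

The point of the sketch is that the folder names alone decide which files get read. A query engine that understands the layout can skip whole folders the same way, which is why a logical partitioning scheme pays off.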

In my video, I demo how to use a Synapse Analytics workspace, so be sure to check that out. I’ll walk you through how a workspace works, as well as how serverless works.

My advice is to go into the Azure portal and give an Azure Synapse Analytics workspace a try. Play around with it and see what you can do.

Quickstart in Microsoft docs: https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-create-workspace

 


If you have questions about Azure Synapse Analytics, either how to use it or how to implement it in your organization, reach out to us. Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].


Azure Synapse Analytics Now in GA and the Public Preview of Azure Purview

I’m here with some exciting news from Microsoft! Last week at a digital conference, Satya Nadella announced the general availability of Azure Synapse Analytics and the preview of Azure Purview, a unified data governance service. Azure Synapse Analytics has been gaining traction while in preview, and adding Azure Purview gives businesses the ability to get the most out of their data and analytics.


Let’s talk about Azure Purview. This is a comprehensive data governance service that helps organizations discover all data across the organization. Demos at the digital conference showcased different ways you can use Purview for governance. One key capability is multi-cloud support: it works not only with Azure but with other clouds as well. You can also connect it to your on-prem environment and your Azure data assets.

For quite some time, those of us in the data disciplines have worked to inventory all the different aspects of data, like column, database and table names, etc., and put all those pieces into a common repository, often referred to as a data dictionary. Microsoft has been working for years to create a product that would be comprehensive enough to help most people with their governance and compliance needs. We’ve now got this with Azure Purview.

Some key highlights pointed out are:

  • A business glossary – no need to manually build a data dictionary.
  • Automated data classification – lets you identify what kind of data a field holds (a Social Security number, for instance). You also have custom options and can schedule future scanning and classification on a routine basis. This way you’re getting continual updates, as opposed to a data dictionary, where you get a snapshot in time unless you manually update it.
  • Cloud-based search facility – gives you the ability to find things quickly and easily across a broad series of data assets.
  • Data lineage and reporting – supports the end-to-end data lifecycle.
  • Power BI facilities
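
As a rough illustration of what pattern-based classification means in the list above, here is a small Python sketch. It is not Purview's implementation or API. The regular expression (US Social Security numbers written as `123-45-6789`), the `classify_column` helper, the label text, and the 80% threshold are all assumptions made for the example.

```python
import re
from typing import Iterable, Optional

# Assumed pattern: US Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def classify_column(samples: Iterable[str], threshold: float = 0.8) -> Optional[str]:
    """Return a label if at least `threshold` of the non-empty samples match."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return None  # nothing to classify
    matches = sum(1 for v in values if SSN_PATTERN.match(v))
    if matches / len(values) >= threshold:
        return "US Social Security Number"
    return None


if __name__ == "__main__":
    print(classify_column(["123-45-6789", "987-65-4321", ""]))
    print(classify_column(["hello", "123-45-6789"]))
```

Running a check like this on a schedule, rather than once, is what separates continual classification from a point-in-time data dictionary.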

I feel Azure Purview is a very strong offering. Without it, I would have to either create my own versions of these pieces or use something like Embarcadero, which I used years ago. Another thing to note is that the experience is very similar to the canvas workspace experience in Azure Synapse Analytics, so if you’ve been working with that, it will feel very familiar.

The next part of Microsoft’s announcement is that Azure Synapse Analytics is now generally available. Azure Synapse Analytics is a limitless analytics service which brings together traditional data warehouse and big data analytics in one offering. It brings these together for a unified experience to ingest, prepare, manage, and serve data for immediate machine learning and BI applications. I, and many of our customers, have been using this great product a lot, so this going GA is surely exciting news.

Some noteworthy things with Azure Synapse Analytics are:

  • A new native cloud distributed SQL engine
  • Deep integration with Spark
  • Flexible query options such as serverless and dedicated
  • Integration with Power BI and machine learning
  • TPC-H benchmark at petabyte scale
  • Native Row Level Security (at the time of the announcement, this was not possible with Amazon Redshift or Google BigQuery)
  • Native ML integration for the citizen data scientist
  • Code management – by that they’re talking about Azure DevOps as another piece that plays well with it.
  • Power BI integration with Teams, which I found to be kind of cool

Again, great announcements with both the general availability of Azure Synapse Analytics and the public preview of Azure Purview. Combined, these two products empower teams to remove data silos and leverage all their data for analytics and data governance.

Need further help with these or any Azure product or service? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].
