In this post, I’d like to discuss some of the common Azure SQL Data Warehouse performance issues I come across personally or with clients. Sharing them is a way for all of us to build the knowledge we need when we run into problems with Azure SQL Data Warehouse.
1. Statistics – This is one of the most underutilized or overlooked items that I see. Whether you’re an admin, a SQL DBA, or anyone else who works with SQL databases in the Microsoft realm, you’re used to having statistics created and updated for you automatically through auto create stats and auto update stats.
With Azure SQL Data Warehouse, these don’t exist, so we need to create statistics ourselves to avoid the performance issues that missing statistics cause. When working with clients, if we don’t know which columns they use in GROUP BY, ORDER BY, or aggregates, we’ll put together a SQL script that parses through all the table objects in the data warehouse and creates a statistic on each individual column.
Oftentimes we’ll see a 30-40% increase in performance, because the SQL Data Warehouse can produce a better explain plan for the queries we’re executing.
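The per-column statistics script described above could be generated along these lines. This is only a minimal sketch, not the exact script used with clients: the function names, the `stat_<schema>_<table>_<column>` naming scheme, and the sample tables are assumptions for illustration. In practice the (schema, table, column) tuples would come from the data warehouse catalog rather than a hard-coded list.

```python
from typing import Iterable, List, Tuple


def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_create_statistics(columns: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Return one single-column CREATE STATISTICS statement per (schema, table, column)."""
    statements = []
    for schema, table, column in columns:
        # Hypothetical naming scheme: stat_<schema>_<table>_<column>
        stat_name = quote_ident(f"stat_{schema}_{table}_{column}")
        target = f"{quote_ident(schema)}.{quote_ident(table)}"
        statements.append(
            f"CREATE STATISTICS {stat_name} ON {target} ({quote_ident(column)});"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical tables and columns, for illustration only.
    sample = [
        ("dbo", "FactSales", "CustomerKey"),
        ("dbo", "DimCustomer", "CustomerKey"),
    ]
    for statement in build_create_statistics(sample):
        print(statement)
```

The generated statements can then be reviewed and run against the data warehouse. Creating a statistic on every column is a blunt starting point; once the real join, GROUP BY, and ORDER BY columns are known, the list can be narrowed.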
2. Distribution compatibility – Make sure the tables we’re joining are distributed in a compatible way. If they are not, there’s a good chance that you’re going to get data movement within your data warehouse, which slows the query down.
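One way to reason about compatibility before running a join is sketched below. It assumes a simplified rule: an equality join avoids data movement only when both tables are hash-distributed on the join column and those columns share a data type. The `TableDistribution` type and the function name are hypothetical helpers, and the sketch does not model other table types or join shapes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableDistribution:
    """How a table is distributed (hypothetical helper type)."""
    name: str
    kind: str                          # "HASH" or "ROUND_ROBIN"
    hash_column: Optional[str] = None  # only set when kind == "HASH"
    hash_type: Optional[str] = None    # data type of the hash column


def join_is_distribution_compatible(
    left: TableDistribution,
    left_col: str,
    right: TableDistribution,
    right_col: str,
) -> bool:
    """True when the join columns are the hash columns and their types match."""
    if left.kind != "HASH" or right.kind != "HASH":
        # A round robin table has no distribution key to align on.
        return False
    return (
        left.hash_column == left_col
        and right.hash_column == right_col
        and left.hash_type == right.hash_type
    )
```

For example, a fact table hashed on an `int` CustomerKey joined to a dimension hashed on an `int` CustomerKey passes the check, while the same join against a round robin dimension, or against a `bigint` key, does not.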
3. Overuse of round robin tables – As much as I love these, and as easy as they are to maintain, overusing them can cause performance issues in our SQL DW. A good rule of thumb is that a table with fewer than 60 million rows can be a round robin table, but only if it’s a dimension table. If it’s a fact table, we want to distribute it by some sort of hash key.
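The rule of thumb above can be written down as a small helper. This is a rough sketch: the function name is hypothetical, and sending dimension tables at or above 60 million rows to a hash distribution is my reading of the rule, not something the rule states outright.

```python
from typing import Optional

ROUND_ROBIN_ROW_LIMIT = 60_000_000  # threshold from the rule of thumb


def suggest_distribution(
    row_count: int, is_fact: bool, hash_key: Optional[str] = None
) -> str:
    """Return a DISTRIBUTION option for a CREATE TABLE ... WITH (...) clause."""
    if not is_fact and row_count < ROUND_ROBIN_ROW_LIMIT:
        # Small dimension table: round robin is acceptable.
        return "DISTRIBUTION = ROUND_ROBIN"
    if hash_key is None:
        raise ValueError("a hash key is needed for fact tables and large dimensions")
    # Fact tables (and, by assumption, large dimensions) are hash-distributed.
    return f"DISTRIBUTION = HASH([{hash_key}])"
```

Calling `suggest_distribution(5_000_000, is_fact=False)` returns the round robin option, while any fact table needs a `hash_key` argument and gets a hash distribution regardless of its size.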
4. Overuse of the small resource class – When you’re loading or querying your data, make sure you’re using a medium or large resource class, or whichever one is appropriate for the query you’re executing, rather than running everything in the small one.
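Resource classes are assigned by adding a user to the matching database role. The sketch below builds that statement for the larger dynamic resource classes; the function name and the `LoadUser` account in the usage note are hypothetical, and static resource classes are left out for brevity.

```python
# Dynamic resource class roles larger than the default small one.
LARGER_RESOURCE_CLASSES = ("mediumrc", "largerc", "xlargerc")


def assign_resource_class(user: str, resource_class: str) -> str:
    """Return the T-SQL that adds a user to a larger resource class role."""
    if resource_class not in LARGER_RESOURCE_CLASSES:
        raise ValueError(f"unsupported resource class: {resource_class!r}")
    # Escape single quotes so the user name is a valid T-SQL string literal.
    safe_user = user.replace("'", "''")
    return f"EXEC sp_addrolemember '{resource_class}', '{safe_user}';"
```

For a dedicated load account, `assign_resource_class("LoadUser", "largerc")` produces the statement to run once, after which that user's queries get the larger memory grant. A larger class also reduces how many queries can run concurrently, so it is worth reserving for the loads and queries that need it.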
5. Poor quality clustered columnstore indexes – Not maintaining our indexes in the data warehouse can cause really poor performance. As with any other database we support, we need to ensure that our clustered columnstore indexes are healthy.
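One simple health signal is how full each row group is compared with the 1,048,576-row maximum. The sketch below assumes the per-row-group row counts have already been collected from the system views, and it treats an average fill below 10% as unhealthy; that threshold and the function names are assumptions for illustration, not a fixed rule.

```python
from typing import Iterable, Optional

MAX_ROWS_PER_ROWGROUP = 1_048_576  # upper bound on rows in one columnstore row group


def average_rowgroup_fill(rowgroup_row_counts: Iterable[int]) -> Optional[float]:
    """Average row group fill as a fraction of the maximum, or None if empty."""
    counts = list(rowgroup_row_counts)
    if not counts:
        return None
    return sum(counts) / (len(counts) * MAX_ROWS_PER_ROWGROUP)


def rebuild_statement_if_unhealthy(
    schema: str,
    table: str,
    rowgroup_row_counts: Iterable[int],
    min_fill: float = 0.10,  # assumed threshold, tune for your workload
) -> Optional[str]:
    """Return an index rebuild statement when row groups are sparsely filled."""
    fill = average_rowgroup_fill(rowgroup_row_counts)
    if fill is None or fill >= min_fill:
        return None  # nothing to measure, or healthy enough
    return f"ALTER INDEX ALL ON [{schema}].[{table}] REBUILD;"
```

A table whose row groups hold only tens of thousands of rows each would get a rebuild statement back, while one with full row groups returns `None`. Running the rebuild under a larger resource class generally gives it more memory to build well-filled row groups.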
Hopefully these tips for solving some of the common performance issues I deal with will help you if you come across them.