Power BI real-time streaming is the perfect solution for IoT. Whether processing data through Azure Stream Analytics, the PubNub data stream network, or a custom Internet of Things solution, Power BI offers a fast, dynamic interface for displaying time-sensitive data. This is naturally what comes to mind when discussing Power BI real-time datasets.
However, I have found that Power BI real-time datasets have another great use. In situations where you have on-premise data you want to visualize and auto-refresh in Power BI but do not have a Power BI gateway setup, or you have a simple dataset that needs to be updated more frequently than the standard Power BI service refresh allows, Power BI real-time datasets can be the perfect solution! Of course, as with any technical solution, you must consider the positives and negatives of your approach before making a final decision. Below, I will go over some of the details of the different types of Power BI real-time datasets and demonstrate an example of using PowerShell to push data to a Power BI Push Dataset from a SQL Server data source.
Types of Real-Time Datasets
Power BI offers three types of real-time datasets:
- Push Dataset
- Streaming Dataset
- PubNub Streaming Dataset
Microsoft has a great article describing the three types of real-time datasets. Instead of repeating what is in that article, I will provide a high-level summary of the key points to understand.
With Push Datasets, your data is stored in the Power BI Service up to a maximum of 200k rows. If more than 200k rows are pushed to the dataset, rows are removed using a first-in first-out (FIFO) policy. Since the data is stored in a database within the service, Power BI Reports can be created with data from the Push Dataset. These Reports behave like any other Power BI report, making use of filters, custom visuals, etc.; however, there is one key difference. When visuals from a Push Dataset report are pinned to a Dashboard, the visual(s) will update in real time whenever the data is updated!
Power BI Streaming Datasets are great when your data is only relevant in the moment. Power BI only stores the data in a temporary cache which is used to display your visual(s). Only tiles added directly to a dashboard can be used to visualize this data, as it is specifically designed for low-latency reporting. Dashboard tiles are limited to visualization types: Card, Line Chart, Clustered Bar Chart, Clustered Column Chart, and Gauge.
PubNub Streaming Datasets are very similar to Power BI Streaming Datasets; however, PubNub Streaming Datasets are integrated with the PubNub data stream network. PubNub is a third-party company that offers real-time infrastructure-as-a-service for web, mobile, and IoT applications. If you are already using PubNub for your real-time streaming, connecting with Power BI is as simple as entering a dataset name, subscription key, channel name, and PAM authorization key.
Below is a matrix summarizing the capabilities of the different real-time dataset types.
A few other important things to note about real-time datasets in Power BI is they cannot be joined to other datasets or combined in any way. Also, only three data types are available in the dataset: Text, Number, and DateTime. Finally, be sure to review the Power BI REST API limitations to understand the capabilities. For example, there is a maximum of 75 columns and 10,000 rows per single POST request to the Power BI REST API.
Power BI Real-Time Push Dataset Demo
In my demo, I will do a quick walkthrough of creating a new real-time Push Dataset in Power BI. Then, I will query a local copy of the AdventureWorks2017DW database and push some data to my new dataset using a PowerShell script. You will see how easy it is to get started with Power BI real-time datasets. I’ve been able to use them in quick POCs to produce results in a matter of minutes.
To create a new real-time dataset in Power BI, click + Create from the Workspace you are working in, then click Streaming dataset. (Although the button is labeled Streaming dataset, you will be able to choose later whether you are creating a Streaming Dataset or a Push Dataset.)
Next, you will need to choose the source of your data for the real-time dataset: API, Azure Stream or PUBNUB. In the demo, I will be using the Power BI REST API (API).
After choosing your source, you are then able to define your real-time dataset using a unique Name and the list of Values/Fields in your stream. Also, here is where you will choose whether the dataset is a Push Dataset or Streaming Dataset. When Historic Data Analysis is off, the dataset will be a Streaming Dataset. When Historic Data Analysis is on, your data will be stored in the Power BI service and the dataset will be a Push Dataset.
The simple dataset I created from my AdventureWorks query contains six values/fields. When using a Push Dataset, it is often helpful to include a timestamp column to help with versioning your data. You will notice that as you enter new values/fields, a sample JSON document is created for you in the browser. After creating your real-time dataset, you will receive the Push URL for the dataset as well as sample scripts for pushing to your dataset (Raw, cURL, PowerShell). This Push URL is all you need to populate your dataset. Also, in the screenshot below, you will see the sample PowerShell script that was provided by the Power BI Service.
Below is a screenshot of the PowerShell script I will be using in my demo. The script makes use of the Invoke-Sqlcmd cmdlet, which is supported by the SQL Server SQLCMD utility. The cmdlet connects to a named instance of the SQL Database Engine on my machine and runs the T-SQL script I defined. The results are returned to a DataSet object that I then loop through and post to my Power BI Push Dataset using the Invoke-WebRequest cmdlet in PowerShell and the Push URL that was captured in a previous step. You can definitely get more sophisticated with your PowerShell scripting, if necessary (i.e. if you need to break your data into 10,000 row subsets or join multiple Dataset objects).
Your PowerShell script can be run manually or can be scheduled to run using Windows Task Scheduler or any other scheduling tool you use in your environment. Since the script makes use of the Power BI REST API, you are able to add up to 1,000,000 rows per hour, per dataset. Depending on your source query, this could allow you to update your dataset very rapidly since it does not use the same refresh intervals as the Scheduled Refresh in the Power BI Service. This can be very helpful when using a real-time Push Dataset in combination with a Power BI report containing visuals that update in real time. As previously mentioned, Push Datasets can be used to build full-featured Power BI reports, as shown in the screenshot below.
Power BI real-time datasets are perfect for solutions involving device telemetry, social network feeds, geospatial services, log files and gaming activity. They can also be great for quick proof of concepts or for situations where you need to see your data in real time or near-real time. Contact us today for assistance with your next Power BI project.