Set up incremental data refresh in Power BI to download not all data, but only new data for a selected period.
TABLE OF CONTENTS
- Create parameters
- Load Functions From the Connector
In this task, you will use Power Query Editor to create RangeStart and RangeEnd parameters with default values. The default values only apply when filtering the data to be loaded into the model in Power BI Desktop. The values you enter should include only a small amount of the most recent data from your data source. When published to the service, these time range values are overridden by the incremental refresh policy. That is, the policy creates windows of incoming data, one after another.
- In Power BI Desktop, select Transform data on the Home ribbon to open Power Query Editor.
- Select the Manage Parameters dropdown and then choose New Parameter.
- In the Name field, enter RangeStart (case-sensitive). In the Type field, select Date/Time from the dropdown. In the Current Value field, enter a start date and time value.
- Select New to create a second parameter named RangeEnd. In the Type field, select Date/Time, then enter an end date and time value in the Current Value field. Press OK.
Now that you have defined the RangeStart and RangeEnd parameters, you will filter the data to be loaded into the model based on those parameters.
Load Functions From the Connector
The next step is using a Precoro Power BI Connector to extract a table into Power BI. Once you have done that, the code for the data extraction will be automatically generated in the formula bar. Replace the bits of the resulting code in the formula bar with the parameters RangeStart/RangeEnd as per the screenshot below.
When you are in the Power Query interface, it is recommended to set RangeStart/RangeEnd parameters to have a 1-2 weeks difference between them. The smaller the difference, the quicker the Power Query will load.
Later on, you can change the value of those parameters inside of Power BI Desktop.
When you put your parameters inside the formula bar, it is essential to wrap them into the Date.From() function:
On the Home ribbon in Power Query Editor, select Close & Apply. Power Query loads data based on the filters defined by the RangeStart and RangeEnd parameters and any other filters you have defined.
Power Query loads only data specified between the RangeStart and RangeEnd parameters. The table should load quickly depending on the amount of data in that period.
After you have defined RangeStart and RangeEnd parameters and filtered data based on those parameters, you'll define an incremental refresh policy. This policy is only applied after the model is published to the service and a manual or scheduled refresh operation is performed.
In the Data view, right-click a table in the Data pane and select Incremental refresh.
In Incremental refresh and real-time data → Select table, verify, or select the table. The default value of the Select table listbox is the table you selected in the Data view.
Specify required settings:
- In Set import and refresh ranges → Incrementally refresh this table and move the slider to On. If the slider is disabled, it means the Power Query expression for the table does not include a filter based on the RangeStart and RangeEnd parameters.
- In the Archive data starting, specify the historical store period you want to include in the dataset. All rows with dates in this period will be loaded into the dataset in the service unless other filters apply.
- In Incrementally refresh data start, specify the refresh period. All rows with dates in this period will be refreshed in the dataset each time a manual or scheduled refresh operation is performed by the Power BI service.
Save and Publish to the Service
Now that your RangeStart and RangeEnd parameters, filtering, and refresh policy settings are complete, save your model and publish it to the service.
In the service, refresh the dataset. The first refresh will load both new and updated data in the refresh period, as well as historical data for the entire store period. Depending on the amount of data, this refresh can take quite a while. Subsequent refreshes, whether manual or scheduled, are typically much faster because the incremental refresh policy is applied, and only data for the period specified in the refresh policy setting is refreshed.