
I defined what SCD means in a Enterprise Intelligence resolution in my earlier put up. We additionally mentioned that whereas we don’t count on to deal with SCD2 in a Energy BI implementation, we are able to deal with eventualities just like SCD1. On this put up, I clarify how to take action.
State of affairs
We now have a retail firm promoting merchandise. The corporate releases the checklist of merchandise in Excel format, together with checklist worth and vendor worth, yearly. The product checklist is launched on the primary day of July when the monetary yr begins. We now have to implement a Energy BI resolution that retains the newest product knowledge to analyse the gross sales transactions. The next picture exhibits the Product checklist for 2013:

So annually, we obtain an identical Excel file to the above picture. The recordsdata are saved on a SharePoint On-line web site.
State of affairs Defined
Because the earlier put up explains, an SCD1 all the time retains the present knowledge by updating the outdated knowledge with the brand new knowledge. So an ETL course of reads the information from the supply, identifies the prevailing knowledge within the vacation spot desk, inserts the brand new rows to the vacation spot, updates the prevailing rows, and deletes the eliminated rows.
Right here is why our situation is just like SCD1, with one exception:
- We don’t really replace the information within the Excel recordsdata and don’t create an ETL course of to learn the information from the Excel recordsdata, establish the modifications and apply the modifications to an middleman Excel file
- We should learn the information from the supply Excel recordsdata, hold the newest knowledge whereas filtering out the outdated ones and cargo the information into the information mannequin.
As you see, whereas we’re taking a really completely different implementation strategy, the outcomes are very comparable with an exception: we don’t delete any rows.
Implementation
Here’s what we must always do to attain the objective:
- We get the information in Energy Question Editor utilizing the SharePoint Folder connector
- We combite the recordsdata
- We use the ProductNumber column to establish the duplicated merchandise
- We use the Reporting Date column to establish the newest dates
- We solely hold the newest rows
Getting Knowledge from SharePoint On-line Folder
As we get the information from a number of recordsdata saved on SharePoint On-line, we’ve got to make use of the SharePoint Folder connector. Comply with these steps:
- Login to SharePoint On-line and navigate to the positioning holding the Product checklist Excel recordsdata and duplicate the positioning URL from the browser
- From the Get Knowledge within the Energy BI Desktop, choose the SharePoint Folder connector
- Click on Join
- Paste the Website URL copied on step 1
- Click on OK
- Click on Rework Knowledge
- Click on the filter dropdown on the Folder Path column
- Discover the Merchandise Record folder internet hosting the Excel recordsdata and choose it
- Click on OK
- Rename the question to Product
Thus far, we’re linked to the SharePoint On-line Folder in Energy Question Editor. The subsequent step is to mix the Excel recordsdata.
Combining Recordsdata
We now have a number of choices to mix binary recordsdata in a desk from the Energy Question Editor. On this put up, we use essentially the most simple technique:
- Click on the Mix Recordsdata button from the Content material column
- Choose the ProductList desk
- Click on OK

The above course of creates a few queries grouped in separate folders, as proven within the following picture:
Thus far, we’ve got efficiently mixed the Excel recordsdata. The subsequent step is to maintain the newest knowledge solely.
Holding the Newest Knowledge
Within the subsequent few steps, we glance nearer on the knowledge, and we implement a mechanism to establish the newest knowledge, hold them and cargo them into the information mannequin.
- Wanting on the outcomes of the mixed knowledge exhibits a Supply.Identify column that we do now require to maintain, so we take away it by choosing it and clicking the Take away Columns button from the Residence tab
Thus far, we’ve got linked to the SharePoint On-line Folder and mixed the contained Excel recordsdata. Let’s take a look at the information and see what we’ve acquired. I sorted the information by ProductNumber to raised perceive the information modifications. The next picture exhibits the outcomes:
Because the above picture exhibits, there are a number of merchandise showing in a number of lists. That’s precisely what we anticipated to see. The objective is to maintain the newest product knowledge solely based mostly on the Reporting Date. So we must always get the ProductNumber and the utmost of the Reporting Date. To realize this, we use the Group By performance in Energy Question Editor. Utilizing the Group By from the UI within the Energy Question Editor makes use of the Desk.Group() perform in Energy Question. Because the Group By course of doesn’t want the information to be sorted we take away the Sorted Rows step. With that, let’s get the job performed.
- Choose the ProductNumber column
- Click on the Group By column from the Rework tab
- Enter Reporting Date for the New column title
- Choose Max from the Operation dropdown
- Choose the Reporting Date from the Column dropdown
- Click on OK
The next picture exhibits the outcomes:
We now have all product numbers with their newest reporting dates. The one remaining piece of the puzzle is to hitch the outcomes of the Grouped Rows step with the information of its earlier step. For that we use the Merge Queries performance which runs the Desk.NestedJoin() perform in Energy Question.
- Choose the Grouped Rows step from the Utilized Steps checklist from the Question Settings pane
- Click on the Merge Queries button from the Residence tab
- Choose the Product (Present) desk from the dropdown. Word that we’re choosing the present question (Product)
- On the highest desk, press the Ctrl button in your keyboard and choose the ProductNumber and the Reporting Date columns sequentially
- Do the identical for the underside desk. Word that the sequence of choosing the columns is essential
- Be certain that the Be a part of Sort is Left Outer (all from first, matching from second)
- Click on OK
As talked about earlier, the merge operation makes use of the Desk.NestedJoin() perform, which accepts two tables (highlighted in yellow within the expression beneath), a listing of their key columns to make use of on the be a part of (highlighted in pink within the expression beneath), a reputation for the brand new column of kind desk and the be a part of variety. Within the above operation, because the Grouped Rows is the final transformation step, we joined the outcomes of the Grouped Rows transformation step by itself. Right here is the code generated by Energy Question Editor after going via the step 21 to 27:
Desk.NestedJoin(#"Grouped Rows", {"ProductNumber", "Reporting Date"}, #"Grouped Rows", {"ProductNumber", "Reporting Date"}, "Grouped Rows", JoinKind.LeftOuter)
However that isn’t what we wish, we don’t want to hitch the outcomes of the Grouped Rows transformation step by itself. We have to be a part of the outcomes of the Grouped Rows transformation step by the outcomes of the Eliminated Columns step. Due to this fact, we’ve got to switch the above expression as follows:
Desk.NestedJoin(#"Grouped Rows", {"ProductNumber", "Reporting Date"}, #"Eliminated Columns", {"ProductNumber", "Reporting Date"}, "Grouped Rows", JoinKind.LeftOuter)
The next picture exhibits the modification made within the expression and the outcomes:

- Click on the Increase button on the Grouped Rows column
- Deselect the ProductNumber and Reporting Date columns to maintain the opposite columns chosen
- Untick the Use unique column title as prefix choice
- Click on OK
All performed! The next picture exhibits the ultimate outcomes:
We will now apply the modifications to load the information into the information mannequin. With this method, when a brand new Excel file (a brand new Product Record) lands in SharePoint On-line, Energy BI goes via the above transformation steps to make sure we all the time have the newest Product knowledge loaded into the information mannequin, which is similar to the behaviour of an SCD1.
Have you ever used this technique earlier than? Do you may have a greater method to deal with an identical situation? You may share your ideas within the feedback part beneath.
Associated
Uncover extra from BI Perception
Subscribe to get the newest posts despatched to your e-mail.














