Monday, December 1, 2025
HomeBusiness IntelligenceDatatype Conversion in Energy Question Impacts Information Modeling in Energy BI

Datatype Conversion in Energy Question Impacts Information Modeling in Energy BI


Datatype Conversion in Power Query Affects Data Modeling in Power BI

In my consulting expertise working with clients utilizing Energy BI, many challenges that Energy BI builders face are as a consequence of negligence to information sorts. Listed below are some frequent challenges which are the direct or oblique outcomes of inappropriate information sorts and information kind conversion:

  • Getting incorrect outcomes whereas all calculations in your information mannequin are right.
  • Poor performing information mannequin.
  • Bloated mannequin measurement.
  • Difficulties in configuring user-defined aggregations (agg consciousness).
  • Difficulties in establishing incremental information refresh.
  • Getting clean visuals after the primary information refresh in Energy BI service.

On this blogpost, I clarify the frequent pitfalls to forestall future challenges that may be time-consuming to determine and repair.

Background

Earlier than we dive into the subject of this weblog put up, I want to begin with a little bit of background. Everyone knows that Energy BI isn’t solely a reporting device. It’s certainly a knowledge platform supporting numerous points of enterprise intelligence, information engineering, and information science. There are two languages we should study to have the ability to work with Energy BI: Energy Question (M) and DAX. The aim of the 2 languages is kind of totally different. We use Energy Question for information transformation and information preparation, whereas DAX is used for information evaluation within the Tabular information mannequin. Right here is the purpose, the 2 languages in Energy BI have totally different information sorts.

The commonest Energy BI growth eventualities begin with connecting to the information supply(s). Energy BI helps lots of of knowledge sources. Most information supply connections occur in Energy Question (the information preparation layer in a Energy BI resolution) except we join dwell to a semantic layer equivalent to an SSAS occasion or a Energy BI dataset. Many supported information sources have their very own information sorts, and a few don’t. As an illustration, SQL Server has its personal information sorts, however CSV doesn’t. When the information supply has information sorts, the mashup engine tries to determine information sorts to the closest information kind accessible in Energy Question. Regardless that the supply system has information sorts, the information sorts may not be appropriate with Energy Question information sorts. For the information sources that don’t assist information sorts, the matchup engine tries to detect the information sorts primarily based on the pattern information loaded into the information preview pane within the Energy Question Editor window. However, there isn’t a assure that the detected information sorts are right. So, it’s best follow to validate the detected information sorts anyway.

Energy BI makes use of the Tabular mannequin information sorts when it hundreds the information into the information mannequin. The info sorts within the information mannequin could or will not be appropriate with the information sorts outlined in Energy Question. As an illustration, Energy Question has a Binary information kind, however the Tabular mannequin doesn’t.

The next desk reveals Energy Question’s datatypes, their representations within the Energy Question Editor’s UI, their mapping information sorts within the information mannequin (DAX), and the interior information sorts within the xVelocity (Tabular mannequin) engine:

Power Query and DAX (data model) data type mapping
Energy Question and DAX (information mannequin) information kind mapping

Because the above desk reveals, in Energy Question’s UI, Complete Quantity, Decimal, Mounted Decimal and Proportion are all in kind quantity within the Energy Question engine. The kind names within the Energy BI UI additionally differ from their equivalents within the xVelocity engine. Allow us to dig deeper.

Information Varieties in Energy Question

As talked about earlier, in Energy Question, we’ve got just one numeric datatype: quantity whereas within the Energy Question Editor’s UI, within the Rework tab, there’s a Information Kind drop-down button exhibiting 4 numeric datatypes, as the next picture reveals:

Data type representations in the Power Query Editor's UI
Information kind representations within the Energy Question Editor’s UI

In Energy Question method language, we specify a numeric information kind as kind quantity or Quantity.Kind. Allow us to take a look at an instance to see what this implies.

The next expression creates a desk with totally different values:

#desk({"Worth"}
	, {
		{100}
		, {65565}
		, {-100000}
		, {-999.9999}
		, {0.001}
		, {10000000.0000001}
		, {999999999999999999.999999999999999999}
		, {#datetimezone(2023,1,1,11,45,54,+12,0)}
		, {#datetime(2023,1,1,11,45,54)}
		, {#date(2023,1,1)}
		, {#time(11,45,54)}
		, {true}
		, {#length(11,45,54,22)}
		, {"This can be a textual content"}
	})

The outcomes are proven within the following picture:

Generating values in Power Query
Producing values in Energy Question

Now we add a brand new column that reveals the information kind of the values. To take action, use the Worth.Kind([Value]) perform returns the kind of every worth of the Worth column. The outcomes are proven within the following picture:

Getting a column's value types in Power Query
Getting a column’s worth sorts in Energy Question

To see the precise kind, we should click on on every cell (not the values) of the Worth Kind column, as proven within the following picture:

Click on a cell to see its type in Power Query Editor
Click on on a cell to see its kind in Energy Question Editor

With this methodology, we’ve got to click on every cell in to see the information sorts of the values that isn’t excellent. However there may be presently no perform accessible in Energy Question to transform a Kind worth to Textual content. So, to point out every kind’s worth as textual content in a desk, we use a easy trick. There’s a perform in Energy Question returning the desk’s metadata: Desk.Schema(desk as desk). The perform ends in a desk revealing helpful details about the desk used within the perform, together with column TitleTypeNameVariety, and so forth. We need to present TypeName of the Worth Kind column. So, we solely want to show every worth right into a desk utilizing the Desk.FromValue(worth as any) perform. We then get the values of the Variety column from the output of the Desk.Schema() perform.

To take action, we add a brand new column to get textual values from the Variety column. We named the brand new column Datatypes. The next expression caters to that:

Desk.Schema(
      Desk.FromValue([Value])
      )[Kind]{0}

The next picture reveals the outcomes:

Getting type values as text in Power Query
Getting kind values as textual content in Energy Question

Because the outcomes present, all numeric values are of kind quantity and the way in which they’re represented within the Energy Question Editor’s UI doesn’t have an effect on how the Energy Question engine treats these sorts. The info kind representations within the Energy Question UI are one way or the other aligned with the kind sides in Energy Question. A side is used so as to add particulars to a kind sort. As an illustration, we are able to use sides to a textual content kind if we need to have a textual content kind that doesn’t settle for null. We are able to outline the worth’s sorts utilizing kind sides utilizing Aspect.Kind syntax, equivalent to utilizing In64.Kind for a 64-bit integer quantity or utilizing Proportion.Kind to point out a quantity in share. Nevertheless, to outline the worth’s kind, we use the kind typename syntax equivalent to defining quantity utilizing kind quantity or a textual content utilizing kind textual content. The next desk reveals the Energy Question sorts and the syntax to make use of to outline them:

Defining types and facets in Power Query M
Defining sorts and sides in Energy Question M

Sadly, the Energy Question Language Specification documentation doesn’t embrace sides and there aren’t many on-line sources or books that I can reference right here apart from Ben Gribaudo’s weblog who totally defined sides intimately which I strongly suggest studying.

Whereas Energy Question engine treats the values primarily based on their sorts not their sides, utilizing sides is really useful as they have an effect on the information when it’s being loaded into the information mannequin which raises a query: what occurs after we load the information into the information mannequin? which brings us to the subsequent part of this weblog put up.

Information sorts in Energy BI information mannequin

Energy BI makes use of the xVelocity in-memory information processing engine to course of the information. The xVelocity engine makes use of columnstore indexing know-how that compresses the information primarily based on the cardinality of the column, which brings us to a important level: though the Energy Question engine treats all of the numeric values as the kind quantity, they get compressed otherwise relying on their column cardinality after loading the values within the Energy BI mannequin. Due to this fact, setting the right kind side for every column is vital.

The numeric values are one of the vital frequent datatypes utilized in Energy BI. Right here is one other instance exhibiting the variations between the 4 quantity sides. Run the next expression in a brand new clean question within the Energy Question Editor:

// Decimal Numbers with 6 Decimal Digits
let
    Supply = Checklist.Generate(()=> 0.000001, every _ <= 10, every _ + 0.000001 ),
    #"Transformed to Desk" = Desk.FromList(Supply, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Desk.RenameColumns(#"Transformed to Desk",{{"Column1", "Supply"}}),
    #"Duplicated Supply Column as Decimal" = Desk.DuplicateColumn(#"Renamed Columns", "Supply", "Decimal", Decimal.Kind),
    #"Duplicated Supply Column as Mounted Decimal" = Desk.DuplicateColumn(#"Duplicated Supply Column as Decimal", "Supply", "Mounted Decimal", Forex.Kind),
    #"Duplicated Supply Column as Proportion" = Desk.DuplicateColumn(#"Duplicated Supply Column as Mounted Decimal", "Supply", "Proportion", Proportion.Kind)
in
    #"Duplicated Supply Column as Proportion"

The above expressions create 10 million rows of decimal values between 0 and 10. The ensuing desk has 4 columns containing the identical information with totally different sides. The primary column, Supply, comprises the values of kind any, which interprets to kind textual content. The remaining three columns are duplicated from the Supply column with totally different kind sides, as follows:

  • Decimal
  • Mounted decimal
  • Proportion

The next screenshot reveals the ensuing pattern information of our expression within the Energy Question Editor:

Generating 10 million numeric values and use different type facets in Power Query M
Producing 10 million numeric values and use totally different kind sides in Energy Question M

Now click on Shut & Apply from the Residence tab of the Energy Question Editor to import the information into the information mannequin. At this level, we have to use a third-party group device, DAX Studio, which could be downloaded from right here.

After downloading and putting in, DAX Studio registers itself as an Exterior Software within the Energy BI Desktop as the next picture reveals:

External tools in Power BI Desktop
Exterior instruments in Energy BI Desktop

Click on the DAX Studio from the Exterior Instruments tab which robotically connects it to the present Energy BI Desktop mannequin, and observe these steps:

  1. Click on the Superior tab
  2. Click on the View Metrics button
  3. Click on Columns from the VertiPaq Analyzer part
  4. Take a look at the CardinalityCol Measurement, and % Desk columns

The next picture reveals the previous steps:

VertiPaq Analyzer Metrics in DAX Studio
VertiPaq Analyzer Metrics in DAX Studio

The outcomes present that the Decimal column and Proportion consumed probably the most important a part of the desk’s quantity. Their cardinality can also be a lot larger than the Mounted Decimal column. So right here it’s now extra apparent that utilizing the Mounted Decimal datatype (side) for numeric values can assist with information compression, decreasing the information mannequin measurement and growing the efficiency. Due to this fact, it’s smart to all the time use Mounted Decimal for decimal values. Because the Mounted Decimal values translate to the Forex datatype in DAX, we should change the columns’ format if Forex is unsuitable. Because the title suggests, Mounted Decimal has mounted 4 decimal factors. Due to this fact, if the unique worth has extra decimal digits after conversion to the Mounted Decimal, the digits after the fourth decimal level can be truncated.

That’s the reason the Cardinality column within the VertiPaq Analyzer in DAX Studio reveals a lot decrease cardinality for the Mounted Decimal column (the column values solely hold as much as 4 decimal factors, no more).

Obtain the pattern file from right here.

So, the message is right here to all the time use the datatype that is sensible to the enterprise and is environment friendly within the information mannequin. Utilizing the VertiPaq Analyzer in DAX Studio is nice for understanding the assorted points of the information mannequin, together with the column datatypes. As a knowledge modeler, it’s important to grasp how the Energy Question sorts and sides translate to DAX datatypes. As we noticed on this weblog put up, information kind conversion can have an effect on the information mannequin’s compression fee and efficiency.


Uncover extra from BI Perception

Subscribe to get the newest posts despatched to your e-mail.

RELATED ARTICLES

Most Popular

Recent Comments