
You're staring at a sales dataset where customer names are jumbled together in a single column, product categories are scattered across multiple tables that need joining, and monthly revenue figures are spread horizontally when you need them vertically for analysis. Sound familiar? These are the exact scenarios where Power Query's core transformation capabilities—split, merge, pivot, and unpivot—become your lifeline.
These four operations represent the fundamental building blocks of data reshaping in Power Query. Master them, and you'll be able to tackle virtually any data structure challenge thrown your way. But here's the thing: while the UI makes these transformations look simple, understanding their underlying mechanics, performance implications, and advanced patterns is what separates competent analysts from true data transformation experts.
In this deep-dive lesson, we'll go beyond the basic button clicks to understand how these operations work under the hood, when to use each approach, and how to optimize them for enterprise-scale data processing.
What you'll learn:
- How split, merge, pivot, and unpivot work under the hood, beyond the UI button clicks
- When to choose each operation, and how they combine in real-world reshaping tasks
- Performance patterns for enterprise-scale data: pre-filtering, buffering, and query folding
- Common pitfalls for each operation and how to avoid them
This lesson assumes you're already comfortable with Power Query's interface and basic M language syntax. You should understand query folding concepts and have experience with basic data transformations. If terms like "Table.ExpandListColumn" or "query step dependencies" are unfamiliar, consider reviewing foundational Power Query concepts first.
Before diving into specific transformations, let's establish the conceptual framework. Every dataset has a "shape"—the arrangement of rows, columns, and nested structures that define how information is organized. The four transformations we're covering represent the primary ways to reshape data:
Split operations break single values into multiple parts, typically expanding horizontally (more columns) or vertically (more rows). Merge operations combine data from multiple sources based on matching keys. Pivot operations rotate data from a long format to a wide format, aggregating values in the process. Unpivot operations do the reverse, converting wide data to long format.
The key insight is that these aren't just mechanical operations—they're strategic choices that affect everything from query performance to downstream analysis possibilities. Let's examine each in detail.
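Before the deep dives, here's a minimal side-by-side sketch of all four operations on a toy table (the table contents and names are purely illustrative):

```
let
    Long = Table.FromRows({
        {"North", "Q1", 100},
        {"North", "Q2", 120},
        {"South", "Q1", 80}
    }, {"Region", "Quarter", "Sales"}),
    // Pivot: long -> wide (quarter values become columns)
    Wide = Table.Pivot(Long, List.Distinct(Long[Quarter]), "Quarter", "Sales"),
    // Unpivot: wide -> long again
    BackToLong = Table.UnpivotOtherColumns(Wide, {"Region"}, "Quarter", "Sales"),
    // Split: one value broken into parts (more columns)
    Split = Table.SplitColumn(
        Table.FromRows({{"North|Q1"}}, {"Key"}), "Key",
        Splitter.SplitTextByDelimiter("|"), {"Region", "Quarter"}),
    // Merge: combine tables on a matching key
    Merged = Table.NestedJoin(Long, {"Region"},
        Table.FromRows({{"North", "John Smith"}}, {"Region", "Manager"}),
        {"Region"}, "RegionInfo", JoinKind.LeftOuter)
in
    Merged
```

Each of the following sections unpacks one of these operations in depth.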
The most common split operation involves breaking text columns on delimiters. While the UI presents this as a simple "split by delimiter" option, the underlying mechanics are more sophisticated than they appear.
let
Source = Table.FromRows({
{"John Smith, Jr.|Senior Manager|Sales"},
{"Mary Johnson-Brown|Director|Marketing|North Region"},
{"Robert Lee|Analyst|Finance"},
{"Sarah Wilson, PhD|VP|Operations|Special Projects|Innovation"}
}, {"FullInfo"}),
// Basic split by pipe delimiter
SplitBasic = Table.SplitColumn(Source, "FullInfo",
Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),
{"Name", "Title", "Department", "Extra1", "Extra2"}),
// Advanced split with dynamic column detection
SplitDynamic = Table.SplitColumn(Source, "FullInfo",
Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv)),
// Split with custom handling for variable segments
SplitSmart = Table.AddColumn(Source, "Segments", each
Text.Split([FullInfo], "|")),
ParsedData = Table.AddColumn(SplitSmart, "StructuredData", each
let
segments = [Segments],
segmentCount = List.Count(segments),
name = if segmentCount >= 1 then segments{0} else null,
title = if segmentCount >= 2 then segments{1} else null,
department = if segmentCount >= 3 then segments{2} else null,
additionalInfo = if segmentCount > 3 then
Text.Combine(List.Range(segments, 3), "|") else null
in
[Name = name, Title = title, Department = department,
Additional = additionalInfo])
in
Table.ExpandRecordColumn(ParsedData, "StructuredData",
{"Name", "Title", "Department", "Additional"})
This example demonstrates three levels of split sophistication. The basic split works fine when you know the exact number of segments, but real-world data rarely cooperates. The dynamic approach lets Power Query determine column count automatically, while the smart approach handles variable-length data gracefully.
Split operations can become performance bottlenecks in large datasets. Here's why: each split creates new columns that must be evaluated for every row. When you're dealing with millions of records, this compounds quickly.
// Performance-optimized split for large datasets
let
Source = YourLargeSalesTable, // placeholder for your large table source
// Pre-filter to reduce rows before expensive operations
FilteredSource = Table.SelectRows(Source, each [ColumnToSplit] <> null),
// Use Table.SplitColumn for better performance than Table.AddColumn
// with manual splitting logic
OptimizedSplit = Table.SplitColumn(FilteredSource, "ColumnToSplit",
Splitter.SplitTextByDelimiter(","),
{"Part1", "Part2", "Part3"}),
// Consider using query folding when possible
// This split might fold to SQL if source supports it
FoldableSplit = Table.SplitColumn(Source, "DatabaseColumn",
Splitter.SplitTextByDelimiter(","))
in
OptimizedSplit
The key insight here is that split operations often prevent query folding. When you split a column, Power Query typically needs to bring the data into memory to perform the text manipulation, even if the source database could theoretically handle it.
Real-world data often contains irregular delimiters that require more sophisticated handling:
let
Source = Table.FromRows({
{"Product: Electronics >> Category: Phones >> Brand: Samsung"},
{"Product: Clothing | Category: Shirts | Brand: Nike"},
{"Product: Books / Category: Fiction / Author: Stephen King / Year: 2020"}
}, {"ProductInfo"}),
// Handle multiple possible delimiters
NormalizeDelimiters = Table.TransformColumns(Source, {
"ProductInfo", each Text.Replace(Text.Replace(_, ">>", "|"), "/", "|")
}),
// Split on the normalized delimiter
SplitNormalized = Table.SplitColumn(NormalizeDelimiters, "ProductInfo",
Splitter.SplitTextByDelimiter("|")),
// Extract key-value pairs
ExtractKeyValues = Table.TransformColumns(SplitNormalized,
List.Transform(Table.ColumnNames(SplitNormalized), each
{_, (text) =>
if Text.Contains(text, ":") then
Text.Trim(Text.AfterDelimiter(text, ":"))
else text
}))
in
ExtractKeyValues
This pattern handles the common scenario where source systems use inconsistent delimiters or where data contains both delimiters and key-value structures.
Merge operations in Power Query support all standard join types, but understanding when to use each—and their performance characteristics—is crucial for enterprise applications.
let
// Sample sales data
Sales = Table.FromRows({
{"S001", "P001", "C001", 100, #date(2024, 1, 15)},
{"S002", "P002", "C002", 250, #date(2024, 1, 16)},
{"S003", "P001", "C003", 150, #date(2024, 1, 17)},
{"S004", "P003", "C001", 300, #date(2024, 1, 18)}
}, {"SaleID", "ProductID", "CustomerID", "Amount", "SaleDate"}),
// Product master data
Products = Table.FromRows({
{"P001", "Laptop", "Electronics", 800},
{"P002", "Shirt", "Clothing", 50},
{"P003", "Book", "Literature", 25}
}, {"ProductID", "ProductName", "Category", "ListPrice"}),
// Customer data with some missing customers
Customers = Table.FromRows({
{"C001", "John Smith", "Premium"},
{"C002", "Jane Doe", "Standard"},
{"C004", "Bob Wilson", "Premium"}
}, {"CustomerID", "CustomerName", "Tier"}),
// Inner join - only sales with matching products
InnerJoin = Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"},
"ProductInfo", JoinKind.Inner),
// Left join - all sales, with product info where available
LeftJoin = Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"},
"ProductInfo", JoinKind.LeftOuter),
// Full outer - all records from both tables
FullJoin = Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"},
"ProductInfo", JoinKind.FullOuter),
// Anti-join pattern - sales without matching products
AntiJoin = Table.SelectRows(
Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"},
"ProductCheck", JoinKind.LeftOuter),
each Table.IsEmpty([ProductCheck])
)
in
// Expand the joined columns
Table.ExpandTableColumn(LeftJoin, "ProductInfo",
{"ProductName", "Category", "ListPrice"})
Enterprise data often requires joining on multiple columns to establish proper relationships:
let
OrderLines = Table.FromRows({
{"ORD001", 1, "P001", 2},
{"ORD001", 2, "P002", 1},
{"ORD002", 1, "P001", 1},
{"ORD002", 2, "P003", 3}
}, {"OrderID", "LineNumber", "ProductID", "Quantity"}),
OrderLineDetails = Table.FromRows({
{"ORD001", 1, "Express Shipping", #date(2024, 1, 15)},
{"ORD001", 2, "Standard Shipping", #date(2024, 1, 16)},
{"ORD002", 1, "Express Shipping", #date(2024, 1, 17)}
}, {"OrderID", "LineNumber", "ShippingType", "ShipDate"}),
// Join on composite key (OrderID + LineNumber)
JoinedOrders = Table.NestedJoin(OrderLines,
{"OrderID", "LineNumber"},
OrderLineDetails,
{"OrderID", "LineNumber"},
"OrderDetails",
JoinKind.LeftOuter),
ExpandedResult = Table.ExpandTableColumn(JoinedOrders, "OrderDetails",
{"ShippingType", "ShipDate"})
in
ExpandedResult
Power Query's fuzzy matching capabilities enable joins on imperfect data—a common requirement when dealing with data from multiple systems:
let
CustomerData = Table.FromRows({
{"John Smith", "123 Main St"},
{"Jane Doe", "456 Oak Ave"},
{"Robert Johnson", "789 Pine Rd"}
}, {"CustomerName", "Address"}),
CRMData = Table.FromRows({
{"Jon Smith", "ABC Corp", "Premium"},
{"Jane Dough", "XYZ Inc", "Standard"},
{"Rob Johnson", "DEF Ltd", "Premium"}
}, {"Name", "Company", "Status"}),
// Fuzzy join with similarity threshold
FuzzyJoin = Table.FuzzyNestedJoin(CustomerData, {"CustomerName"},
CRMData, {"Name"}, "CRMMatch", JoinKind.LeftOuter, [
SimilarityColumnName = "Similarity",
Threshold = 0.8,
IgnoreCase = true,
IgnoreSpace = true
]),
// Expand and evaluate match quality
WithMatches = Table.ExpandTableColumn(FuzzyJoin, "CRMMatch",
{"Company", "Status", "Similarity"}),
// Filter for high-confidence matches only
HighConfidenceMatches = Table.SelectRows(WithMatches,
each [Similarity] >= 0.8)
in
HighConfidenceMatches
Large-scale merge operations require careful performance tuning. Here are the key strategies:
let
// Strategy 1: Pre-sort tables on join keys when possible
LargeSalesData = // assume millions of rows
Table.Sort(YourLargeSalesTable, {"ProductID"}),
ProductMaster = Table.Sort(YourProductTable, {"ProductID"}),
// Strategy 2: Use buffer for lookup tables that will be referenced multiple times
BufferedProducts = Table.Buffer(ProductMaster),
// Strategy 3: Filter before join when possible
FilteredSales = Table.SelectRows(LargeSalesData,
each [SaleDate] >= #date(2024, 1, 1)),
// Perform the optimized join
OptimizedJoin = Table.NestedJoin(FilteredSales, {"ProductID"},
BufferedProducts, {"ProductID"}, "ProductInfo", JoinKind.Inner),
// Strategy 4: Expand only needed columns
FinalResult = Table.ExpandTableColumn(OptimizedJoin, "ProductInfo",
{"ProductName", "Category"}) // Only expand what you need
in
FinalResult
Performance Tip: Table.Buffer is particularly effective for lookup tables under 1GB that will be referenced multiple times. For larger lookup tables, consider using Table.Join instead of Table.NestedJoin when you only need simple column expansion.
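To illustrate the Table.Join alternative, here's a sketch reusing the sales example from above. Table.Join expands columns directly, skipping the nested-table step, but requires all output column names to be distinct, so the lookup table is prefixed first (the "Prod" prefix is an illustrative choice):

```
let
    // Prefix lookup columns to avoid name collisions with the sales table
    Lookup = Table.PrefixColumns(Table.Buffer(ProductMaster), "Prod"),
    Joined = Table.Join(FilteredSales, {"ProductID"},
        Lookup, {"Prod.ProductID"}, JoinKind.Inner)
in
    // Drop the duplicated key column after the join
    Table.RemoveColumns(Joined, {"Prod.ProductID"})
```

The trade-off: Table.Join gives you every lookup column whether you need it or not, so follow it with a column selection step when the lookup table is wide.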
Pivot operations transform row-based data into a columnar format, typically involving aggregation. The key challenge is that pivoting can explode memory usage if not handled carefully:
let
SalesData = Table.FromRows({
{"North", "Q1", "Laptops", 10000},
{"North", "Q1", "Phones", 8000},
{"North", "Q2", "Laptops", 12000},
{"North", "Q2", "Phones", 9000},
{"South", "Q1", "Laptops", 8000},
{"South", "Q1", "Phones", 6000},
{"South", "Q2", "Laptops", 9000},
{"South", "Q2", "Phones", 7000}
}, {"Region", "Quarter", "Product", "Sales"}),
// Basic pivot - quarters as columns
BasicPivot = Table.Pivot(SalesData,
List.Distinct(SalesData[Quarter]), "Quarter", "Sales"),
// Pivot with grouping - region and product combinations
GroupedPivot = Table.Group(SalesData, {"Region", "Product"}, {
{"PivotData", each Table.Pivot(_,
List.Distinct(SalesData[Quarter]), "Quarter", "Sales"), type table}
}),
// Expand the grouped pivot results
ExpandedPivot = Table.ExpandTableColumn(GroupedPivot, "PivotData",
{"Q1", "Q2"})
in
ExpandedPivot
Real-world pivoting often requires more sophisticated approaches:
let
// Complex sales data with multiple metrics
DetailedSales = Table.FromRows({
{"North", "Jan", "Laptops", 100, 10000, 8000},
{"North", "Jan", "Phones", 200, 8000, 6000},
{"North", "Feb", "Laptops", 120, 12000, 9000},
{"South", "Jan", "Laptops", 80, 8000, 6000}
}, {"Region", "Month", "Product", "Units", "Revenue", "Cost"}),
// Pivot multiple measures simultaneously
MultiMeasurePivot =
let
// Create measure-month combinations
MeasureTable = Table.ExpandListColumn(
Table.AddColumn(DetailedSales, "Measures", each {
[Measure = "Units", Value = [Units], Period = [Month]],
[Measure = "Revenue", Value = [Revenue], Period = [Month]],
[Measure = "Cost", Value = [Cost], Period = [Month]]
}), "Measures"),
ExpandedMeasures = Table.ExpandRecordColumn(MeasureTable, "Measures",
{"Measure", "Value", "Period"}),
// Create composite column for pivoting
WithPivotColumn = Table.AddColumn(ExpandedMeasures, "PivotKey", each
[Measure] & "_" & [Period]),
// Perform the pivot
PivotResult = Table.Pivot(
Table.RemoveColumns(WithPivotColumn, {"Month", "Units", "Revenue", "Cost", "Measure", "Period"}),
List.Distinct(WithPivotColumn[PivotKey]), "PivotKey", "Value"
)
in
PivotResult
in
MultiMeasurePivot
Large pivot operations can consume enormous amounts of memory. Here's how to manage this:
let
// For large datasets, consider streaming approach
LargeDataset = YourLargeSourceTable, // placeholder for your large source table
// Strategy 1: Aggregate before pivoting
PreAggregated = Table.Group(LargeDataset, {"Region", "Product", "Quarter"}, {
{"TotalSales", each List.Sum([Sales]), type number}
}),
// Strategy 2: Limit pivot column count
TopQuarters = List.FirstN(List.Sort(List.Distinct(LargeDataset[Quarter]), Order.Descending), 8),
FilteredForPivot = Table.SelectRows(LargeDataset, each List.Contains(TopQuarters, [Quarter])),
// Strategy 3: Use streaming pivot for very large datasets
StreamingPivot = Table.Group(FilteredForPivot, {"Region", "Product"}, {
{"QuarterlyData", each
let
quarterData = Table.Pivot(_, TopQuarters, "Quarter", "Sales")
in
quarterData
}
})
in
StreamingPivot
Memory Warning: Pivot operations load the entire pivot result into memory. If you're pivoting to create thousands of columns, consider alternative approaches like creating multiple smaller pivots or using a different analytical approach altogether.
Unpivot operations convert wide data formats to long formats, making them analysis-ready:
let
WideData = Table.FromRows({
{"North", 10000, 12000, 11000, 13000},
{"South", 8000, 9000, 9500, 10000},
{"East", 7000, 8000, 8500, 9000},
{"West", 6000, 7000, 7500, 8000}
}, {"Region", "Q1_2023", "Q2_2023", "Q3_2023", "Q4_2023"}),
// Basic unpivot - all quarter columns
BasicUnpivot = Table.UnpivotOtherColumns(WideData, {"Region"}, "Quarter", "Sales"),
// Selective unpivot - only specific columns
SelectiveUnpivot = Table.Unpivot(WideData, {"Q1_2023", "Q2_2023"}, "Quarter", "Sales"),
// Clean up the quarter names
CleanedUnpivot = Table.TransformColumns(BasicUnpivot, {
"Quarter", each Text.Replace(Text.Replace(_, "_", " "), "Q", "Quarter ")
})
in
CleanedUnpivot
Complex datasets often have multiple measure types that need unpivoting:
let
MultiMeasureData = Table.FromRows({
{"North", 100, 10000, 120, 12000, 110, 11000},
{"South", 80, 8000, 90, 9000, 85, 8500}
}, {"Region", "Q1_Units", "Q1_Revenue", "Q2_Units", "Q2_Revenue", "Q3_Units", "Q3_Revenue"}),
// Step 1: Unpivot all measure columns
UnpivotedAll = Table.UnpivotOtherColumns(MultiMeasureData, {"Region"}, "MeasurePeriod", "Value"),
// Step 2: Extract period and measure type
WithParsedInfo = Table.AddColumn(UnpivotedAll, "ParsedInfo", each
let
parts = Text.Split([MeasurePeriod], "_"),
period = parts{0},
measure = parts{1}
in
[Period = period, Measure = measure]
),
// Step 3: Expand the parsed information
ExpandedInfo = Table.ExpandRecordColumn(WithParsedInfo, "ParsedInfo", {"Period", "Measure"}),
// Step 4: Pivot back to get measures as columns
FinalStructure = Table.Pivot(
Table.SelectColumns(ExpandedInfo, {"Region", "Period", "Measure", "Value"}),
List.Distinct(ExpandedInfo[Measure]), "Measure", "Value"
)
in
FinalStructure
When dealing with hierarchical column structures, unpivot operations become more complex:
let
HierarchicalData = Table.FromRows({
{"Product A", 100, 1000, 110, 1100, 200, 2000, 210, 2100},
{"Product B", 150, 1500, 160, 1600, 250, 2500, 260, 2600}
}, {"Product", "North_Q1_Units", "North_Q1_Revenue", "North_Q2_Units", "North_Q2_Revenue",
"South_Q1_Units", "South_Q1_Revenue", "South_Q2_Units", "South_Q2_Revenue"}),
// Unpivot all measure columns
UnpivotedHierarchy = Table.UnpivotOtherColumns(HierarchicalData, {"Product"}, "Hierarchy", "Value"),
// Parse the hierarchical structure
ParsedHierarchy = Table.AddColumn(UnpivotedHierarchy, "StructuredData", each
let
parts = Text.Split([Hierarchy], "_"),
region = parts{0},
quarter = parts{1},
measure = parts{2}
in
[Region = region, Quarter = quarter, Measure = measure]
),
// Expand and clean up
FinalHierarchy = Table.ExpandRecordColumn(
Table.RemoveColumns(ParsedHierarchy, {"Hierarchy"}),
"StructuredData", {"Region", "Quarter", "Measure"}
)
in
FinalHierarchy
Unpivot operations on large datasets require strategic approaches:
let
// For wide tables with many columns, consider selective unpivoting
WideTable = YourWideSourceTable, // placeholder for your wide source table with hundreds of columns
// Strategy 1: Identify columns to unpivot dynamically
ColumnNames = Table.ColumnNames(WideTable),
FixedColumns = {"ID", "Name", "Category"},
MeasureColumns = List.Difference(ColumnNames, FixedColumns),
// Strategy 2: Filter measure columns if needed
RelevantColumns = List.Select(MeasureColumns, each Text.StartsWith(_, "2024")),
// Strategy 3: Batch unpivot for very wide tables
BatchedUnpivot = if List.Count(RelevantColumns) > 1000 then
// For extremely wide tables, consider batching
let
BatchSize = 100,
BatchCount = Number.RoundUp(List.Count(RelevantColumns) / BatchSize),
Batches = List.Generate(
() => 0,
each _ < BatchCount,
each _ + 1,
each List.Range(RelevantColumns, _ * BatchSize, BatchSize)
),
ProcessedBatches = List.Transform(Batches, each
Table.Unpivot(Table.SelectColumns(WideTable, FixedColumns & _), _, "Attribute", "Value")
)
in
Table.Combine(ProcessedBatches)
else
Table.Unpivot(WideTable, RelevantColumns, "Attribute", "Value")
in
BatchedUnpivot
Let's work through a comprehensive scenario that combines all four transformation types. You're working with quarterly sales data from multiple regions, stored in a format that needs significant restructuring for analysis.
let
// Raw data: messy format with combined information
RawSalesData = Table.FromRows({
{"North|Northeast", "Q1_2024:Laptops:15000:1200|Q1_2024:Phones:8000:800|Q2_2024:Laptops:18000:1400"},
{"South|Southeast", "Q1_2024:Tablets:12000:600|Q2_2024:Tablets:14000:700|Q2_2024:Phones:9000:900"},
{"West|Pacific", "Q1_2024:Laptops:20000:1600|Q1_2024:Phones:10000:1000|Q2_2024:Laptops:22000:1800"}
}, {"RegionInfo", "SalesData"}),
// Additional lookup data for enrichment
RegionDetails = Table.FromRows({
{"North", "Northern Region", "John Smith"},
{"South", "Southern Region", "Jane Doe"},
{"West", "Western Region", "Bob Wilson"}
}, {"RegionCode", "RegionName", "Manager"}),
ProductCategories = Table.FromRows({
{"Laptops", "Computing", "High"},
{"Phones", "Mobile", "Medium"},
{"Tablets", "Mobile", "Medium"}
}, {"Product", "Category", "Priority"}),
// Split region info into separate columns
SplitRegions = Table.SplitColumn(RawSalesData, "RegionInfo",
Splitter.SplitTextByDelimiter("|"), {"RegionCode", "RegionSubCode"}),
// Split sales data and expand to rows
ExpandSalesData = Table.AddColumn(SplitRegions, "SalesRecords", each
Text.Split([SalesData], "|")),
// Convert to rows
SalesRows = Table.ExpandListColumn(ExpandSalesData, "SalesRecords"),
// Parse individual sales records
ParsedSales = Table.AddColumn(SalesRows, "SalesDetails", each
let
parts = Text.Split([SalesRecords], ":"),
quarter = parts{0},
product = parts{1},
revenue = Number.FromText(parts{2}),
units = Number.FromText(parts{3})
in
[Quarter = quarter, Product = product, Revenue = revenue, Units = units]),
// Expand the parsed details
ExpandedSales = Table.ExpandRecordColumn(
Table.RemoveColumns(ParsedSales, {"SalesData", "SalesRecords"}),
"SalesDetails", {"Quarter", "Product", "Revenue", "Units"}),
// Merge with region details
WithRegionInfo = Table.NestedJoin(ExpandedSales, {"RegionCode"},
RegionDetails, {"RegionCode"}, "RegionInfo", JoinKind.LeftOuter),
ExpandedRegionInfo = Table.ExpandTableColumn(WithRegionInfo, "RegionInfo",
{"RegionName", "Manager"}),
// Merge with product categories
WithProductInfo = Table.NestedJoin(ExpandedRegionInfo, {"Product"},
ProductCategories, {"Product"}, "ProductInfo", JoinKind.LeftOuter),
CompleteData = Table.ExpandTableColumn(WithProductInfo, "ProductInfo",
{"Category", "Priority"}),
// Create quarterly summary pivot
QuarterlySummary = Table.Group(CompleteData, {"RegionName", "Product"}, {
{"QuarterlyRevenue", each Table.Pivot(_,
List.Distinct(CompleteData[Quarter]), "Quarter", "Revenue")}
}),
FinalSummary = Table.ExpandTableColumn(QuarterlySummary, "QuarterlyRevenue",
{"Q1_2024", "Q2_2024"}),
// Create a different analytical view using unpivot
MetricsWide = Table.Group(CompleteData, {"RegionName", "Product", "Quarter"}, {
{"Revenue", each List.Sum([Revenue]), type number},
{"Units", each List.Sum([Units]), type number}
}),
// Unpivot for trend analysis
MetricsLong = Table.UnpivotOtherColumns(MetricsWide,
{"RegionName", "Product", "Quarter"}, "Metric", "Value")
in
MetricsLong
This exercise demonstrates how the four transformation types work together to reshape complex data into multiple analytical formats.
Problem: Inconsistent delimiter handling
// Wrong approach - assumes consistent delimiters
Table.SplitColumn(Source, "Data", Splitter.SplitTextByDelimiter(","))
// Better approach - handle multiple delimiters
Table.SplitColumn(Source, "Data",
Splitter.SplitTextByAnyDelimiter({",", ";", "|"}))
Problem: Not handling null values
// Wrong - will error on nulls
Table.AddColumn(Source, "Split", each Text.Split([Column], ","))
// Right - handle nulls gracefully
Table.AddColumn(Source, "Split", each
if [Column] = null then null else Text.Split([Column], ","))
Problem: Cartesian product explosion This happens when join keys aren't unique in one or both tables. Always check key uniqueness before large merges:
// Diagnostic query to check key uniqueness
let
KeyCounts = Table.Group(YourTable, {"JoinKey"}, {
{"Count", each Table.RowCount(_), type number}
}),
DuplicateKeys = Table.SelectRows(KeyCounts, each [Count] > 1)
in
DuplicateKeys
Problem: Memory exhaustion on large joins
// Wrong approach for large tables
Table.NestedJoin(LargeTable1, {"Key"}, LargeTable2, {"Key"}, "Joined")
// Better approach - filter first
let
FilteredTable1 = Table.SelectRows(LargeTable1, each [RelevantColumn] <> null),
FilteredTable2 = Table.SelectRows(LargeTable2, each [ActiveFlag] = true)
in
Table.NestedJoin(FilteredTable1, {"Key"}, FilteredTable2, {"Key"}, "Joined")
Problem: Memory overflow with too many pivot columns
// Dangerous - might create thousands of columns
Table.Pivot(LargeTable, List.Distinct(LargeTable[ManyValueColumn]), "Column", "Value")
// Safer - limit pivot values
let
TopValues = List.FirstN(
Table.Sort(
Table.Group(LargeTable, {"ManyValueColumn"}, {
{"Count", each Table.RowCount(_), type number}
}),
{{"Count", Order.Descending}}
)[ManyValueColumn],
20)
in
Table.Pivot(
Table.SelectRows(LargeTable, each List.Contains(TopValues, [ManyValueColumn])),
TopValues, "Column", "Value")
Problem: Data type inconsistencies after unpivot
// Wrong - assumes all unpivoted columns have same type
Table.UnpivotOtherColumns(Source, {"ID"}, "Attribute", "Value")
// Better - handle type conversion explicitly
let
Unpivoted = Table.UnpivotOtherColumns(Source, {"ID"}, "Attribute", "Value"),
TypedValues = Table.TransformColumns(Unpivoted, {
"Value", each if Value.Is(_, type text) then _ else Text.From(_)
})
in
TypedValues
Problem: Operations preventing query folding Most transformation operations prevent query folding. To maintain performance:
// Better approach - filter at source before transformations
let
FilteredSource = Table.SelectRows(DatabaseSource, each [Date] >= #date(2024, 1, 1)),
Transformed = Table.SplitColumn(FilteredSource, "Column",
Splitter.SplitTextByDelimiter(","))
in
Transformed
You've now mastered the four fundamental data transformation operations in Power Query. These operations—split, merge, pivot, and unpivot—form the foundation of virtually every data reshaping task you'll encounter.
Key takeaways:
- Split, merge, pivot, and unpivot are strategic reshaping choices, not just mechanical steps—each affects performance and the analyses possible downstream
- Filter and aggregate before expensive operations, buffer small lookup tables with Table.Buffer, and expand only the columns you need
- Splits and most reshaping steps break query folding, so push foldable filters to the source first
- Pivots load their entire result into memory—limit pivot column counts and pre-aggregate for large datasets
- Handle nulls, inconsistent delimiters, and duplicate join keys explicitly; real-world data rarely cooperates
Advanced patterns to explore:
- Dynamic splits that handle variable-length segments using lists and records
- Fuzzy matching for joins across systems with imperfect keys
- Multi-measure pivots built on composite pivot keys
- Unpivot/re-pivot round trips to restructure hierarchical column layouts
- Batched unpivoting for extremely wide tables
Next steps:
Consider exploring Power Query's M language more deeply to create custom functions that combine these transformations. Also investigate query folding optimization techniques and integration patterns with Azure Data Factory for enterprise data pipeline scenarios.
The real mastery comes from understanding not just how to perform these operations, but when to use each approach and how to optimize them for your specific data volumes and performance requirements. Practice with your own datasets, and always test performance implications before deploying to production systems.
Learning Path: Power Query Essentials