Description

This book comprehensively covers every layer of Power BI, from the report canvas to data modeling, transformations, storage, and architecture.
Developers and architects working with any area of Power BI will be able to put their knowledge to work with this practical guide to design and implement at every stage of the analytics solution development process. This book is not only a unique collection of best practices and tips, but also provides you with a hands-on approach to identifying and fixing common performance issues.
Complete with explanations of essential concepts and practical examples, you’ll learn about common design choices that affect performance and consume more resources and how to avoid these problems. You’ll grasp the general architectural issues and settings that broadly affect most solutions. As you progress, you’ll walk through each layer of a typical Power BI solution, learning how to ensure your designs can handle scale while not sacrificing usability. You’ll focus on the data layer and then work your way up to report design. We will also cover Power BI Premium and load testing.
By the end of this Power BI book, you’ll be able to confidently maintain well-performing Power BI solutions with reduced effort and know how to use freely available tools and a systematic process to monitor and diagnose performance problems.

Microsoft Power BI Performance Best Practices

A comprehensive guide to building consistently fast Power BI solutions

Bhavik Merchant

BIRMINGHAM—MUMBAI

Microsoft Power BI Performance Best Practices

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Publishing Product Manager: Ali Abidi

Senior Editor: David Sugarman

Content Development Editor: Priyanka Soam

Technical Editor: Devanshi Ayare

Copy Editor: Safis Editing

Project Coordinator: Aparna Ravikumar Nair

Proofreader: Safis Editing

Indexer: Rekha Nair

Production Designer: Jyoti Chauhan

Marketing Coordinator: Priyanka Mhatre

First published: April 2022

Production reference: 1240322

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80107-644-9

www.packt.com

"Like so many other authors, I dedicate this book, first and foremost, to my wife and 5-year-old son. The little one especially, for being a real trooper and reluctantly allowing me many hours on weekends to write this book instead of spending time with him. I didn't realize how critical their support was until I reached the final chapters, as the long months with COVID were even more challenging for us with some personal and professional hurdles to overcome. Despite being isolated and in a new country, they kept encouraging me and celebrating my small wins each time a chapter draft or review was done. My deepest, heartfelt gratitude goes out to them both.

I also want to thank everyone I worked with during my time at Microsoft in the Power BI product team. I learned a lot from many experts, such as the CAT team, architects, engineering managers, and deep subject matter experts in areas such as reports. The list is long, and I prefer not to name anyone for fear of missing someone out! I really hope that all their knowledge, coupled with my own experiences serving innovative customers around the world, will help you take your Power BI solutions to the next level."

Foreword

Ask anyone who has ever presented at a database conference, written blog posts about databases, or recorded videos about them, what the most popular database-related topic is and they will all tell you the same thing: performance tuning. While a skilfully crafted conference presentation on database design best practices will draw a good number of attendees, a basic presentation on performance tuning will draw the crowds. Why is this? I think it's because the goal of performance tuning is so simple: you have something slow and you need to make it fast. It's the struggle that every DBA, report developer, or business analyst faces every day of their professional life. Speed represents usability, faster decisions, happy users, and, ultimately, success. Some people have gained minor celebrity status and undertaken lucrative careers by being able to make your queries and calculations go faster.

Power BI is no different from any other BI tool or database in this respect. One of the most common causes of failure, and certainly the most common cause of complaints, for any BI project is poor report performance. In general, Power BI is extremely fast, even when you're working with relatively large amounts of data, but if you make a mistake somewhere, write a complex calculation incorrectly, or fail to model your data in the right way, for example, you will end up in trouble. As a Power BI professional, it's essential that you know how to design for performance, how to troubleshoot your reports when things go wrong, and how to rectify your errors.

All of this is why Bhavik's book is so important. Even though performance tuning is such an important and popular topic, I believe this is the first book that has ever been dedicated to performance tuning Power BI. It brings together hints, tips, and best practices that have been scattered around the official documentation, blog posts, videos, and training courses and is enriched by the author's years of personal experience as a program manager on the Power BI development team working with some of the largest Power BI customers in the world. Rather than focusing on one particular aspect of performance tuning, such as DAX, it looks at the subject holistically. As a result, you're holding in your hands (or maybe viewing on your screen) an invaluable resource that could make the difference between the success or the failure of your Power BI project. Study it carefully and follow its advice!

Christopher Webb

Principal Program Manager, Power BI CAT Team; 13-year MVP and author of multiple SSAS and Power BI titles

Contributors

About the author

Bhavik Merchant has nearly 18 years of in-depth BI experience. He is a director of product analytics at Salesforce. Prior to that, he was at Microsoft, first as a cloud solution architect and then as a product manager in the Power BI engineering team. On the Power BI team, he led the customer-facing insights program, responsible for the strategy and technical framework to deliver system-wide usage and performance insights to customers. Before Microsoft, Bhavik spent years managing high-caliber consulting teams delivering enterprise-scale BI projects. He has delivered extensive technical and theoretical BI training over the years, including expert Power BI performance training that he developed for top Microsoft partners globally.

About the reviewers

Suresh Datla has been in the IT industry for over two decades and has vast experience across multiple business and technology domains. He is an architect, adviser, evangelist, and trainer. He has been working on Azure and the Power Platform since their inception, and he works closely with the Microsoft team developing industry vertical solutions. He is a speaker at Microsoft-sponsored events on Power Platform, Power BI, Power BI Premium, security, and performance. He organizes the Southern California Power Platform User Group every month and strongly believes that the success of a platform lies in the strength of its community. Suresh is the principal at Synergis Consulting, where he leads a group of data architects, designers, engineers, and developers.

Vishwanath Muzumdar has more than 8 years' experience in information technology consulting, business analysis, business development, and business process management in the BI space.

He is a Microsoft Power BI developer (champion) specializing in creating powerful visual reports for clients. His goal is to utilize his strong prioritization skills, analytical ability, team management skills, and expertise in the Microsoft Power BI reporting tool to achieve organizational objectives.

Table of Contents

Preface

Part 1: Architecture, Bottlenecks, and Performance Targets

Chapter 1: Setting Targets and Identifying Problem Areas

Defining good performance

Report performance goals

Setting realistic performance targets

Considering areas that can slow you down

Connecting data sources

The Power BI enterprise gateway

Network latency

The Power BI service

Which choices affect performance?

Summary

Chapter 2: Exploring Power BI Architecture and Configuration

Understanding data connectivity and storage modes

Choosing between Import and DirectQuery mode

When DirectQuery is more appropriate

LiveConnect mode

Reaching on-premises data through gateways

How gateways work

Good practices for gateway performance

General architectural guidance

Planning data and cache refresh schedules

Summary

Chapter 3: DirectQuery Optimization

Data modeling for DirectQuery

Optimizing DirectQuery relationships

Configuring for faster DirectQuery

Power BI Desktop settings

Optimizing external data sources

Summary

Part 2: Performance Analysis, Improvement, and Management

Chapter 4: Analyzing Logs and Metrics

Power BI usage metrics

Customizing the usage metrics report

Power BI logs and engine traces

Activity logs and unified audit logs

Analysis Services server traces with the XMLA endpoint

Integration with Azure Log Analytics

Monitoring Azure Analysis Services and Power BI embedded

Summary

Further reading

Chapter 5: Desktop Performance Analyzer

Technical requirements

Overview of Performance Analyzer

Actions and metrics in Performance Analyzer

Determining user actions

Spotting and mitigating performance issues

Achieving consistency in tests

Understanding Performance Analyzer's strengths and limitations

Interpreting and acting on Performance Analyzer data

Exporting and analyzing performance data

Summary

Chapter 6: Third-Party Utilities

Technical requirements

Power BI Helper

Identifying large columns in the dataset

Identifying unused columns

Identifying bi-directional and inactive relationships

Identifying measure dependencies

Tabular Editor

Using Tabular Editor's Best Practice Analyzer

DAX Studio and VertiPaq Analyzer

Analyzing model size with VertiPaq Analyzer

Performance tuning the data model and DAX

Summary

Chapter 7: Governing with a Performance Framework

Establishing a repeatable, pro-active performance improvement process

The performance management cycle

Knowledge sharing and awareness

Helping self-service users

Leveraging professional developers

Approaching performance improvement collaboratively

Applying the performance management cycle to different usage scenarios

Summary

Part 3: Fetching, Transforming, and Visualizing Data

Chapter 8: Loading, Transforming, and Refreshing Data

Technical requirements

General data transformation guidance

Data refresh, parallelism, and resource usage

Improving the development experience

Folding, joining, and aggregating

Leveraging incremental refresh

Using query diagnostics

Collecting Power Query diagnostics

Analyzing the Power Query logs

Optimizing dataflows

Summary

Chapter 9: Report and Dashboard Design

Technical requirements

Optimizing interactive reports

Controlling the visuals and associated queries

Optimizing dashboards

Optimizing paginated reports

Summary

Part 4: Data Models, Calculations, and Large Datasets

Chapter 10: Data Modeling and Row-Level Security

Technical requirements

Building efficient data models

The Kimball theory and implementing star schemas

Reducing dataset size

Avoiding pitfalls with row-level security (RLS)

Summary

Chapter 11: Improving DAX

Technical requirements

Understanding DAX pitfalls and optimizations

The process for tuning DAX

DAX guidance

Summary

Chapter 12: High-Scale Patterns

Technical requirements

Scaling with Power BI Premium and Azure Analysis Services

Leveraging Power BI Premium for data scale

Leveraging Azure Analysis Services for data and user scale

Using partitions with AAS and Premium

Scaling with composite models and aggregations

Leveraging composite models

Leveraging aggregations

Scaling with Azure Synapse and Azure Data Lake

The modern data warehouse architecture

Azure Data Lake Storage

Azure Synapse analytics

Summary

Further reading

Part 5: Optimizing Premium and Embedded Capacities

Chapter 13: Optimizing Premium and Embedded Capacities

Understanding Premium services, resource usage, and Autoscale

Premium capacity behavior and resource usage

Understanding how capacities evaluate load

Managing capacity overload and Autoscale

Capacity planning, monitoring, and optimization

Determining the initial capacity size

Validating capacity size with load testing

Monitoring capacity resource usage and overload

Summary

Chapter 14: Embedding in Applications

Improving Embedded performance

Measuring Embedded performance

Summary

Final Thoughts

Other Books You May Enjoy

Part 1: Architecture, Bottlenecks, and Performance Targets

In this part, we will have a high-level review of the Power BI architecture and identify areas where performance can be affected by design choices. After this part, you will know how to define realistic performance targets.

This part comprises the following chapters:

Chapter 1, Setting Targets and Identifying Problem Areas

Chapter 2, Exploring Power BI Architecture and Configuration

Chapter 3, DirectQuery Optimization

Chapter 1: Setting Targets and Identifying Problem Areas

Many people would consider report performance as the most critical area to focus on when trying to improve the speed of an analytics solution. This is largely true, because it is the most visible part of the system used by pretty much every class of user, from administrators to business executives. However, you will learn that there are other areas of a complete solution that should be considered if performance is to be managed comprehensively. For example, achieving good performance in the reporting layer might be of no consequence if the underlying dataset that powers the report takes a long time to be refreshed or is susceptible to failures due to resource limits or system limits being reached. In this case, users may have great-looking, fast reports that do not provide value due to the data being stale.

The author of this book has experienced the effects of poor report performance firsthand. In one project, a large utility company undertook a major migration from one reporting platform to another from a different vendor. Even though the new platform was technically and functionally superior, the developers tried to copy the old reporting functionality across exactly. This led to poor design choices and very slow report performance. Millions of dollars in licensing and consulting fees were spent, yet most users refused to adopt the new system because it slowed them down so much. While extreme, this example demonstrates the potential ramifications when you do not build good performance into an analytical solution.

In this chapter, you will begin your journey to achieving good and consistent performance in Microsoft Power BI. To introduce the full scope of performance management, we will describe a Power BI solution as a stream of data from multiple sources being consolidated and presented to data analysts and information workers. We will look at how data can be stored in Power BI and the different paths it can take before reaching a user. Many of the architectural design choices made in the early stages of a solution are very difficult and costly to change later. Hence, it is important to have a solid grasp of the implications of these choices and use a data-driven approach to help us decide what is best right at the start.

An area of performance management that is easily overlooked is that of setting performance targets. How do you know whether the experience you are delivering is great, merely acceptable, or poor? We will begin by exploring this theoretical area first to define our goals before diving into technical concepts.

This chapter is broken into the following sections:

Defining good performance

Considering areas that can slow you down

Which choices affect performance?

Defining good performance

With the advent of ever-faster computers and the massive scale of processing available today through cloud computing, business users expect and demand analytical solutions that perform well. This is essential for competitive business decision making. Business Intelligence (BI) software vendors echo this need and tend to promise quick results in their sales and marketing materials. These expectations mean that it is uncommon to find users getting excited about how fast reports are or how fresh data is, because this is implicit to them having a positive experience. Conversely, when users have to wait a long time for a report to load, they are quite vocal and tend to escalate such issues via multiple channels. When these problems are widespread, they can damage the reputation of both a software platform such as Power BI and the teams involved in building and maintaining those solutions. In the worst case, users may refuse to adopt these solutions and management may begin looking for alternative platforms. It's important to think about performance from the outset because it is often very costly and time-consuming to fix performance problems after a solution has reached production, potentially affecting thousands of users.

Report performance goals

Today, most BI solutions are consumed via a web interface. A typical report consumption experience involves not just opening a report, but also interacting with it. In Power BI terms, this translates to opening a report and then interacting with filters, slicers, and report visuals, and navigating to other pages explicitly or via bookmarks and drilling through. With each report interaction, the user generally has a specific intention, and the goal is to not interrupt their flow. A term commonly used in the industry is analysis at the speed of thought. This experience and the related expectations are very similar to navigating regular web pages or interacting with a web-based software system.

Defining good performance for a BI solution is therefore not a complex task: we can take cues from the many studies on web and user interface performance performed over the past two or three decades. Nah, F. (2004) conducted a study focusing on tolerable wait time (TWT) for web users, where TWT was defined as how long users are willing to wait before abandoning the download of a web page. Nah reviewed many earlier studies that explored the thresholds at which users' behavioral intentions are lost and their attitudes begin to turn negative. From this research, we can derive that a well-performing Power BI report should completely load a page, or the result of an interaction, ideally in less than 4 seconds and in most cases in no more than 12 seconds. We should always measure report performance from the user's perspective, which means measuring from the time they request the report (for example, by clicking a report link on the Power BI web portal) until the time the last report visual finishes drawing its results on the screen.

Setting realistic performance targets

Now that we have research-based guidance to set targets, we need to apply it to real-world scenarios. A common mistake is to set a single performance target for every report in the organization and to expect it to be met every single time a user interacts. This approach is flawed because even a well-designed system with heavy optimization could be complex enough to never meet an aggressive performance target. For example, very large dataset sizes (tens of GB) combined with complex nested DAX calculations that are then displayed on multiple hierarchical levels of granularity in a Table visual will naturally need significant time to be processed and displayed. This would generally not be the case with a report working over a small data model (tens of MB) containing a row of simple sum totals, each displayed within a Card visual.

Due to the variability of solution complexity and other factors beyond the developer's control (such as the speed of a user's computer or which web browser they use), it is recommended that you think of performance targets in terms of the typical user experience and acknowledge that there may be exceptions and outliers. Therefore, the performance target metric should consider what the majority of users experience. We recommend report performance metrics that use the 90th percentile of the report load or interaction duration, often referred to as P90. Applying the research guidance on how long a user can wait before becoming frustrated, a reasonable performance target would be a P90 report load duration of 10 seconds or less. This means 90% of report loads should complete in under 10 seconds.
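To make the P90 idea concrete, the following is a minimal Python sketch (not from the book) that checks a set of measured report load durations against a P90 target. The duration values are hypothetical; in practice they could come from Performance Analyzer exports or the usage logs discussed in later chapters, and monitoring tools may interpolate percentiles slightly differently than the simple nearest-rank method used here.

```python
# A minimal sketch, assuming hypothetical duration values: check measured
# report load times against a P90 target of 10 seconds.

def percentile(values, pct):
    """Return the pct-th percentile using the simple nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical end-to-end report load durations in seconds (click to last visual drawn).
load_durations = [2.1, 3.4, 2.8, 5.0, 4.2, 3.9, 6.5, 2.5, 11.8, 3.1]

P90_TARGET_SECONDS = 10
p90 = percentile(load_durations, 90)

print(f"P90 load duration: {p90:.1f}s (target: {P90_TARGET_SECONDS}s)")
print("Within target" if p90 <= P90_TARGET_SECONDS else "Target missed")
```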

However, a single target such as P90 is still not sufficient and we will introduce further ideas about this in Chapter 7, Governing with a Performance Framework. For now, we should consider that there may be different levels of complexity, so it is recommended to set up a range of targets that reflect the complexity of solutions and the tolerance levels of users and management alike. The following table presents an example of a performance target table that could be adopted in an organization:

Figure 1.1 – Example Power BI report performance targets

Next, we will take a look at Power BI from a high level to get a broad understanding of the areas that need to be considered for performance improvement.

Considering areas that can slow you down

The next step in our performance management journey is to understand where time is spent. A Power BI solution is ultimately about exposing data to a user and can be thought of as a flow of data from source systems or data stores, through various Power BI system components, eventually reaching a user through a computer or mobile device. A simplified view of a Power BI solution is presented in Figure 1.2:

Figure 1.2 – Simplified overview of a Power BI solution

Next, we will briefly focus on the different parts of a typical solution to explain why each piece has important considerations for users and the effect poor performance can have. Some of these areas will be covered in more detail in Chapter 2, Exploring Power BI Architecture and Configuration.

Connecting data sources

The following diagram highlights the areas of the solution that are affected when data sources and connectivity methods do not perform well:

Figure 1.3 – Areas affected by data source and connectivity issues

Import mode

When using Import mode datasets, developers can experience sluggish user interface responsiveness when working with Power Query or M in Power BI Desktop. In extreme cases, this can extend data transformation development from hours to days. Once the solution is deployed, problems in this area can cause refresh times to extend or fail. The Power BI service has a refresh limit of 2 hours, while Power BI Premium extends this to 5 hours. Any refresh hitting this time limit will be canceled by the system.
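As an illustration of keeping an eye on refresh durations before they approach these limits, here is a hedged Python sketch (not from the book) that flags long-running refreshes using the Power BI REST API's refresh history endpoint. The access token, workspace ID, dataset ID, and warning threshold are placeholders, and the response field names should be verified against the current API documentation.

```python
# A hedged sketch: flag long-running dataset refreshes via the Power BI REST
# API's refresh history endpoint. The token, workspace ID, dataset ID, and
# threshold are placeholders; verify the response field names (startTime,
# endTime, status) against the current API reference before relying on this.
from datetime import datetime
import requests

ACCESS_TOKEN = "<azure-ad-access-token>"   # placeholder
GROUP_ID = "<workspace-id>"                # placeholder
DATASET_ID = "<dataset-id>"                # placeholder
WARN_AFTER_MINUTES = 60                    # arbitrary threshold for this example

url = (f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
       f"/datasets/{DATASET_ID}/refreshes?$top=10")
response = requests.get(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
response.raise_for_status()

for refresh in response.json().get("value", []):
    if refresh.get("status") != "Completed" or not refresh.get("endTime"):
        continue  # skip failed or still-running refreshes
    start = datetime.fromisoformat(refresh["startTime"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(refresh["endTime"].replace("Z", "+00:00"))
    minutes = (end - start).total_seconds() / 60
    if minutes > WARN_AFTER_MINUTES:
        print(f"Refresh started {start:%Y-%m-%d %H:%M} took {minutes:.0f} minutes")
```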

DirectQuery mode

DirectQuery mode leaves the data at the source and needs to fetch data and process it within Power BI for almost every user interaction. Issues with this part of the configuration most often cause slow reports for users. Visuals will take a longer time to load, and users may get frustrated and then interrupt and interact with other views or filter conditions. This itself can issue more queries and ironically slow down the report even further by placing additional load on the external source system.

Live connection mode

Live connection mode originally referred exclusively to connections to external Analysis Services deployments, which could be cloud-native (Azure Analysis Services) or on-premises (SQL Server Analysis Services). More recently, this mode was extended to more use cases with the introduction of shared datasets and the ability to connect Power BI Desktop to build a report against a published dataset in the Power BI service. Since the underlying dataset could be Import or DirectQuery mode, the experience may vary as described in previous sections.

The Power BI enterprise gateway

The Power BI gateway is a middleware component used to connect to external data sources. It is usually deployed within the same physical or virtual network as those data sources, and it establishes a secure outgoing connection to Power BI, over which it can send data to satisfy report queries and data refresh requests.

Figure 1.4 – Power BI enterprise gateway

The gateway is not just a conduit for data, which is a common misconception. In addition to providing authenticated and approved connections to data sources, it contains the mashup engine that performs data transformations and compresses data before sending it to the Power BI service. When the gateway is not optimized, it can lead to long-running data refreshes, data refresh failures, slow report interactions, or visuals failing to load due to query timeouts.

Network latency

Network latency is the time a piece of information takes to travel from one point to another in a network. It is expressed in milliseconds and is typically measured by performing a ping. A ping measures the time taken to send a small packet of information to a destination and receive a response acknowledging the message. Ping times that reach seconds can be problematic. The main drivers of network latency are geographical distance, the number of hops the information needs to take along the way, and how busy the networks are overall.

The following diagram highlights the possible paths that data takes within Power BI. It's worth noting that each individual arrow could have different latency, which means effects can be felt disproportionately by certain users or in certain parts of the solution.

Figure 1.5 – Data movement affected by network latency

High network latency is felt most when users are interacting with reports. It is a significant contributor to slow performance, primarily when there are many visuals in a report and therefore many queries to be executed. This is because such configurations require many individual pieces of information to be sent and received, and each one is affected by the latency.
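The following is a rough Python sketch (not from the book) that approximates round-trip latency by timing a TCP connection to an illustrative endpoint and then shows how that latency compounds across a hypothetical page with many visuals. Real visuals issue their queries with some parallelism, so the sequential total shown here is a worst case rather than a prediction.

```python
# A rough sketch: approximate round-trip latency by timing a TCP connection to
# an illustrative endpoint, then estimate the overhead for a hypothetical page
# with many visuals. Queries run with some parallelism in practice, so the
# sequential total is a worst case.
import socket
import time

HOST, PORT = "api.powerbi.com", 443   # illustrative endpoint, not a recommendation
VISUALS_ON_PAGE = 30                  # hypothetical report page

start = time.perf_counter()
with socket.create_connection((HOST, PORT), timeout=5):
    pass
latency_ms = (time.perf_counter() - start) * 1000

print(f"Approximate round-trip latency: {latency_ms:.0f} ms")
print(f"Latency overhead if {VISUALS_ON_PAGE} queries ran sequentially: "
      f"{latency_ms * VISUALS_ON_PAGE / 1000:.1f} s")
```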

The Power BI service

The Power BI service is the central part of any Power BI solution. The system components in the service are largely out of the control of developers and users. The stability and performance of these are monitored by Microsoft. The exception is Power BI Premium and Embedded, where the underlying infrastructure is still managed by Microsoft, but your administrators have many choices for managing their dedicated capacity. This will be covered in detail in Chapter 13, Optimizing Premium and Embedded Capacities.

Figure 1.6 – The Power BI service

The major component of the Power BI service that is under your control is the Analysis Services engine, which sits at the core of any Power BI solution. Even with the Power BI service running efficiently under Microsoft's management, poor design choices related to Analysis Services data modeling and DAX calculations can lead to very large datasets, high memory usage, and slow query execution. This generally translates to slow reports. On Premium/Embedded capacities, Analysis Services issues can have an amplified effect because they can affect multiple datasets on the capacity.

The final section of this chapter identifies specific areas in Power BI where you can achieve the same result with different design patterns. The choices you make here can affect performance.

Which choices affect performance?

While there are many aspects of each individual Power BI component that can be optimized for performance, the following list is a good summary that can serve as a checklist for every solution:

Inappropriate use of DirectQuery/Import: Decisions here balance model size and refresh time with data freshness and report interactivity.

Power Query design: Decisions here may fail to leverage the data source's native capabilities and therefore also fail to avoid additional work in the mashup engine.

Data modeling: Decisions here may make the data model unnecessarily large, waste memory, consume more computing resources, and affect usability.

Inefficient DAX calculations: Decisions here may fail to leverage the highly efficient internal VertiPaq Storage Engine and force operations in the Formula Engine.

Complex or inefficient row-level security: Decisions here may create intensive calculations to resolve which rows the user can see.

Poorly designed reports: Decisions here can put too much load on the user's device.

Data source or network latency: Decisions here may place the data far away from the user.

Now that you have learned about the high-level areas of a solution that we need to consider to fully optimize performance, let's summarize the key learnings from this chapter.

Summary

As we have seen in this chapter, interacting with analytical reports is very similar to interacting with other web applications, so the user's level of engagement and satisfaction can be measured in similar ways. Studies of user interfaces and web browsing suggest that a report that loads in less than 4 seconds is ideal. They also suggest that reports taking 10-12 seconds or longer should be examined carefully, as this is the point at which users become frustrated.

You should set performance targets and be prepared for outliers by measuring against the 90th percentile (P90). Success may still require setting the right expectations by having different targets if you have highly complex reports.

It is important to remember that each component of Power BI and even the network itself can contribute to performance issues. Therefore, performance issues cannot be solved in isolation (for example, by only adjusting reports). This may require coordination with multiple teams and external vendors, particularly in large organizations.

In the next chapter, we will focus on the internal VertiPaq Storage Engine in Power BI to learn how we can get it to optimize storage for us. We will also look at gateway optimization and general architectural advice to make sure the environment does not become a bottleneck.

Chapter 2: Exploring Power BI Architecture and Configuration

In the previous chapter, we established guidelines for setting reasonable performance targets and gained an understanding of the major solution areas and Power BI components that should be considered for holistic performance management.

In this chapter, you will dig deeper into specific architectural choices, learning how and why these decisions affect your solution's performance. You will learn to consider broad requirements and make an informed decision to design a solution that meets the needs of different stakeholders. Ultimately, this chapter will help you choose the best components to host your data within Power BI. We will focus mainly on the efficient movement of data from the source system to end users by improving data throughput and minimizing latency.

We will begin by looking at data storage modes in Power BI and how the data reaches the Power BI dataset. We will cover how to best deploy Power BI gateways, which are commonly used to connect to external data sources. These aspects are important because users often demand up-to-date or historical data, and very large deployments can involve thousands of concurrent users.

This chapter is broken down into the following sections:

Understanding data connectivity and storage modes

Reaching on-premises data through gateways

General architectural guidance

Understanding data connectivity and storage modes

Choosing a data connectivity and storage mode is usually the first major decision that must be made when setting up a brand-new solution in Power BI. This means choosing between Import and DirectQuery, which we introduced in the previous chapter. Within Power BI Desktop, you need to make this decision as soon as you connect to a data source and before you can see a preview of the data to begin modeling.

Important Note

Not every data connector in Power BI supports DirectQuery mode. Some only offer Import mode. You should be aware of this because it means you may need to use other techniques to maintain data freshness when a dataset combines different data sources.

Figure 2.1 shows a SQL Server data source connection offering both Import and DirectQuery modes:

Figure 2.1 – Data connectivity options for a SQL Server source

Excel workbooks can only be connected in Import mode. Figure 2.2 demonstrates this: we see only a Load button, with no choice of data connectivity mode, which implies Import mode.

Figure 2.2 – Data connection for Excel showing no Import/DirectQuery choice

Choosing between Import and DirectQuery mode