Microsoft Fabric Preview Projects: There’s a Different Fabric for Each Use Case, and That’s a Good Thing
Since Microsoft Fabric was released in public preview earlier this year, iLink has worked with our customers to develop end-to-end analytics projects using the new platform. We have had the opportunity to put the claims of an all-in-one enterprise-scale software as a service (SaaS) analytics platform to the test, and in this post, we’ll share what we’ve learned, what we’re excited about, and our overall impression.
What is Microsoft Fabric?
Microsoft Fabric is a single SaaS platform that combines seven core analytics workloads into one product, simplifying and unifying governance, security, administration, billing, navigation, and monitoring. The workloads are data integration (Data Factory), data engineering, data warehousing, data science, real-time analytics, Power BI, and Data Activator, a new tool that automates actions from data. The cherry on top of this multi-layered platform is Copilot, which promises to simplify the creation of Power BI reports, surface insights more easily, support the development of code, and more.
What new tools and features does Fabric offer?
Fabric takes many of the analytics products previously offered by Microsoft, including Power BI, Azure Data Factory, and Synapse Analytics, and brings them together under one roof. However, Fabric is more than just a combination and rebranding of existing tools. A few of the new tools and features that are part of the service include:
- OneLake: OneLake (the OneDrive for data) is the storage layer and the foundation for all of the workloads in Fabric. OneLake is a multi-cloud data lake that is automatically provisioned for an entire tenant. Delta, the storage format for Fabric workloads, is open, meaning that tools beyond Fabric can easily access and work with data stored in OneLake. Additionally, to reduce duplication of data, OneLake provides two key capabilities: shortcuts and mirroring. Shortcuts allow seamless connections between OneLake and external storage. Mirroring (in preview) provides near-real-time data replication from other cloud data warehouses to Fabric, without needing to build pipelines. Finally, all of the Fabric workloads, including Power BI, have been optimized to store and retrieve data quickly in the Delta format.
- Direct Lake mode for Power BI datasets: In addition to import mode and DirectQuery mode for Power BI datasets, Fabric brings us the best of both worlds with Direct Lake mode. Direct Lake mode gives us the real-time capability of DirectQuery but with the speed of import. This can be a huge relief in those cases where import mode dataset refreshes become challenging, for example.
- Data warehouses and lakehouses: Storing data in a Fabric data warehouse and/or lakehouse within the familiar environment of Power BI allows both data engineering experts and non-experts to meet their data storage, transformation, and access needs in one place, without the need for complex set-up or configuration. It also provides many more methods and tools to access data used for Power BI reports beyond DAX and visualization tools. For example, when data is stored in a lakehouse and Power BI uses Direct Lake mode to access the data, other personas can choose to query it with T-SQL via the SQL endpoint or with Python via a Spark notebook.
- Single compute model for all workloads: With Fabric, you purchase a certain level of compute and all of the workloads come from the same pool. This simplifies planning for workloads and billing. Instead of creating different subscriptions and groups for each workload, you simply purchase an F SKU (stock keeping unit), and all the workloads draw from that compute. With the Fabric Capacity Metrics app, it’s possible to see how much of your compute is being used by each workload over time and modify accordingly. With both a pay-as-you-go model and a reserved instance pricing model, this has the potential to simplify monitoring and save money.
- Spark notebooks: Notebooks open all kinds of possibilities, from data engineering to data analytics and data science. Spark compute is incredibly fast and increasingly used for many workloads. Combining notebooks with orchestration through Data Factory pipelines is one of many ways to bring data into OneLake and transform it.
- Semantic Link: Semantic Link is a new feature that connects Power BI datasets with data science work in Fabric notebooks. This opens many possibilities, including documentation and governance scenarios that query which datasets exist in a tenant, development of machine learning models using Power BI data, and data quality validation that queries both Power BI datasets and source data within a single tool.
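The single compute model described in the list above can be illustrated with a small Python sketch. Everything in it is hypothetical: the workload names, the usage figures, and the `summarize_pool` helper are made up for illustration, and this is not how the Fabric Capacity Metrics app calculates its numbers. The one assumption carried over from Fabric is that the number in an F SKU name corresponds to its capacity units (for example, an F8 provides 8).

```python
from typing import Dict


def summarize_pool(capacity_units: float, usage: Dict[str, float]) -> Dict[str, float]:
    """Return each workload's share of one shared compute pool, plus headroom.

    usage maps a workload name to the capacity units it consumed on average.
    """
    if capacity_units <= 0:
        raise ValueError("capacity_units must be positive")
    # Every workload is measured against the same pool, not its own quota.
    shares = {name: used / capacity_units for name, used in usage.items()}
    # Whatever no workload used is the remaining headroom in the pool.
    shares["headroom"] = 1.0 - sum(usage.values()) / capacity_units
    return shares


# Hypothetical average usage on an F8 capacity (8 capacity units).
usage = {"spark_notebooks": 3.0, "warehouse": 2.0, "power_bi": 1.0}
summary = summarize_pool(8, usage)

for name, share in summary.items():
    print(f"{name}: {share:.1%}")
```

In this toy model, a small or negative headroom would suggest that the capacity is undersized for the combined workloads, which is the kind of signal that would prompt a change in SKU.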
What doesn’t change?
Because all of the non-Power BI workloads were brought into the Power BI interface, Power BI users have the home-court advantage of a familiar working environment. Power BI itself still functions as before, but now has many added features such as Direct Lake mode. Fabric just brings new features, tools, and licensing options to the game.
Lessons learned during the public preview
From iLink’s implementations of Fabric with our customers and through our own internal tinkering, we’ve had the opportunity to kick the tires and get a look under the hood. We’re happy to share some of the things we’ve learned, including:
- Direct Lake mode really is a game-changer for Power BI datasets. By being able to query billions of rows of data and get nearly instantaneous results without needing an imported dataset refresh, our customers gain simplicity at a massive scale. In cases where we might have needed to build out complicated incremental refresh policies, custom partitions, or aggregate tables in the past, Direct Lake mode instead allows us to directly access the data we need.
- It’s also still very important to follow data modeling best practices to ensure optimal performance.
- Direct Lake datasets have row limits, and a dataset that exceeds them falls back to DirectQuery. When these limits are published, it will be important to pay attention to them.
- Different workloads, particularly within engineering, have different levels of performance. For example, data transformation through Spark notebooks currently performs much better than transformation through Dataflows Gen2.
- With Fabric there is often more than one way to accomplish a goal. It’s important to test out various methods to determine which one is the right fit for your scenario based on factors such as cost, performance, and team skills.
- With a greater range of SKUs now available, there is a much more affordable entry point for getting started with Fabric. The lowest price for an F SKU is $263/month (in US West) on a pay-as-you-go basis, and reserved instance pricing will be even lower. Compared to the lowest Power BI Premium capacity, a P1 at $4,995/month, this opens Power BI Premium and Fabric features to a much broader range of organizations.
- It’s critical to monitor your workloads with the Fabric Capacity Metrics app to ensure that you’re using the right capacity or capacities.
- A well-thought-out workspace and capacity strategy is a must if cost is important.
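To put the entry-point pricing from the list above in perspective, here is a minimal arithmetic sketch in Python. It uses only the two monthly prices quoted in this post. The constant and function names are ours, and actual prices vary by region and over time.

```python
# Monthly prices quoted in this post (USD); real prices vary by region and date.
F_SKU_ENTRY_PER_MONTH = 263   # lowest F SKU, pay-as-you-go, US West
P1_PER_MONTH = 4995           # lowest Power BI Premium capacity (P1)


def price_ratio(entry: float, premium: float) -> float:
    """Return the entry price as a fraction of the premium price."""
    if premium <= 0:
        raise ValueError("premium price must be positive")
    return entry / premium


ratio = price_ratio(F_SKU_ENTRY_PER_MONTH, P1_PER_MONTH)
monthly_difference = P1_PER_MONTH - F_SKU_ENTRY_PER_MONTH

print(f"Entry F SKU costs {ratio:.1%} of a P1")          # 5.3%
print(f"Monthly difference: ${monthly_difference:,}")    # $4,732
```

This compares entry points, not equivalent amounts of compute: the smallest F SKU provides far less capacity than a P1, so the figures show how much lower the barrier to entry is rather than the saving for a like-for-like workload.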
In working with Fabric for the past year, both while it was in private and public preview, iLink has had the opportunity to work with customers and the Fabric product team directly. Our successful implementations with both larger (Fortune 500) and smaller organizations have allowed iLink to become a Microsoft Fabric Featured Partner. This has given us a unique perspective on the product.
From our experiences and conversations, we see great potential for Microsoft Fabric, both for new and existing customers. Microsoft understands that many customers have already implemented large-scale analytics projects and have an existing foundation to work with. We have seen that Fabric plays well with others in scenarios like this, but that it also has all the components to support an end-to-end analytics solution at a massive scale. What this means is that Fabric doesn’t require you to fit into a specific use case for it to work for you. Instead, all the Fabric building blocks can be put together to fit your unique needs.
Fabric has the potential to increase developer and analyst productivity while reducing overall cost of ownership. On top of that, the introduction of Copilot within the various workloads of Fabric has the potential to be a true game changer. The 2023 “Total Economic Impact of Microsoft Fabric” Forrester report concludes that Microsoft Fabric, as an end-to-end platform, will help increase productivity and reduce spend, leading to an overall TCO reduction and much higher ROI. From what we’ve seen so far, we agree with their assessment.
We’re excited to continue working with customers on Fabric implementations and feel confident that Fabric is going in the right direction.
Contact us today to learn more about Microsoft Fabric and what it can do in your unique analytics scenarios.