cancel
Showing results for 
Search instead for 
Did you mean: 
Reply

Limit Dimension Data based on Fact Table in Power Query

There are 2 tables:

1. Dimension Table D - EMPNO (25,000,000 rows) sourced from Oracle DB

2. Fact Table F - Excel source

join on D.EMPNO=F.EMPNO using Merge queries in Power Query.

 

The performance is extremely slow using Import/DirectQuery mode. I tried to limit the rows in Dimension table by following the below steps:

1. Buffered List to extract unique list of EMPNO in F - EmpList 

Emplist = let
Source = List.Distinct(F[EMPNO]),
#"Buffered List" = List.Buffer(Source)
in
#"Buffered List"

2. Filter Dimension D  i.e. Table.SelectRows(#"Changed Type",each List.Contains(EmpList, [EMPNO]))

 

It works but it is extremely slow when "Enable Load" is enabled for Dimension D. Although the Step (2) returns few rows(129 rows), when I apply the changes in Power Query, it loads all the rows for Dimension D.

 

I was hoping to get some guidance if there is anything wrong or can be done differently.

 

 

22 REPLIES 22

Hi @BA_Pete 

The report works fine based on the established relationship.

But the filter the Dim table based on Buffered list and save the changes steps takes a long time.

Hi @LD2022 ,

 

Ok. My guess is that you're applying a query step to your dimension table that is breaking query folding or causing unnecesary processing. Using the method that we have, Power Query should be streaming the F[EMPNO] values to your SQL server in a native query, and should be really fast. Can you right-click the final step in your dimension query and check whether the 'Native Query' option is lit up or greyed-out please?

 

If it's greyed-out, select your dimension table as we have it set up now, go to the Home tab, select Advanced Editor and copy all the code in there. Then paste the whole lot into a code window here using this button:

 

Before you close the code window, please just overtype any sensitive connection string values (file paths, server paths etc.) with 'XXX', but keep the code structure itself, as this may be important.

I should be able to see if there's anything breaking query folding or causing unnecessary processing time that we can work around.

 

Pete

Hi @BA_Pete 

Below are the steps I have done:

DIM_EMPLOYEE=

let
    Source = Oracle.Database("XXX", [HierarchicalNavigation=true, Query="SELECT * FROM XX.DIM_EMPLOYEE"])
    
in
    Source

 

Create a reference table as below:

Filtered_Emp=

Filtered_Emp= let
    Source = DIM_EMPLOYEE,
    #"Filtered Rows" = Table.SelectRows(Source, each List.Contains(List.Buffer(EmpList),[CUSTOMER_NUMBER]))
in
    #"Filtered Rows"

 

The native query is greyed out.

I have disabled load for everything except the F and  Filtered_Emp. When I hit apply in Power Query, it tries to load all the 2 million records athough the Filtered_Emp has 200 records only.

 

 

Ok. Your current query setup won't allow query folding.

Can you try connecting without using an explicit SQL statement please?

 

Connect to the DB again, and leave the SQL statement blank:

 

Just hit OK, then select your dim_employee table from the table list and import.

 

Assuming this goes as planned, select the [EMPNO] column in your new dimension table and filter just one value. This should be fast to implement and sets us up a new step with the correct structure.

Edit that new step in the formula bar, so it uses our streaming list filter, something like this:

= Table.SelectRows(
    previousStepName,
    each List.Contains( List.Buffer( factTableName[EMPNO] ), [EMPNO] )
)

 

This setup should allow PQ to stream the [EMPNO] values from your Excel source into the SQL native query sent to the SQL source for your dim_employee table.

 

Pete 

Hi @BA_Pete 

I followed the steps that you suggested but still no change in the performance.

let
    Source = Oracle.Database("XXXX", [HierarchicalNavigation=true]),
    XX = Source{[Schema="XX"]}[Data],
    DIM_EMPLOYEE = XX{[Name="DIM_EMPLOYEE"]}[Data],
    #"Filtered Rows" = Table.SelectRows(DIM_EMPLOYEE, each List.Contains(List.Buffer(Fact[EMPNO]),[EMPNO]))
in
    #"Filtered Rows"

 

Hi @LD2022 ,

 

If you right-click your #"Filtered Rows" step, is 'Native Query' now selectable, or is it still greyed-out?

If it's still greyed out, try changing one of the column data types to the same type, but not a text column. For example, change a date type column to date type, or a decimal type column to decimal. This is purely to force PQ to re-evaluate the query and hopefully force it to generate a native query to source.

If native query IS selectable, then you're probably at the limit of optimisation. The Oracle Data Access Client (ODAC) requirement isn't the best in this regard.

 

Pete

 

Pete

Hi @BA_Pete 

I tried to perform the suggested steps from scratch and when I am applying the filter to limit the rows in the dimension, it says"This stepsresults in a query that is not supported in DirectQuery mode".

Hi @LD2022 ,

 

You shouldn't need to use Direct Query mode any more.

I believe you only wanted to use DQ due to the large number of rows in your dimension table. If we can get this method to fold to the source, it will work very quickly in Import mode as you will only actually be importing the number of rows that correlate to to values in your fact table (about 126 IIRC).

Go to your Model view in PBI Desktop, select your DQ dimension table, go to the Properties pane on the right, expand 'Advanced' at the bottom of this pane and choose Import. Then go and check your dimension query in PQ to see if the last step has 'Native Query' lit up.

 

Pete

Hi @BA_Pete 

It worked 🙂 But I also noticed that the moment I add any steps to the Dimension i.e. rename column/change datatype, the Native query option is greyed out and it started loading the data again. I will probably add a note in PQ not to add any steps in future else it might stop working.

 

Thank you so much @BA_Pete for the solution.

Hi @BA_Pete 

It worked 🙂 But I also noticed that the moment I add any steps to the Dimension i.e. rename column/change datatype, the Native query option is greyed out and it started loading the data again. I will probably add a note in PQ not to add any steps in future else it might stop working.

 

Thank you so much @BA_Pete for the solution.

Helpful resources

Announcements

Community will be READ ONLY July 16th, 5p PDT -July 22nd

Dear Community Members,   We'd like to let you know of an upcoming change to the community platform: starting July 16th, the platform will transition to a READ ONLY mode until July 22nd.   During this period, members will not be able to Kudo, Comment, or Reply to any posts.   On July 22nd, please be on the lookout for a message sent to the email address registered on your community profile. This email is crucial as it will contain your unique code and link to register for the new platform encompassing all of the communities.   What to Expect in the New Community: A more unified experience where all products, including Power Apps, Power Automate, Copilot Studio, and Power Pages, will be accessible from one community.Community Blogs that you can syndicate and link to for automatic updates. We appreciate your understanding and cooperation during this transition. Stay tuned for the exciting new features and a seamless community experience ahead!

Summer of Solutions | Week 4 Results | Winners will be posted on July 24th

We are excited to announce the Summer of Solutions Challenge!   This challenge is kicking off on Monday, June 17th and will run for (4) weeks.  The challenge is open to all Power Platform (Power Apps, Power Automate, Copilot Studio & Power Pages) community members. We invite you to participate in a quest to provide solutions in the Forums to as many questions as you can. Answers can be provided in all the communities.    Entry Period: This Challenge will consist of four weekly Entry Periods as follows (each an “Entry Period”)   - 12:00 a.m. PT on June 17, 2024 – 11:59 p.m. PT on June 23, 2024 - 12:00 a.m. PT on June 24, 2024 – 11:59 p.m. PT on June 30, 2024 - 12:00 a.m. PT on July 1, 2024 – 11:59 p.m. PT on July 7, 2024 - 12:00 a.m. PT on July 8, 2024 – 11:59 p.m. PT on July 14, 2024   Entries will be eligible for the Entry Period in which they are received and will not carryover to subsequent weekly entry periods.  You must enter into each weekly Entry Period separately.   How to Enter: We invite you to participate in a quest to provide "Accepted Solutions" to as many questions as you can. Answers can be provided in all the communities. Users must provide a solution which can be an “Accepted Solution” in the Forums in all of the communities and there are no limits to the number of “Accepted Solutions” that a member can provide for entries in this challenge, but each entry must be substantially unique and different.    Winner Selection and Prizes: At the end of each week, we will list the top ten (10) Community users which will consist of: 5 Community Members & 5 Super Users and they will advance to the final drawing. We will post each week in the News & Announcements the top 10 Solution providers.  At the end of the challenge, we will add all of the top 10 weekly names and enter them into a random drawing.  Then we will randomly select ten (10) winners (5 Community Members & 5 Super Users) from among all eligible entrants received across all weekly Entry Periods to receive the prize listed below. If a winner declines, we will draw again at random for the next winner.  A user will only be able to win once overall. If they are drawn multiple times, another user will be drawn at random.  Individuals will be contacted before the announcement with the opportunity to claim or deny the prize.  Once all of the winners have been notified, we will post in the News & Announcements of each community with the list of winners.   Each winner will receive one (1) Pass to the Power Platform Conference in Las Vegas, Sep. 18-20, 2024 ($1800 value). NOTE: Prize is for conference attendance only and any other costs such as airfare, lodging, transportation, and food are the sole responsibility of the winner. Tickets are not transferable to any other party or to next year’s event.   ** PLEASE SEE THE ATTACHED RULES for this CHALLENGE**   Week 1 Results: Congratulations to the Week 1 qualifiers, you are being entered in the random drawing that will take place at the end of the challenge. Community MembersNumber of SolutionsSuper UsersNumber of Solutions @anandm08  23 @WarrenBelz  31 @DBO_DV  10 @Amik  19 AmínAA 6 @mmbr1606  12 @rzuber  4 @happyume  7 @Giraldoj  3@ANB 6 (tie)   @SpongYe  6 (tie)     Week 2 Results: Congratulations to the Week 2 qualifiers, you are being entered in the random drawing that will take place at the end of the challenge. Community MembersSolutionsSuper UsersSolutions @anandm08  10@WarrenBelz 25 @DBO_DV  6@mmbr1606 14 @AmínAA 4 @Amik  12 @royg  3 @ANB  10 @AllanDeCastro  2 @SunilPashikanti  5 @Michaelfp  2 @FLMike  5 @eduardo_izzo  2   Meekou 2   @rzuber  2   @Velegandla  2     @PowerPlatform-P  2   @Micaiah  2     Week 3 Results: Congratulations to the Week 3 qualifiers, you are being entered in the random drawing that will take place at the end of the challenge.   Week 3:Community MembersSolutionsSuper UsersSolutionsPower Apps anandm0861WarrenBelz86DBO_DV25Amik66Michaelfp13mmbr160647Giraldoj13FLMike31AmínAA13SpongYe27     Week 4 Results: Congratulations to the Week 4 qualifiers, you are being entered in the random drawing that will take place at the end of the challenge.   Week 4:Community MembersSolutionsSuper UsersSolutionsPower Apps DBO-DV21WarranBelz26Giraldoj7mmbr160618Muzammmil_0695067Amik14samfawzi_acml6FLMike12tzuber6ANB8   SunilPashikanti8

Check Out | 2024 Release Wave 2 Plans for Microsoft Dynamics 365 and Microsoft Power Platform

On July 16, 2024, we published the 2024 release wave 2 plans for Microsoft Dynamics 365 and Microsoft Power Platform. These plans are a compilation of the new capabilities planned to be released between October 2024 to March 2025. This release introduces a wealth of new features designed to enhance customer understanding and improve overall user experience, showcasing our dedication to driving digital transformation for our customers and partners.    The upcoming wave is centered around utilizing advanced AI and Microsoft Copilot technologies to enhance user productivity and streamline operations across diverse business applications. These enhancements include intelligent automation, AI-powered insights, and immersive user experiences that are designed to break down barriers between data, insights, and individuals. Watch a summary of the release highlights.    Discover the latest features that empower organizations to operate more efficiently and adaptively. From AI-driven sales insights and customer service enhancements to predictive analytics in supply chain management and autonomous financial processes, the new capabilities enable businesses to proactively address challenges and capitalize on opportunities.    

Updates to Transitions in the Power Platform Communities

We're embarking on a journey to enhance your experience by transitioning to a new community platform. Our team has been diligently working to create a fresh community site, leveraging the very Dynamics 365 and Power Platform tools our community advocates for.  We started this journey with transitioning Copilot Studio forums and blogs in June. The move marks the beginning of a new chapter, and we're eager for you to be a part of it. The rest of the Power Platform product sites will be moving over this summer.   Stay tuned for more updates as we get closer to the launch. We can't wait to welcome you to our new community space, designed with you in mind. Let's connect, learn, and grow together.   Here's to new beginnings and endless possibilities!   If you have any questions, observations or concerns throughout this process please go to https://aka.ms/PPCommSupport.   To stay up to date on the latest details of this migration and other important Community updates subscribe to our News and Announcements forums: Copilot Studio, Power Apps, Power Automate, Power Pages

Users online (1,816)