06-26-2022 18:20 PM - last edited 06-27-2022 08:49 AM
Update & Create Excel Records 50-100x Faster
I was able to develop an Office Script to update rows and an Office Scripts to create rows from Power Automate array data. So instead of a flow creating a new action API call for each individual row update or creation, this flow can just send an array of new data and the Office Scripts will match up primary key values, update each row it finds, then create the rows it doesn't find.
And these Scripts do not require manually entering or changing any column names in the Script code.
• In testing for batches of 1000 updates or creates, it's doing ~2000 row updates or creates per minute, 50x faster than the standard Excel create row or update row actions at max 50 concurrency. And it accomplished all the creates or updates with less than 4 actions or only .4% of the standard 1000 action API calls.
• The Run Script code for processing data has 2 modes, the Mode 2 batch method that saves & updates a new instance of the table before posting batches of table ranges back to Excel & the Mode 1 row by row update calling on the Excel table.
The Mode 2 script batch processing method will activate for updates on tables less than 1 million cells. It does encounter more errors with larger tables because it is loading & working with the entire table in memory.
Shoutout to Sudhi Ramamurthy for this great batch processing addition to the template!
Code Write-Up: https://docs.microsoft.com/en-us/office/dev/scripts/resources/samples/write-large-dataset
Video: https://youtu.be/BP9Kp0Ltj7U
The Mode 1 script row by row method will activate for Excel tables with more than 1 million cells. But it is still limited by batch file size so updates on larger tables will need to run with smaller cloud flow batch sizes of less than 1000 in a Do until loop.
The Mode 1 row by row method is also used when the ForceMode1Processing field is set to Yes.
Be aware that some characters in column names, like \ / - _ . : ; ( ) & $ may cause errors when processing the data. Also backslashes \ in the data, which are usually used to escape characters in strings, may cause errors when processing the JSON.
Version 7 Note
Diverting from what is shown in the video, I was able to remove almost all the flow actions in the "Match new and existing data key values then batch update" scope after replicating their functions in the scripts themselves. The flow now goes directly from the "SelectGenerateData" action to the "Run script Update Excel rows" action and the script handles matching up the UpdatedData JSON keys/field names to the destination table headers.
Also version 7 changes the primary key set up in the SelectGenerateData and changes the logic for skipping cell value updates & blanking out cell values.
Now the primary key column name from the destination table must be present in the SelectGenerateData action with the dynamic content for the values you want to match up to update. No more 'PrimaryKey' line, the update script will automatically reference the primary key column in the SelectGenerateData data based on the PrimaryKeyColumnName input on the update script action in the flow.
Now leaving a blank "" in the SelectGenerateData action will make that cell value in the Excel table empty, while leaving a null in the SelectGenerateData action will skip updating / not alter the cell value that currently exists in the Excel table there.
Version 6 which looks closer to the version shown in the video can still be accessed here: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Excel-Batch-Create-Update-and-Upsert/m-p...
Version 7 Set-Up Instructions
Go to the bottom of this post & download the BatchExcel_1_0_0_xx.zip file. Go to the Power Apps home page (https://make.powerapps.com/). Select Solutions on the left-side menu, select Import solution, Browse your files & select the BatchExcel_1_0_0_xx.zip file you just downloaded. Then select Next & follow the menu prompts to apply or create the required connections for the solution flows.
Once imported, find the Batch Excel solution in the list of solution & click it to open the solution. Then click on the Excel Batch Upserts V7 item to open the flow. Once inside the flow, delete the PlaceholderValue Delete after import action.
Open the Office Script Batch Update compose action to the BatchUpdateV7 script code. Select everything inside the compose input & control + C copy it to the clipboard.
Then find & open an Excel file in Excel Online. Go to the Automate tab, click on All Scripts & then click on New Script.
When the new script opens, select everything in the script & control + V paste the BatchUpdateV7 script from the clipboard into the menu. Then rename the script BatchUpdateV7 & save it. That should make the BatchUpdateV7 reference-able in the later Run script flow action.
Do the same process to import the BatchCreateV7 script.
Then go to the List rows Sample source data action. If you are going to use an Excel table as the source of updated data, then you can fill in the Location, Document Library, File, & Table information on this action. If you are going to use a different source of updated data like SharePoint, SQL, Dataverse, an API call, etc, then delete the List rows Sample source data placeholder action & insert your new get data action for your other source of updated data.
Following that, go to the Excel batch update & create scope. Open the PrimaryKeyColumnName action, remove the placeholder values in the action input & input the column name of the unique primary key for your destination Excel table. For example, I use ID for the sample data.
Then go to the SelectGenerateData action.
If you replaced the List rows Sample source data action with a new get data action, then you will need to replace the values dynamic content from that sample action with the new values dynamic content of your new get data action (the dynamic content that outputs a JSON array of the updated data).
In either case, you will need to input the table header names from the destination Excel table on the left & the dynamic content for the updated values from the updated source action on the right. You MUST include the column header for the destination primary key column & the the primary key values from the updated source data here for the Update script to match up what rows in the destination table need which updates from the source data. All the other columns are optional / you only need to include them if you want to update their values.
After you have added all the columns & updated data you want to the SelectGenerateData action, then you can move to the Run script Update Excel rows action. Here add the Location, Document Library, File Name, Script, Table Name, Primary Key Column Name, ForceMode1Processing, & UpdatedData. You will likely need to select the right-side button to switch the UpdatedData input to a single array input before inserting the dynamic content for the SelectGenerateData action.
Then open the Condition If new records continue else skip action & open the Run script Create Excel rows action. In this run script action input the Location, Document Library, Script, Table Name, & CreateData. You again will likely need to select the right-side button to change the CreateData input to a single array input before inserting the dynamic content for the Filter array Get records not found in table output.
If you need just a batch update, then you can remove the Filter array Get records not found in table & Run script Create Excel rows actions.
If you need just a batch create, then you can replace the Run script Batch update rows action with the Run script Batch create rows action, delete the update script action, and remove the remaining Filter array Get records not found in table action. Then any updated source data sent to the SelectGenerateData action will just be created, it won't check for rows to update.
Thanks for any feedback,
Please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86).
And reach out on LinkedIn (https://www.linkedin.com/in/kolota/) if you want to hire me to consult or build more custom Microsoft solutions for you.
Office Script Code
(Also included in a Compose action at the top of the template flow)
Batch Update Script Code: https://drive.google.com/file/d/1kfzd2NX9nr9K8hBcxy60ipryAN4koStw/view?usp=sharing
Batch Create Script Code: https://drive.google.com/file/d/13OeFdl7em8IkXsti45ZK9hqDGE420wE9/view?usp=sharing
(ExcelBatchUpsertV7 is the core piece, ExcelBatchUpsertV7b includes a Do until loop set-up if you plan on updating and/or creating more than 1000 rows on large tables.)
watch?v=HiEU34Ix5gA
Version 1.2
I figured out how to set things up so if someone leaves an input column blank or uses a '' double single quote blank in a formula, then it doesn't send anything to the Script to update, so it doesn't touch each cell of that column that got the ''.
But if someone enters a null value from the expressions, then it will send the Script to update that column or cell and put in an empty value. I believe that is the same way it is for most standard SharePoint & Excel actions.
I then found a better way to structure some expressions & allow users to update on up to 50 columns anywhere spread across any sized table. (The 50 limit can be expanded by adding more expressions on one of the last pieces, but I don't anticipate many people needing more than that.)
I also cleaned up some bits of the flow & expressions to help readability.
@Yutao I finally feel pretty satisfied with this. I'll plan on making the video this weekend.
Version 2
I was fortunate enough to find Sudhi Ramamurthy's wonderful code for batch processing on the script side and incorporate it into the template for certain update batch & table sizes.
So the Run Script is now set with 2 modes, the V2 batch method that saves & updates a new instance of the table before posting batches of table ranges back to Excel & the V1 row by row update calling on the Excel table.
The V2 script batch processing method will activate for updates on tables less than 1 million cells. It does encounter more errors with larger tables because it is loading & working with the entire table in memory.
Shoutout to Sudhi Ramamurthy for this great batch processing addition to the template!
Code Write-Up: https://docs.microsoft.com/en-us/office/dev/scripts/resources/samples/write-large-dataset
Video: https://youtu.be/BP9Kp0Ltj7U
The V1 script row by row method will run for Excel tables with more than 1 million cells. But it is still limited by batch file size so updates on larger tables will need to run with smaller cloud flow batch sizes in a Do until loop.
And thanks again @Yutao!
Version 2 Upsert
I was able to modify the batch update script a bit to also make a batch create component. We now have the ability to batch update the rows that are already in the table and to batch create the rows that are not in the table yet (Batch Upsert).
@Yutao
I went to do my video demo of this and I started having weird errors with the batch create portion...
It started just posting duplicates of the last value read from the updated source data:
There's nothing wrong with the cloud flow and I've checked the Office Script. The error occurs on the part that generates the new rows in the in-memory table. Specifically this code:
//-1 to 0 index the RowNum
RowNum = RowNum - 1;
//Iterate through each object item in the array from the flow
for (let i = 0; i < CreateData.length; i++) {
//Create an empty row at the end of the 2D table array & increase RowNum by 1 for the next line updating that row
TableData.push(EmptyRow);
RowNum++;
//Iterate through each item or line of the current object
for (let j = 0; j < Object.keys(CreateData[i]).length; j++) {
//Create each value for each item or column given
TableData[RowNum][Number(Object.keys(CreateData[i])[j])] = CreateData[i][Object.keys(CreateData[i])[j]];
}
}
Which is very strange because essentially the same code works for the V1 method of batch create
//Iterate through each object item in the array from the flow
for (let i = 0; i < CreateData.length; i++) {
//Create an empty row at the end of the 2D table array & increase RowNum by 1 for the next line updating that row
table.addRow()
RowNum = RowNum + 1
//Iterate through each item or line of the current object
for (let j = 0; j < Object.keys(CreateData[i]).length; j++) {
//Create each value for each item or column given
TableRange.getCell(RowNum, Number(Object.keys(CreateData[i])[j])).setValue(CreateData[i][Object.keys(CreateData[i])[j]])
}
}
And for the V2 method of batch update:
//Iterate through each object item in the array from the flow
for (let i = 0; i < UpdatedData.length; i++) {
//If the record's Primary Key value is found continue, else post to error log
if (ArrayPK.indexOf(UpdatedData[i].PK) > 0) {
//Get the row number for the line to update by matching the foreign key from the other datasource to the primary key in Excel
RowNum = ArrayPK.indexOf(UpdatedData[i].PK)
//Iterate through each item or line of the current object
for (let j = 0; j < Object.keys(UpdatedData[i]).length - 1; j++) {
//Update each value for each item or column given
TableData[RowNum][Number(Object.keys(UpdatedData[i])[j])] = UpdatedData[i][Object.keys(UpdatedData[i])[j]]
}
}
And I'm not seeing anything logically wrong with it.
In fact, if I change out the i reference in the code for a static number so it references the same item each time, then it duplicates the values at that other row index. It's as if in this specific case when the i reference is in a nested loop, it is just going to the last value in the updatedata or createdata array of values (usually the maximum value i will ever have) from the cloud flow and never changing from that value in each loop run.
Why is this happening?
EDIT: It seems if I manually fill out the EmptyRow bit with an array of empties like ["", "", "", ""] then the error goes away and it enters the new rows normally. But it makes no sense. I pulled both the manual entry & the auto0generate entry of a blank row and they both have the exact same structure, output, & length.
It just doesn't seem to work if I try to use an EmptyRow array variable for an in-memory table. I had to adjust that part of the code to calculate a new array of blanks in each loop.
See the updated code below:
//Iterate through each object item in the array from the flow
for (let i = 0; i < CreateData.length; i++) {
//Create an empty row at the end of the 2D table array & increase RowNum by 1 for the next line updating that row
TableData.push(",".repeat(TableNumberOfColumns - 1).split(","));
RowNum++;
//Iterate through each item or line of the current object
for (let j = 0; j < Object.keys(CreateData[i]).length; j++) {
//Create each value for each item or column given
TableData[RowNum][Number(Object.keys(CreateData[i])[j])] = CreateData[i][Number(Object.keys(CreateData[i])[j])];
}
}
Download V2.1 for this new Batch Create patch.
The original post is now updated with a demonstration video. I decided to make this more of a demonstration than an explanation so the video doesn't get too long.
I'll have to do an explanation or "How it's built" type of video later after some planned videos on Excel Batch Delete & on some other use-cases for these templates.
https://youtu.be/HiEU34Ix5gA
@Yutao
Version 3
I noticed that the Mode 2 method of creating & updating Excel table rows does not maintain the table formulas as it is posting all the raw values back to the table. For this update, I've added another input field, "KeepFormulas", to the Office Script code. I input a default value of "No" in the template flows. If you input "Yes" or really anything other than "No", then the scripts will use the Mode 1 method of creating & updating the table rows that does maintain the table formulas. However, remember that Mode 1 is a little slower and does require smaller batch sizes. You may need to get the version b of the template flow if you want to update or create more than 1000 rows and you want to maintain your formulas in those rows.
I should be able to set something up to save & re-apply any formulas that span entire columns, maybe by saving the formulas of the 1st row to an array, then re-applying them to their respective columns & flashfilling to the end of the table. That way we could use the faster & larger processing method for tables under 1 million cells even if we have table formulas. I'll work on that and make another update here when I am done.
Version 4
I updated Mode 2 to save the formulas from the 1st table row & re-apply them to each entire column after the batch processing for the updates and/or creates. So we can now use the faster & larger batch size method for tables with formulas that are less than 1 million cells. But I did keep an input field option "ForceMode1Processing" just incase anyone has any issues or some use-case where that method of re-applying the formulas does not work for them.
I also made sure to list the expression some may need to use for their get data action within a loop in version b...
take(skip([InsertDataHere], sub(mul(iterationIndexes('Do_until_Update'), variables('BatchSize')), iterationIndexes('Do_until_Update'))), variables('BatchSize'))
I followed the video but i keep running the error: We were unable to run the script. Please try again.
Office JS error: Line 14: Table getColumn: The argument is invalid or missing or has an incorrect format.
clientRequestId: 25f085e3-af2c-4894-b41c-c9fe2b753a37
help?
@Anonymous
Check the name of the primary key you put in to the Run Script action.
Does it exactly match the primary key column name?
Does it have any special characters?