06-13-2022 09:31 AM - last edited 05-23-2024 10:32 AM
Update & Create Excel Records 50-100x Faster
I was able to develop an Office Script to update rows and an Office Scripts to create rows from Power Automate array data. So instead of a flow creating a new action API call for each individual row update or creation, this flow can just send an array of new data and the Office Scripts will match up primary key values, update each row it finds, then create the rows it doesn't find.
And these Scripts do not require manually entering or changing any column names in the Script code.
• In testing for batches of 1000 updates or creates, it's doing ~2000 row updates or creates per minute, 50x faster than the standard Excel create row or update row actions at max 50 concurrency. And it accomplished all the creates or updates with less than 4 actions or only .4% of the standard 1000 action API calls.
• The Run Script code for processing data has 2 modes, the Mode 2 batch method that saves & updates a new instance of the table before posting batches of table ranges back to Excel & the Mode 1 row by row update calling on the Excel table.
The Mode 2 script batch processing method will activate for updates on tables less than 1 million cells. It does encounter more errors with larger tables because it is loading & working with the entire table in memory.
Shoutout to Sudhi Ramamurthy for this great batch processing addition to the template!
Code Write-Up: https://docs.microsoft.com/en-us/office/dev/scripts/resources/samples/write-large-dataset
Video: https://youtu.be/BP9Kp0Ltj7U
The Mode 1 script row by row method will activate for Excel tables with more than 1 million cells. But it is still limited by batch file size so updates on larger tables will need to run with smaller cloud flow batch sizes of less than 1000 in a Do until loop.
The Mode 1 row by row method is also used when the ForceMode1Processing field is set to Yes.
Be aware that some characters in column names, like \ / - _ . : ; ( ) & $ may cause errors when processing the data. Also backslashes \ in the data, which are usually used to escape characters in strings, may cause errors when processing the JSON.
Version 7 Note
Diverting from what is shown in the video, I was able to remove almost all the flow actions in the "Match new and existing data key values then batch update" scope after replicating their functions in the scripts themselves. The flow now goes directly from the "SelectGenerateData" action to the "Run script Update Excel rows" action and the script handles matching up the UpdatedData JSON keys/field names to the destination table headers.
Also version 7 changes the primary key set up in the SelectGenerateData and changes the logic for skipping cell value updates & blanking out cell values.
Now the primary key column name from the destination table must be present in the SelectGenerateData action with the dynamic content for the values you want to match up to update. No more 'PrimaryKey' line, the update script will automatically reference the primary key column in the SelectGenerateData data based on the PrimaryKeyColumnName input on the update script action in the flow.
Now leaving a blank "" in the SelectGenerateData action will make that cell value in the Excel table empty, while leaving a null in the SelectGenerateData action will skip updating / not alter the cell value that currently exists in the Excel table there.
Version 6 which looks closer to the version shown in the video can still be accessed here: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Excel-Batch-Create-Update-and-Upsert/m-p...
Version 7 Set-Up Instructions
Go to the bottom of this post & download the BatchExcel_1_0_0_xx.zip file. Go to the Power Apps home page (https://make.powerapps.com/). Select Solutions on the left-side menu, select Import solution, Browse your files & select the BatchExcel_1_0_0_xx.zip file you just downloaded. Then select Next & follow the menu prompts to apply or create the required connections for the solution flows.
Once imported, find the Batch Excel solution in the list of solution & click it to open the solution. Then click on the Excel Batch Upserts V7 item to open the flow. Once inside the flow, delete the PlaceholderValue Delete after import action.
Open the Office Script Batch Update compose action to the BatchUpdateV7 script code. Select everything inside the compose input & control + C copy it to the clipboard.
Then find & open an Excel file in Excel Online. Go to the Automate tab, click on All Scripts & then click on New Script.
When the new script opens, select everything in the script & control + V paste the BatchUpdateV7 script from the clipboard into the menu. Then rename the script BatchUpdateV7 & save it. That should make the BatchUpdateV7 reference-able in the later Run script flow action.
Do the same process to import the BatchCreateV7 script.
Then go to the List rows Sample source data action. If you are going to use an Excel table as the source of updated data, then you can fill in the Location, Document Library, File, & Table information on this action. If you are going to use a different source of updated data like SharePoint, SQL, Dataverse, an API call, etc, then delete the List rows Sample source data placeholder action & insert your new get data action for your other source of updated data.
Following that, go to the Excel batch update & create scope. Open the PrimaryKeyColumnName action, remove the placeholder values in the action input & input the column name of the unique primary key for your destination Excel table. For example, I use ID for the sample data.
Then go to the SelectGenerateData action.
If you replaced the List rows Sample source data action with a new get data action, then you will need to replace the values dynamic content from that sample action with the new values dynamic content of your new get data action (the dynamic content that outputs a JSON array of the updated data).
In either case, you will need to input the table header names from the destination Excel table on the left & the dynamic content for the updated values from the updated source action on the right. You MUST include the column header for the destination primary key column & the the primary key values from the updated source data here for the Update script to match up what rows in the destination table need which updates from the source data. All the other columns are optional / you only need to include them if you want to update their values.
After you have added all the columns & updated data you want to the SelectGenerateData action, then you can move to the Run script Update Excel rows action. Here add the Location, Document Library, File Name, Script, Table Name, Primary Key Column Name, ForceMode1Processing, & UpdatedData. You will likely need to select the right-side button to switch the UpdatedData input to a single array input before inserting the dynamic content for the SelectGenerateData action.
Then open the Condition If new records continue else skip action & open the Run script Create Excel rows action. In this run script action input the Location, Document Library, Script, Table Name, & CreateData. You again will likely need to select the right-side button to change the CreateData input to a single array input before inserting the dynamic content for the Filter array Get records not found in table output.
If you need just a batch update, then you can remove the Filter array Get records not found in table & Run script Create Excel rows actions.
If you need just a batch create, then you can replace the Run script Batch update rows action with the Run script Batch create rows action, delete the update script action, and remove the remaining Filter array Get records not found in table action. Then any updated source data sent to the SelectGenerateData action will just be created, it won't check for rows to update.
Thanks for any feedback,
Please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86).
And reach out on LinkedIn (https://www.linkedin.com/in/kolota/) if you want to hire me to consult or build more custom Microsoft solutions for you.
Office Script Code
(Also included in a Compose action at the top of the template flow)
Batch Update Script Code: https://drive.google.com/file/d/1kfzd2NX9nr9K8hBcxy60ipryAN4koStw/view?usp=sharing
Batch Create Script Code: https://drive.google.com/file/d/13OeFdl7em8IkXsti45ZK9hqDGE420wE9/view?usp=sharing
(ExcelBatchUpsertV7 is the core piece, ExcelBatchUpsertV7b includes a Do until loop set-up if you plan on updating and/or creating more than 1000 rows on large tables.)
watch?v=HiEU34Ix5gA
Thank you for the script 4.5 @takolota , but this error apperas on the script in the line 55 😞
Apart from this, there are more errors on script.
Previous scripts to 4.2 works for me. For a Batchsize of 300, I get to paste 1700 rows x 50 col, to a large dataset >10M cells
@Vazquez_A
Thanks for the heads up. It's not any actual problem with the original script code, but for some reason copying it from a code-block on the MS forum creates errors. The same error doesn't seem to come up for the script code I put in the compose actions of the V4.5 flow zip files. I also updated that previous post with the code block to link to a text file with the code instead. That seems to also avoid the errors.
Patch 4.6
I found a slightly better way to format the "Select Reformat update data for script" and "Select Remove blank lines" actions to make it more efficient & easier to update if people need to add new columns. I then added 10 more columns per @Vazquez_A's request.
If you need to add new columns, just copy the expressions from the last line of the Select map into a text file, & use the Replace function to replace the last column number with the next column number. For example, replace 62 with 63 in each the left & right expression, then add those new 63 expressions to the next line in the mapping. That will add another column.
By making the only difference in each line it's line number in the expressions, this also brings us closer to a more significant overhaul in this Select action piece where I may be able to use some of the techniques from my Batch SQL template (https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/SQL-Batch-Create-Read-Update-and-Delete/...) to more dynamically adjust to the number of update columns. If I can do that, then it will remove any of these manual input limitations on the number of columns that can be updated at once and the only other limiting factor for this piece would be the maximum input message size (100MB) for these Select actions (https://learn.microsoft.com/en-us/power-automate/limits-and-config#message-size).
I don't know how many times I'm going to think I've found a way to make it a little more efficient by sending a variable of a blank line to the next row in the Batch Create script, only to find doing so causes the last value to be inserted in every newly created row for some reason.
If you recently downloaded Patch 4.6, you'll probably want to update the Batch Create V4-5 script with this Batch Create V4-6 script. The only difference is instead of using the Blank Line variable, it generates a new blank array each time it needs to add an empty line for the next new item to be inserted. Sorry for any bugs or confusion.
https://drive.google.com/file/d/13OeFdl7em8IkXsti45ZK9hqDGE420wE9/view?usp=sharing
Awesome work and super helpful. The flow is very well explained and the import process worked well.
Could you point me to the best way to debug the script? My flow runs without error after I imported the batch update (and removed the create missing rows part). The script seems to be accepted by Office without error. But no matter what I do, nothing gets updated. It does seem that the json passed to the script is the right structure. But maybe I am not setting up the inputs properly:
Additional question for curiosity: if speed is the main focus, how did you decide between json passed to Office Script vs. calling the MS Graph API directly as the best method to batch update? I am not familiar enough with either.
@RombeNa
There is an update range property for MS Graph, but that may be even more complicated to get the entire table range, & transform your update data with the right notation & format. And most Power Automate actions max out around 100,000 to 150,000 rows, so if something requires loading the entire range in memory for a larger table, it may not work in Power Automate.
Also anytime someone would want to use it for something else, they would have to manually adjust the workbook & table ids in the call.
https://learn.microsoft.com/en-us/graph/api/range-update?view=graph-rest-1.0&tabs=http
Also I'm almost done with a more significant overhaul that will make this 4x faster. I'm doing around 10,000 records a minute in tests of this newer version with a faster way to format all the data into the destination JSON structure.
And on to your problem. Something about your inputs seems off. Why is there an extra object level labeled "0"?
This is what that part looks like for me, no extra object level, just a plain JSON array:
Could you please show what you have in the Generate update data action?
Here is the generate update data action and the last compose "remove blank lines"
And here is how I passed them to the script (output of the remove blank line action)
I've tried started from scratch with copy pasting the batch update office scripts from v4.6 from your link.
Thanks for the explanation!
Ahh I see. Thanks for that last picture.
In the script actions try clicking the map button with the small T in it to get the single box array input. Then put the data output in that single box. There shouldn’t be multiple lines for the data input.
Version 5 - Significant Update to Formatting Final JSON & Loop Error Handling
For version 5 I've redesigned the way the flow creates the destination JSON for the script actions and added a lot to error handling in the version b looping.
Generating JSON for the destination files in the scripts...
No more column limits! The flow no longer relies on a Select action with a line for each column. It just reformats all the JSON for a table with any number of columns.
Also this part of the workflow is now many times faster.
Version b loop error handling...
I did more testing on the loop version with larger files & noticed if the Office Scripts actions started to fail and retried the script per the default settings, then it would often result in a create batch being sent to the file multiple times. This would create many duplicate rows. Maybe a send that previously "failed" later went through, resulting in the duplicate batch action. To work around this, I turned off any retries on the Script actions & instead worked some retry logic directly into the Do Until loop. Now if a Script action fails, it just stops & errors without trying to send anything again, then the loop will move to its next iteration, but the loop count will remain the same so it will just call on that same batch again. That way the retries happen for the entire loop & not for the individual Script actions. That seemed to resolve this duplicate batch bug. You can control how many times the loop should retry a batch on error with the MaxRetries setting in the Compose action.
I also set the loop version b to turn on ForceMode1Processing on the script actions by default as that mode seems much faster & less error prone on the smaller batch sizes the loop version was made for.
@Vazquez_A
Version 5 should work a lot better for your larger tables with many columns. And no need to add more lines for each column.