07-22-2022 11:07 AM - last edited 07-22-2022 11:08 AM
Excel Batch Delete (40x Faster)
Instead of performing a slow 1 deletion per action, these flows send arrays of deletion primary keys to an Office Script. On the Excel side the script can then delete each row it finds with a primary key in the deletion input array.
In testing the flow completes 1500 row deletions per minute, about 40x faster than the standard Excel delete row actions at max 50 concurrency. And it accomplished each 750 deletions with less than 8 actions or about 1% of the standard 750 action API calls.
The "results" output of the Run script action returns the change in the number of rows, so the number of rows removed since the script started running. The last few actions will check if the number of rows in the deletion array match the change in number of rows & force a flow failure to make it clear that there was an error.
Version 3.1 closer to what is reviewed in the YouTube video is still available here. But for version 4 you can follow the Import & Set-Up instructions below.
Version 4 Import & Set-Up
Go to the bottom of this post & download the BatchExcel_1_0_0_xx.zip file. Go to the Power Apps home page (https://make.powerapps.com/). Select Solutions on the left-side menu, select Import solution, Browse your files & select the BatchExcel_1_0_0_xx.zip file you just downloaded. Then select Next & follow the menu prompts to apply or create the required connections for the solution flows.
Once imported, find the Batch Excel solution in the list of solution & click it to open the solution. Then click on the Excel Batch Deletes V4 item to open the flow. Once inside the flow, delete the PlaceholderValue Delete after import action.
Open the Office Script Batch Delete action to the script code. Select everything inside the compose input & control + C copy it to the clipboard.
Then find & open an Excel file in Excel Online. Go to the Automate tab, click on New Script & remove all the placeholder script code.
Control + V paste the Batch Delete script code from the clipboard into the menu. Then rename the script BatchDeleteV4 & save it. That should make the BatchDeleteV4 reference-able in the later Run script flow action.
Go to the Batch Delete Excel scope to the List rows & Filter array actions. Input your Location, Document Library, File, & Table for your dataset.
Then in the Filter array action you can add any more complex filters to filter down to just the rows/records you want deleted.
Then in the Run script Delete Excel Rows action input your Location, Document Library, File, & Script. Once you enter the correct script it should load more parameters where you can enter your Table Name, Primary Key Column Names, & Delete Data. The DeleteDataPrimaryKeyColumnName should be the column name of the primary key from the data that is being fed in the DeleteData input. DestinationPrimaryKeyColumnName should be the column name of the primary key from the destination Excel table where the data is being deleted. In the regular Excel Batch Delete V4 flow & any flow where one is pulling from the destination table itself in the List rows to tell what rows to delete, the two PrimaryKeyColumnName inputs will be the same because they are both the primary key column name from the destination table. So they may only be different when the DeleteData is not coming from the destination table data.
For the DeleteData input, when you initially open this script action you may need to select the right-side toggle to "Switch to input entire array" to get a single input box for an array. There you should enter the dynamic content output for the action immediately preceding the Run script Delete Excel rows action, like "Filter array More filters Excel" or "BatchRecords".
And please remember to check your dataset & who might be using parts of it before running a large batch delete that may affect their work.
Thanks for any feedback,
Please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86).
And reach out on LinkedIn (https://www.linkedin.com/in/kolota/) if you want to hire me to consult or build more custom Microsoft solutions for you.
Office Script Code V4 (Also included in a Compose action at the top of the template flow)
https://drive.google.com/file/d/1yLqB9Od_r3Mk9wvzBqubBO2bMzm4Ye3U/view?usp=sharing
If the legacy import method does not work, see this alternate Solutions package import method: Re: Excel Batch Delete - Page 5 - Power Platform Community (microsoft.com)
watch?v=LrzjH9dI0is
@rbunge I always understood it as either…
Pagination is the number loaded & top count is the number returned
OR
Pagination is the maximum number returned if there is no lower top count set
Either way, I set the original top count to 750 because otherwise some piece of the JavaScript may not be able to handle the larger array/cell size & error out.
But it may be able to handle around 1000 in several circumstances.
Otherwise with larger deletions you have to either…
Set this flow to run twice
OR
Use the version B with the Do-until loop set-up
Alternative import method...
Excel Batch Delete V2 code to copy & paste into the "My clipboard" tab of a new action menu on your flow:
{"id":"70835b2d-50ad-42ea-bc46-d089297a6b73","brandColor":"#8C3900","connectionReferences":{"shared_excelonlinebusiness":{"connection":{"id":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness/connections/39f5641011a343e8aac23b8d01bee065"}},"shared_excelonlinebusiness_1":{"connection":{"id":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness/connections/shared-excelonlinebu-29fbd1aa-441a-4ca4-badb-fd92d9ce165f"}}},"connectorDisplayName":"Control","icon":"data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMzIiIGhlaWdodD0iMzIiIHZlcnNpb249IjEuMSIgdmlld0JveD0iMCAwIDMyIDMyIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPg0KIDxwYXRoIGQ9Im0wIDBoMzJ2MzJoLTMyeiIgZmlsbD0iIzhDMzkwMCIvPg0KIDxwYXRoIGQ9Im04IDEwaDE2djEyaC0xNnptMTUgMTF2LTEwaC0xNHYxMHptLTItOHY2aC0xMHYtNnptLTEgNXYtNGgtOHY0eiIgZmlsbD0iI2ZmZiIvPg0KPC9zdmc+DQo=","isTrigger":false,"operationName":"Excel_Batch_Delete_Scope_Container","operationDefinition":{"type":"Scope","actions":{"Office_Script_Batch_Delete":{"type":"Compose","inputs":"function main(workbook: ExcelScript.Workbook,\n TableName: string,\n PrimaryKeyColumnName: string,\n DeleteData: deletedata[],\n) {\n let table = workbook.getTable(TableName);\n let RowNum: number;\n let TableRange = table.getRange();\n let TableData = TableRange.getValues();\n let ArrayPK = table.getColumn(PrimaryKeyColumnName).getRange().getValues().join(\"#|#\").split(\"#|#\");\n let ColumnCount = TableRange.getColumnCount();\n let InitialRowCount = TableRange.getRowCount();\n let TableSize = InitialRowCount * ColumnCount;\n\n console.log(`Starting table size is ${TableSize} cells.`);\n\n //Create a blank row at the end of the table to copy into each deletion row\n table.addRow();\n let TableRange2 = table.getRange()\n let EndTableBlankRow = TableRange2.getRow(TableRange2.getRowCount() - 1)\n let EndTableBlankRowPK = table.getColumnByName(PrimaryKeyColumnName).getRange().getRow(TableRange2.getRowCount() - 1).getValue()\n\n // Update tag deletion items row by row if the table is too large for the V2 batch update\n //Iterate through each object item in the array from the flow\n for (let i = 0; i < DeleteData.length; i++) {\n //Get the row number for the line to update by matching the foreign key from the other datasource to the primary key in Excel\n RowNum = ArrayPK.indexOf(DeleteData[i].PK);\n //Set each deletion row to blank. Can not directly delete the row in this step as that causes the index of all the other deletion rows to change.\n TableRange.getRow(RowNum).copyFrom(EndTableBlankRow, ExcelScript.RangeCopyType.values)\n }\n //RemoveDuplicates on all columns to remove the identical blank records\n table.resize(TableRange.getAbsoluteResizedRange(table.getRange().removeDuplicates([0, ColumnCount - 1], false).getUniqueRemaining(), ColumnCount));\n\n //Refresh after deletion index changes\n ArrayPK = table.getColumn(PrimaryKeyColumnName).getRange().getValues().join(\"#|#\").split(\"#|#\");\n let LastBlank = ArrayPK.indexOf(String(EndTableBlankRowPK));\n let FinalRowCount = workbook.getTable(TableName).getRange().getRowCount().valueOf();\n let RowsDeleted = InitialRowCount - FinalRowCount;\n if (LastBlank > 1 || FinalRowCount > 2) {\n //Delete the last blank row that wasn't removed by the duplicates function\t\n table.deleteRowsAt((LastBlank - 1), 1);\n }\n //If the deletion runs against the top of the table adjust the count to avoid a false negative error\n if (FinalRowCount = 2) { RowsDeleted = RowsDeleted + 1 };\n\n console.log(`Number of Rows Deleted: ${RowsDeleted} (Number of rows deleted is also provided as the result value)`);\n console.log(`Initial Row Count: ${InitialRowCount}`);\n console.log(`Final Row Count: ${workbook.getTable(TableName).getRange().getRowCount().valueOf()}`);\n console.log(`Final Table Size: ${workbook.getTable(TableName).getRange().getRowCount().valueOf() * ColumnCount} cells`);\n return RowsDeleted\n}\n \ninterface deletedata {\n 'PK': (string | undefined)\n}","runAfter":{},"description":"Go to an online Excel table, go to the Automate tab, select New Script, then replace the placeholder script code by pasting all this in the code editor. You may want to name the script BatchDeleteV2.","metadata":{"operationMetadataId":"7922f637-e66f-45ad-a5e1-c69106b4a210"}},"Batch_Delete_Excel":{"type":"Scope","actions":{"Run_script_Delete_Excel_rows":{"runAfter":{"Select_Get_only_key_values":["Succeeded"]},"limit":{"timeout":"PT15M"},"metadata":{"01PCEUDVFYX74MYI46EJCZF6AUHSUSPNMU":"/Book.xlsx","operationMetadataId":"713f593f-c5e4-43b6-ad08-3a71051bbf87","tableId":null,"01PCEUDVCCZ7IYLT5RUBC3N6VXZWCYFNNI":"/DestinationWorkbook.xlsx"},"type":"OpenApiConnection","inputs":{"host":{"connectionName":"shared_excelonlinebusiness","operationId":"RunScriptProd","apiId":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness"},"parameters":{"source":"me","drive":"b!CVcYUvLh8EGkvkCsqwCTp7Pqs2dO4IxIl2FLeQnvIk-zJXFeRmKbRIbbO8P0Lkhi","file":"01PCEUDVCCZ7IYLT5RUBC3N6VXZWCYFNNI","scriptId":"ms-officescript%3A%2F%2Fonedrive_business_itemlink%2F01PCEUDVEBU4GKVUC3LVHKKOFYPVYZQVWP","ScriptParameters/TableName":"Table1","ScriptParameters/PrimaryKeyColumnName":"@outputs('Input_Excel_Primary_Key_Column_Name')","ScriptParameters/DeleteData":"@body('Select_Get_only_key_values')"},"authentication":{"value":"@json(decodeBase64(triggerOutputs().headers['X-MS-APIM-Tokens']))['$ConnectionKey']","type":"Raw"}},"description":"Change the Location, Doc Library, File, Script, TableName, & PrimaryKey fields to match your use-case. Make sure the primary key column listed is the column you want to be matched with to determine deletions."},"Filter_array_More_filters_Excel":{"type":"Query","inputs":{"from":"@outputs('List_rows_Get_rows_to_delete')?['body/value']","where":"@equals('', '')"},"runAfter":{"List_rows_Get_rows_to_delete":["Succeeded"]},"description":"Use this action to apply additional filters if the single condition filter for the Excel List rows action is too limiting.","metadata":{"operationMetadataId":"2e9cc934-03de-4b20-824e-b03708dc6a22"}},"Condition_Check_input_array_vs_rows_deleted":{"type":"If","expression":{"or":[{"not":{"equals":["@length(body('Select_Get_only_key_values'))","@outputs('Run_script_Delete_Excel_rows')?['body/result']"]}},{"less":["@outputs('Run_script_Delete_Excel_rows')?['body/result']",1]},{"equals":["@outputs('Run_script_Delete_Excel_rows')?['body/result']",""]}]},"actions":{"Terminate":{"metadata":{"operationMetadataId":"f5ecd904-e4d5-4011-b66e-af4fe83fa5bf"},"type":"Terminate","inputs":{"runStatus":"Failed","runError":{"message":"The number of script input items does not match the number of rows deleted while the script was running. Check for errors.\n\nFor example, if the batch size is set too large, then some parts of the script may not be able to load everything and the script action will fail.\nOr if you have duplicates in your primary key column, then each batch run will only delete the 1st of those duplicates in each batch. That will mean the number of rows sent to be deleted will not match the number actually deleted in the results."}},"runAfter":{}}},"runAfter":{"Run_script_Delete_Excel_rows":["Succeeded"]},"description":"Check if the number of records in the input array & in the records deleted output results do not match. Check that anything was deleted.","metadata":{"operationMetadataId":"5f06f90a-d67f-40ac-ae02-557b3c3d3b5e"}},"Select_Get_only_key_values":{"type":"Select","inputs":{"from":"@body('Filter_array_More_filters_Excel')","select":{"PK":"@item()[outputs('Input_Excel_Primary_Key_Column_Name')]"}},"runAfter":{"Filter_array_More_filters_Excel":["Succeeded"]},"description":"Creates a JSON array with just the PK label and your Primary Key values from the filtered Excel data. Selecting just the primary key values of the items to delete makes the array going to the script more compact & improves performance.","metadata":{"operationMetadataId":"40133677-e3ba-411b-808c-47e725742e07"}},"Input_Excel_Primary_Key_Column_Name":{"type":"Compose","inputs":"Insert Primary Key Column Name\nForce an error if this action is not checked (@{div(1,0)})","runAfter":{},"description":"Insert Primary Key Column Name","metadata":{"operationMetadataId":"9874620f-697b-4a93-b98b-2cadd8779539"}},"List_rows_Get_rows_to_delete":{"type":"OpenApiConnection","inputs":{"host":{"connectionName":"shared_excelonlinebusiness","operationId":"GetItems","apiId":"/providers/Microsoft.PowerApps/apis/shared_excelonlinebusiness"},"parameters":{"source":"me","drive":"b!CVcYUvLh8EGkvkCsqwCTp7Pqs2dO4IxIl2FLeQnvIk-zJXFeRmKbRIbbO8P0Lkhi","file":"01PCEUDVCCZ7IYLT5RUBC3N6VXZWCYFNNI","table":"{391B5FAB-9599-4171-942A-F11041838A75}","$top":750},"authentication":{"type":"Raw","value":"@json(decodeBase64(triggerOutputs().headers['X-MS-APIM-Tokens']))['$ConnectionKey']"}},"runAfter":{"Input_Excel_Primary_Key_Column_Name":["Succeeded"]},"description":"Pagination set to 25000. Suggested batch size of 750 in the Top Count. Higher batch sizes often overload a piece of the script & cause a timeout error.","runtimeConfiguration":{"paginationPolicy":{"minimumItemCount":25000}},"metadata":{"01PCEUDVFYX74MYI46EJCZF6AUHSUSPNMU":"/Source-Data.xlsx","operationMetadataId":"408aee2b-05c0-44e2-8468-78c78fb0530d","tableId":"{391B5FAB-9599-4171-942A-F11041838A75}","01PCEUDVCCZ7IYLT5RUBC3N6VXZWCYFNNI":"/DestinationData.xlsx"}}},"runAfter":{"Office_Script_Batch_Delete":["Succeeded"]},"description":"Input the EXACT primary key column name in the 1st action. Change the List rows action to your dataset. Use the Excel filter &/or the Filter array to set deletion criteria. Adjust the script action to your dataset, data, & script.","metadata":{"operationMetadataId":"175119c4-e881-4899-81d6-14aaa7f741f8"}}},"runAfter":{}}}
I get an odd error in the Excel script and not sure how to overcome. I've reduced the 'Top count' and 'Pagination' to 100 but that has no impact.
@ACPriebe
I'm not immediately able to replicate that issue. Could you please share more screenshots of your data & flow? Preferably some of the primary key column, data, & flow input as I'm guessing if there is a problem with the RowNum variable, then it may be due to an issue with the primary key values and/or processing.
I am pretty sure the correct data is sent in the 'Run Script' command:
Sends:
If I run the script manually, it works well.
I found the issue! For some reason when I added the Select 'Output' to the Run Script command it added some extra JSON tags. I manually removed them and it now works! Thanks so much for this solution to quickly delete rows!
Updated to
@takolota I was playing around with the V2b flow to understand how you implemented "delete rows not in source" in that version, which is what I need for my use case. I don't need the chunking/iteration, because the spreadsheets I'm playing with are ~150 rows. While deleting the parallelism to get the flow down the basic data manipulation that I needed to understand, I noted what might be an issue with the parallel chunking in the "Source primary key and get data" scope. Each of the "List rows" actions has a note "Pagination set to 100000", but the "Top" parameter is set to 10k, not 100k, in each of the 3 parallel actions.
Thanks so much for designing this, it is just what I was looking for. Is there anyway that you would have time to also post the V2b and V2Sync with no connections? I assume that my business is somehow getting in the way of the .zip import. I have the V2 working using your no connections file but I have tables with many more than 750 lines.
Thank you for your time.
Chad
@m00ch
If you were able to import the regular V2 zip, then I'm guessing the issue may be the table references, not the Excel connections.
Try importing this version of the V2b flow without any of the table references & see if it works...