01-20-2024 10:48 AM - last edited 01-20-2024 10:49 AM
Find and Remove Duplicates From Any Datasource
Need to identify and/or remove records in a datasource based on duplicate values in one or more columns?
This template uses a SharePoint list to demonstrate, but this method ultimately pulls in the data from any source as a standard JSON array and processes it to identify all the duplicates based on the selected columns. From there the duplicates can be easily deleted.
So it will work for SharePoint, Excel, SQL, Dataverse, API GET/DELETE calls to external datasets, and any other dataset used in Power Automate.
Example...
Use any get data action to pull the data into the flow. Use built in action functions like Order By or later use a Power Automate Sort( ) expression and Power Automate Reverse expression to format your data in ascending or descending order by your chosen column.
Sorted by date, Ascending will keep the oldest & delete the newest, while Descending will keep the newest & delete the oldest.
Sorted by number, Ascending will keep the smallest & delete the largest, while Descending will keep the largest & delete the smallest.
Move to the Select DuplicateCheckFields action and fill the column name & dynamic content value for each data source column you want to include in the check for duplicates. Only records where all listed columns match will count as duplicates.
The middle actions will process the records, identify the duplicate records, & return a JSON array of only duplicate records.
Put whatever action you will use to delete the records inside the loop and use items('Apply_to_each')?['InsertColumnNameHere'] to reference fields within the apply to each loop. For example, this SharePoint Delete item action just required the ID primary key field for each item to delete, so I used items('Apply_to_each')?['ID'] in the expression editor on the Id input.
Starting Demo SharePoint List
Final Result on The Demo SharePoint List
Rows with IDs 14, 16, & 19 were removed as both the Title & Column 2 fields on those rows matched other rows in the dataset. Notice rows with IDs 15 & 20 were not removed. Although they contain duplicate Title values, the Column 2 values were still unique for those rows.
Thanks for any feedback, & please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86)
V1.1 added a Reverse( ) expression to the processing actions to make the records kept by ascending or descending sort more intuitive for the user-inputs.
For anyone having trouble with the legacy import method, here is a post on an alternate import method: Re: Find and Remove Duplicates - Page 3 - Power Platform Community (microsoft.com)
watch?v=49xj4cN6OFI
Amazing post... I still can't believe after all these years of people asking, that Microsoft refuses to build an Array Expression that automatically detects and removes duplicates... Sad. It feels like no one at Microsoft is working on Power Automate.
@Patrick_DEN Paste in the following code instead, the empty apostrophe was missing in the above code - @takolota can you update that post please?
@OmarEl Thanks, it sometimes removes empty single quotes when copying & pasting.
Hopefully anyone who has any issues with the copying & pasting actions method can now use the Power Apps Solution import. It may become more important as the new designer edit menu doesn't have a copy to clipboard or pasting of actions yet.
Why releasing a new designer, when it's not even ready to replace the old one (tons of issues, missing copy and paste, etc...)...?
can you help with this error?
InvalidTemplate
Are you using the new designer?
I’ve noticed a reversion in functionality where the Select action will fail to type things correctly automatically placing double quotes around every value input, thus turning it into a string. You may need to switch to the JSON view & remove the double quotes around the expression.
Are you using the new designer?
I’ve noticed a reversion in functionality where the Select action will fail to type things correctly automatically place double quotes around every value input, thus turning it into a string. You may need to switch to the JSON view & remove the double quotes around the expression.
@takolota thanks so much for this process!
I've been following along, but I keep running into this error:
InvalidTemplate
The execution of template action 'Select_-_ResetToRecordWithIsDuplicateField' failed: The evaluation of 'query' action 'where' expression '@addProperty(item()?['RecordJSON'], 'IsDuplicate', if(greater(length(split(join(skip(split(string(body('Select_-_ReformatRecordAndDuplicateChecks')), string(item())), 1), ''), string(item()?['DuplicateFieldsJSON']))), 1), 1, 0))' failed: 'The template language function 'addProperty' expects its first parameter to be of type object. The provided value is of type 'String'.'.
It's taking input from the previous field:
I've tried copying and pasting this code many times and it just keeps getting caught up on that error.
This is my flow:
I've tried switching to classic designer too, since I know there are some irregularities between the two versions. Still no dice. Hopefully you see something that I'm missing!
@iramos85
If you toggle the Select view over to the single textbox view, did it automatically put double quotes around any of the RecordJSON or DuplicateFieldsJSON values?