OllyL
New Member

Recursive SharePoint archiving

Hi there,

 

I'm trying to build a flow that does the following:

 

  • Find all items in Library #1 older than # Months (recursively into sub directories)
  • Move to Library #2, recreating source folder structure from Library #1

I've managed to create a basic flow using Get past time followed by Get files (properties only) with nested items included, using a Filter Query on the 'Modified' property. I'm a bit stuck on the next bit though. How could I go about building the folder structure in Library #2 and then moving each file into the corresponding location?
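For reference, the Filter Query is along these lines (with Get_past_time being the step above):

Modified lt '@{body('Get_past_time')}'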

 

Thanks very much in advance

 

Olly


7 REPLIES

Hi @OllyL 

 

Can you explain what you mean by building the folder structure? I do not understand. 

I was thinking that you could use the "Move File" or "Move Folder" actions.

 

Let me know if you need more help.

If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.

 

 

 

Anthony Amador
Power Platform Specialist.
OllyL
New Member

Hi Anthony,

 

Thanks for getting back to me. Sure:

 

So if I have a file: /Library1/Some/Folder/Structure/file.docx

 

This is mirrored to: /Library2/Some/Folder/Structure/movedfile.docx

 

The flow would build any of the folder structure that is missing, leaving existing files/folders intact. So in some cases the folder structure would already exist, sometimes it might just have to make the top folder ("Structure"), sometimes it might have to create all the folders.

 

Does this make sense?

Thanks for the explanation. 

 

I was thinking about how to do it, but the scenario seems complex. You could take the file path of the file to be copied, save it in a variable, and split it to get the names of the folders, but you would then have to compare them with the names of the folders in the destination to decide whether to create each one or not. The problem is I have no idea how to get those destination folder names, and I do not see any action that can help us here.
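For the splitting part, an expression along these lines should give the folder names, assuming the path has been saved in a variable (here called FolderPath, a hypothetical name):

split(variables('FolderPath'), '/')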

 

Kind regards. 

 

 

 

 

Anthony Amador
Power Platform Specialist.
OllyL
New Member

Just as an FYI, I have something which seems to be working, though I'm not sure if it works in every scenario (I've only tested it so much). It was easier than expected, as "Create new folder" does the hard work for you and creates the whole folder tree. My process is just:

 

  • Get past time (which controls the age of documents to be archived)
  • Get items
    • For some reason using the option "Limit entries to folder" returns zero results for me so don't set this option
    • Filter Query is (Modified lt '@{body('Get_past_time')}') and (ContentType ne 'Folder')
    • Top Count - make sure this is set. I think Apply to each (below) has a 5000-item limit, so I set it to that
    • Limit Columns by view - not sure if this is 100% required but I created a view with only the basic columns in there
    • As a side note, I also enabled a new index on the Modified column of the library in SharePoint. Again, not 100% sure this is necessary, but I had to try a lot of things in order to get any output from this step - our document library is pretty big, hence the need to archive.
  • Apply to Each - operate on 'value' from Get items step
  • [AtE] Create new folder
    • Specify archive destination and use "Folder Path" value from Get items as Folder Path
  • [AtE] Get file content
    • Specify source and use "Identifier" from Get items as File Identifier
  • [AtE] Create file
    • Specify archive destination for site
    • Folder path is 'test/@{items('Apply_to_each')?['{Path}']}' where 'test' is the name of my Archive library (see the worked example after this list)
    • File name is "File name with extension" from Get items
    • File Content is "File Content" from Get file content
  • (Optional) [AtE] Delete File
    • As we're moving out of the source location, I use a delete here, using the source site and the Identifier again, to remove the file from the source library
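To make the folder path concrete with a hypothetical example: if a source file lives at /Library1/Some/Folder/file.docx and the "Folder Path" value from Get items comes back as Some/Folder/, the Create file step would write it to:

test/Some/Folder/file.docx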

 

So far in my testing this seems to work. It mirrors the source folder structure and seems to be able to deal with a partial folder structure already existing in the destination. The major drawback is that you can't target this at a specific folder in the source SharePoint, but I think that's more an issue with us exceeding the list view threshold and all that pain, rather than a particular issue with Power Automate.

 

It produces errors too - Create file / Get file content have a file size limit, so it won't work with large files, and other bits will also error out. For our purposes, though, the errors don't matter; so long as we are able to archive out some files it's still a success, and we plan to keep running the job on a loop to maintain the library.

 

Hope this helps

 

Olly

 

 

@OllyL thanks for sharing your findings - this is valuable information. Please mark your own answer as the solution to help other users find it more easily in the future.

Cheers. 

Anthony Amador
Power Platform Specialist.
Anonymous
Not applicable

Any chance you can share screenshots of this Flow? It is exactly what I need, but I have never built a Flow before.

OllyL
New Member

Hi Boaby,

 

The Flow I had didn't actually work so well over large lists because SharePoint was failing internally and only returning some of the results, not all.

 

My solution in the end was to use PowerShell, and I now have a script which I use to perform this kind of archiving. See the script below (provided with no guarantees and a full disclaimer - use at your own risk). You'll need the PnP.PowerShell module, and it doesn't currently support 2FA on the login. I'd recommend setting up a dedicated login for this, as the script will cause throttling over a large document library, which stops you being able to use the account for other SharePoint admin tasks.
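If you don't already have the module, installing it is a one-off (the script itself only updates it):

Install-Module -Name PnP.PowerShell -Scope CurrentUser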

 

<#
    -----------------------------------------------------------
    SharePoint Archiving
    -----------------------------------------------------------

    Arguments:
    
    -Calling with zero arguments ./ArchiveDocumentLibrary.ps1 will run from the root of the doc library
    -Use additional arguments to run from sub-directories e.g. ./ArchiveDocumentLibrary.ps1 Directory1 ChildDirectory2 GrandChildDirectory3
    -Note that directories must be specified individually, in order, and separated by a space

    Notes:

    - Moves files older than $ArchiveDate out of the source location into the Archive location
    - Deletes empty folders in the source location as it goes if $CleanupDirectories is set
    - Files are stored under a folder with $ListName in the archive location. In this way, you can archive multiple source lists to the same archive location (so long as they have different names). 
    - Liable to throttling by Microsoft so be patient, it will get there in the end!
#>

#Required module for the script (install it first via Install-Module if it isn't already present)
Update-Module -Name PnP.PowerShell

#-------------Settings---------------------

#Root Site Url
$SiteUrl = "https://<tenancy>.sharepoint.com"
$username = "<o365 user>"
$password = "<pwd>"

#Source Settings
$RelativeUrl = "/sites/<sitename>"
$ListName = "<documentlibrary>"

#Archive Settings
$ArchiveRelativeUrl = "/sites/<archivesite>"
$ArchiveListName = "<archivelist>"

#Job Settings
$ArchiveDate = get-date "2019-01-01" #Anything older than this will be archived
$FileLimit = -1 # Limits the number of files processed per run. Use -1 to turn this off
$CleanupDirectories = $true # Whether to remove empty directories or not
$Debug = $false

#-------------------------------------------

#Recursively walk $Folder, archiving files older than $ArchiveDate
#($Stack tracks the folder names from the list root, used to rebuild the path in the archive)
function Recurse-Files($Folder, $Stack)
{        	
	#Write-Output $Folder
    $items = Get-PnPFolderItem -FolderSiteRelativeUrl $Folder
    if ($Debug -eq $true)
    {
        Write-Output ("Folder: " + $Folder)
        Write-Output ("Count: " + $items.count)
    }

        		
    foreach($item in $items)
    {         
        if ($Debug -eq $true)
        {
            Write-Output ("Item: " + $item.Name)
        }

        if ($item.Name -eq "Forms") { continue }                                        
        if ($item.Name -eq "_catalogs") { continue }                                                
        		
	    if ($item.TypedObject.ToString() -ne 'Microsoft.SharePoint.Client.Folder') 
	    { 			
		    $file = Get-PnPFile -AsListItem -Url $item.ServerRelativeUrl			 				
		    if ((get-date $file.FieldValues.Modified) -lt $ArchiveDate)
		    {				               
			    Archive-File $item $file $Folder $Stack
                #Decrement in script scope so the limit is shared across the recursive calls
                $script:FileLimit = $script:FileLimit - 1
                if ($script:FileLimit -eq 0) { Exit }
		    }			
	    }		
	    else { 
		    $NewPath = ($Folder + "/" + $item.Name) 
                        
            $TempStack = $Stack.PSObject.Copy()            
            $TempStack.Push($item.Name)

		    Recurse-Files $NewPath $TempStack
	    }							    
    }
     
    if ($CleanupDirectories)
    {
        $items = Get-PnPFolderItem -FolderSiteRelativeUrl $Folder
        if ($items.count -eq 0)
        {
            Cleanup-Empty-Folder $Stack
        }
    }
}

#Build the destination path for a single file, ensure the folder exists in the archive, then move the file
function Archive-File($Item, $FileItem, $ParentFolder, $Stack)
{	
    Write-Output ("Archiving:  " + $Item.Name + " - " + $FileItem.FieldValues.Modified)    
    
    #Build Destination Path
    $temps = $Stack.PSObject.Copy()           
    $TargetUrl = ""
    while ($temps.count -gt 0) {  $TargetUrl = ("/" + $temps.Pop() + $TargetUrl) }    
    $TargetUrl = ($ArchiveListName + $TargetUrl)  

    #Generate Destination Folder if it doesn't exist
    if ($Debug -eq $false) 
    {
        Resolve-PnPFolder -SiteRelativePath $TargetUrl -Connection $ArchiveConnection | Out-Null
    }
    		
    #Move the File to Archive	    
    if ($Debug -eq $false) 
    {
        Move-PnPFile -SourceUrl ($RelativeUrl + "/" + $ParentFolder + "/" + $Item.Name) -TargetUrl ($ArchiveRelativeUrl + "/" + $TargetUrl) -Overwrite -AllowSchemaMismatch -IgnoreVersionHistory -Force
    }
}

#Remove a now-empty folder (the name at the top of $Stack) from the source library
function Cleanup-Empty-Folder($Stack)
{   
    $temps = $Stack.PSObject.Copy()           

    $Name = $temps.Pop()

    $TargetUrl = ""
    while ($temps.count -gt 1) {  $TargetUrl = ("/" + $temps.Pop() + $TargetUrl) }        
    $TargetUrl = ($ListName + $TargetUrl)

    Write-Output ("Clean-up:  " + $Name + " in " + $TargetUrl)    
    if ($Debug -eq $false) 
    {
        Remove-PnPFolder -Name $Name -Folder $TargetUrl -Force
    }
}

$encpassword = convertto-securestring -String $password -AsPlainText -Force
$credentials = new-object -typename System.Management.Automation.PSCredential -argumentlist $username, $encpassword

#Get a connection for the archive site (this needs to be done before the other connection, as it also connects)
$ArchiveConnection = Connect-PnPOnline -Url ($SiteUrl + $ArchiveRelativeUrl) -ReturnConnection -Credentials $credentials

#Start (Main Method)
Connect-PnPOnline -Url ($SiteUrl + $RelativeUrl) -Credentials $credentials

#Prep stack
$StartStack = new-object system.collections.stack
$StartStack.Push($ListName)

#Load Path from args
$StartPath = $ListName
for ( $i = 0; $i -lt $args.count; $i++ ) {
    $StartPath = ($StartPath + "/" + $args[$i])
    $StartStack.Push($args[$i]) 
}


if ($Debug -eq $true) { Get-PnPList }

#Go!
Recurse-Files $StartPath $StartStack    
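
Example invocations, assuming the script is saved as ArchiveDocumentLibrary.ps1 as in the header:

./ArchiveDocumentLibrary.ps1                                                  #run from the root of the doc library
./ArchiveDocumentLibrary.ps1 Directory1 ChildDirectory2 GrandChildDirectory3  #run from a sub-directory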

 
