Re: Change CSV Delimiter

takolota · ‎02-01-2022

Hello All,

I ran into some issues when pulling CSV data in from Power Automate Desktop because of commas in the actual data. I wanted to change the delimiter so I could more easily parse the data in a single Select action without commas in the actual data messing things up. I also may be parsing CSV files with hundreds or thousands of rows, so I didn’t want to use all my daily actions on this in a slow Apply to each loop.

Attached is the scope/flow I built so anyone can easily select their CSV data that has quotes around the comma-containing records, enter a new delimiter, and get the new delimiter separated data from the final compose action without the usual errors. And it only takes a few actions to do this, even on very large files.
I've found that many CSV files don't put quotes around their records with in-data commas, and this only works when there are quotes around those records. But if the file is saved as a text file, then it often puts quotes around the right records.
If you are using Power Automate Desktop, program the file to be saved as .txt and read that into your output variable.

It’s currently set to handle up to 50 comma-containing columns, but you can expand that to as many columns as needed by adding extra lines & expressions to the 1st Select action. Just follow the pattern by replacing some of the array numbers, like [50] with [51].

Also if your data has more unique values like an array with mixed quoted string data, Ex: ["string1", 2, 03/05/2022, "string3"], then this will create errors in the output.

The template for parsing CSV to JSON & entering it into a dataset uses the same change delimiter set-up: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/CSV-to-Dataset/m-p/1508191#M584

*Copying the template scope into another flow may not work as it may mess up the expressions. You may need to start with a copy of this template and copy & paste the rest of your flow into this template flow.

Thanks for any feedback, & please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86)

Version 3 Uploaded 03/26/2022 (Adjusted the 1st Select input so it can now also deal with in-data commas in the 1st column. Added more lines to the 1st Select so it can now handle up to 50 columns with commas in them.)

Google Drive Link: https://drive.google.com/file/d/11uT15hXY0VjnOKDvFxdVgkuMtXqTmA0c/view?usp=sharing

Version 4 Uploaded 04/09/2022

(More minor fixes & additions.

I adjusted several expressions so it can now handle a few more scenarios with arrays in the CSV data. It should handle any array that doesn't include double quotes and any array that is all strings with double quotes, so ["String1", "String2", "String3"], but it will have issues if it is a mixed array with some double-quoted strings and some other values, for example ["String", 4, 03/05/2022, "String2"] won't work.

I also adjusted how the LineBreak setting is set-up so it now uses the /r/n for the LineBreak. I also provided this link in the flow so anyone can look up the right string for the decodeUriComponent expression(s) if they happen to have different LineBreak characters. This change also made it possible to differentiate between in-data line-breaks and CSV row line-breaks on the files I tested, so it should now replace the in-data line-breaks, like the multiple-choice fields some sites use, with semi-colons. That should make those records much easier to deal with & parse in later actions.
I also looked over a problem with in-data trailing commas. I added a line in the settings where anyone can toggle whether they want it to adjust for trailing OR leading commas in the data, it just can't handle both in one dataset. So if one column in one row has ",String1 String2" and another column in another row has "String 3 String4," then it will have errors.)

Google Drive Link: https://drive.google.com/file/d/1ZbhFGVKHSpaH2Duv8qXwMnNww8czdgc4/view?usp=sharing

Version 5

More adjustments for comma edge cases and for instances with one comma-containing value following another in the file data.

Google Drive Link: https://drive.google.com/file/d/1il_wI9fJRk11YaI4EPQvk2efrbBNRBwr/view?usp=sharing

Update 06/01/2022
Microsoft Power Platform & Paul Murana recently did a video demonstration of how to handle CSV & other files in dataflows: https://youtu.be/8IvHxRnwJ7Q
But it currently only outputs to a dataverse or Dataverse for teams table.

Fernando21 · ‎04-25-2023

@takolota

It is indeed space delimited %20 and we also can see the linebreak %0A

I've already setup %20 as a delimiter, it's too many columns.

example, BLA%20995%20BLAZ%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%0ABLA%20995%20BLAY%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%0A

takolota · ‎04-25-2023

So does the actual data look like…

BLA 995 BLAZ

All in one column?

And how many other columns are there actually in the data?

Because if the true delimiter is like 5 or 6 or 7… spaces, then you should be able to use that many spaces as the delimiter.

Fernando21 · ‎04-26-2023

Unfortunately no, this is some lines

BLA 995 BLAY9
BLA 995 BLAZ1 2 11 8 10 1 14 13
BLA 995 BLAY2
BLA 995 BLAX3
BLA 995 BLAZ5 18 16 15 34 19 22 18 26 21

BLA is a column, 995 is a column, BLAZ0 is a column then it contains between 0 and 10 numbers, so a maximum of 13 columns, as you can see sometimes it's empty.

Sometimes you could have a number in the 1st column and 1 in the tenth column and nothing between.

Manually we use the excel convert fonction, with delimited columns, column 1 is always 3 charcaters, column 2 is always 3 numbers, column three is 5 characters, etc...

Is it something we could do ?

Thanks again.

takolota · ‎04-26-2023

@Fernando21

Okay, based on all that you have shown I don't know of an easy way to convert my whole templates to your use-case, but you will probably need to use some similar things to get what you want in a custom build if you decide to continue this way.

If you do not have any in-data spaces, so spaces are only used as delimiters in your data...
You should be able to use a Select action with the From field set to a split of the data on the linebreak character. That expression will separate everything into an item for each line of the data. Then switch the Select action to the single-input box mode by pressing the small button on the right of the action. Now in the single box Map input use the expression trim(item()) so it removes an excess spaces / columns at the end or beginning of your line (but you will still have the same error if you have data at the end of lines with the like 60 spaces in-between).

From there you can use another Select action & set the From field to the Body of the previous Select action. Then using the table Map you can input the column names of your destination columns on the left side of the table & use an expression like split(item(), ' ')?[INSERT COLUMN NUMBER HERE STARTING FROM 0] on the right side to get the values for each of your columns.

The last piece is also used in this template: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/CSV-to-Dataset/td-p/1508191

Ginko · ‎05-15-2023

This is a wonderful tutorial and almost works perfectly for my use case, but I'm encountering an issue with " in my dataset.

Many of my data rows have a double-quote ( " ) in the item name to indicate inches, such as NDLE,18GX3.5",PINK HUB,STRL,SNGL USE.

The flow fails because it ends up splitting the row into this:

"NDL00201,NDLE###18GX3.5\",",
"PINK HUB,STRL,SNGL USE",
"2022-08-05T00:00:00,True,,2023-05-10T00:00:00",

When the flow should split the row like this:

"NDL00201,NDLE,18GX3.5",PINK HUB,STRL,SNGL USE,2022-08-02T00:00:00,True,,2023-05-10T00:00:00"

I believe the "QuoteComma" section is what needs adjusting to accommodate for " in the data, but I cannot find the correct adjustment. Do you have any advice for when " is used in the csv data?

takolota · ‎05-15-2023

@Ginko

That’s the 1st time I’ve had someone with a “, or a ,” in their data.

I don’t have a way to adjust for that as it is exactly the same characters that identify the end of a column.

Any way to replace that double quote in all your data with something else? Like two single quotes ‘’ ?

Ginko · ‎05-22-2023

I changed the data to single quotes and its working now, working on putting in special character limits in my app so users cannot put in double-quotes. Thanks!

takolota · ‎05-22-2023

@Ginko

If they need double quotes, that should be fine. It’s just the specific character combinations of ,” or “,

Double quotes without a comma immediately before or after should be fine

cc_c · ‎06-09-2023

Hello,

I think your solution will match to my delimiter problem, but i'm a newbie and I don't know how to import your ZIP into Power Automate ?

Thank you very much

Cedric

takolota · ‎06-09-2023

@cc_c

Around minute 3:30 this video shows how to import a flow

https://youtu.be/eCDL1Ix3aGY

There may be a small UI change where you have to select Import (Legacy)