Template for converting large CSV files to JSON, then sending the data to a table or list.
This CSV parsing works even if there are commas in the CSV data. The only requirement is that the CSV file must follow the standard of placing double quotes around any item with in-data commas.
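For example, a row like this keeps the in-data comma because the name is wrapped in double quotes (sample values only):
"Smith, John",jsmith@example.com,Volunteer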
Set Up
Go to the bottom of this post & download the CSVToJSON_1_0_0_xx.zip file. Go to the Power Apps home page (https://make.powerapps.com/). Select Solutions on the left-side menu, select Import solution, Browse your files & select the CSVToJSON_1_0_0_xx.zip file you just downloaded. Then select Next & follow the menu prompts to apply or create the required connections for the solution flows.
Find the CSV To JSON solution in the list of solutions. Select it. Then find the CSV To JSON flow inside the solution package.
Once in the flow, go to the Get file content action & select the CSV you want to work with. Alternatively you could use a different action to get the CSV content as text.
After selecting your CSV, go to the Compose CSV data + Set-up action. If you used a different action to get the CSV file content as text, then you will need to input the output of that action in the CSV_Data parameter value. (The get file content action you use may return the CSV file as base64, in which case you must wrap it in a base64ToString( ) expression in the input.)
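As a sketch, if the content comes back base64-encoded, the CSV_Data input could be wrapped like this (the action name 'Get_file_content' and the ['$content'] path are assumptions that depend on which connector you use):
base64ToString(outputs('Get_file_content')?['body']?['$content'])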
Adjust the rest of the parameters for your file. For example if your CSV has extra lines at the header or footer of the sheet/file, then account for those in the HeaderRowsToSkip & FooterRowsToSkip.
After that, move down to the Select CSV to JSON action. Input the header column names on the left side of the mapping & use expressions like split(item(), outputs('Compose_CSV_data_+_Set-up')['NewDelimiter'])?[INSERT COLUMN NUMBER HERE STARTING FROM 0] on the right side of the mapping to match up their values. For example, in the preceding volunteer contacts CSV example, First Name was the 0 index column so its value expression is split(item(), outputs('Compose_CSV_data_+_Set-up')['NewDelimiter'])?[0] and Email is the 3rd index column so its value expression is split(item(), outputs('Compose_CSV_data_+_Set-up')['NewDelimiter'])?[3]
After that, we need to get the JSON schema for the Parse JSON action. So go & run the flow once. Then in the flow run, go to the outputs of the Select CSV to JSON action & copy the JSON output there.
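The copied output should be an array of JSON objects whose keys match your mapping, something like this (values here are made up, using the volunteer example's columns):
[ { "First Name": "John", "Email": "jsmith@example.com" } ]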
Go back to the flow editor, go to the Parse JSON action, select Use sample payload to generate schema, & paste the JSON output to the sample payload menu. Select Done.
Now your CSV columns should be available as dynamic content from the Parse JSON action & you can insert them wherever you need in the rest of your flow.
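For instance, if you loop over the Parse JSON Body with an Apply to each, a column can be referenced with an expression like items('Apply_to_each')?['First Name'] (the loop name and the 'First Name' column are placeholders from the volunteer example; in most cases you can just pick the column from the dynamic content menu instead).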
(If you can export as a semicolon, tab, or other delimited file, then you can probably just use a simpler method like Paul’s here: https://www.tachytelic.net/2021/02/power-automate-parse-csv/?amp)
For more information on the delimiter change piece, visit this previous post:
To make a flow to send any CSV data to a new Excel table with given CSV header names without any column mappings, check this template:
https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/CSV-To-New-Excel-Table/m-p/1826096#M964
*Copying the template into an existing flow may create issues with expressions. You may need to copy the rest of your existing flow into the CSV template flow, then move the template scope where you need it.
CSV To JSON Version 3
(More minor fixes & additions.
I adjusted several expressions so it can now handle a few more scenarios with arrays in the CSV data. It should handle any array that doesn't include double quotes and any array that is all strings with double quotes, so ["String1", "String2", "String3"], but it will have issues if it is a mixed array with some double-quoted strings and some other values, for example ["String", 4, 03/05/2022, "String2"] won't work.
I also adjusted how the LineBreak setting is set up so it now uses \r\n for the LineBreak. I also provided a link in the flow so anyone can look up the right string for the decodeUriComponent expression(s) if they happen to have different LineBreak characters. This change also made it possible to differentiate between in-data line-breaks and CSV row line-breaks on the files I tested, so it should now replace the in-data line-breaks, like the multiple-choice fields some sites use, with semicolons. That should make those records much easier to deal with & parse in later actions.
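For reference, the standard Windows-style \r\n line-break used here can be produced with decodeUriComponent('%0D%0A').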
I also looked over a problem with in-data trailing commas. I added a line in the settings where anyone can toggle whether they want it to adjust for trailing OR leading commas in the data; it just can't handle both in one dataset. So if one column in one row has ",String1 String2" and another column in another row has "String3 String4," then it will have errors.)
CSV To JSON Auto Columns Version 1
An additional flow version that automatically does the JSON key-value matching. Useful if you have use-cases with dynamic CSV files that may change each time the flow is run.
If you have any trouble with the standard legacy flow import method, you can also try an alternate Power Apps Solution import method here: Re: CSV to Dataset - Page 8 - Power Platform Community (microsoft.com)
Thanks for any feedback,
Please subscribe to my YouTube channel (https://youtube.com/@tylerkolota?si=uEGKko1U8D29CJ86).
And reach out on LinkedIn (https://www.linkedin.com/in/kolota/) if you want to hire me to consult or build more custom Microsoft solutions for you.
https://www.youtube.com/watch?v=-P-RDQvNd4A
My first guess is it may be a linebreak character issue, since that action by default is set to split on the linebreak characters, then skip the 1st header row.
Do you have the right linebreak characters set up for your data in the Data + Set-up action?
If they are something other than \r\n, then you may have to adjust the set linebreak characters there.
Otherwise it’s going to perform a split on the wrong characters, only return 1 row/item, then skip that 1 row/item. Then it probably returns a blank like the issue you have.
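For reference, if your file only uses \n line-breaks, the LineBreak value would be decodeUriComponent('%0A') instead of decodeUriComponent('%0D%0A') for \r\n (these are the standard URL-encoded values; check your file to see which it actually uses).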
I finally got this Flow to work to automate parts of my task and I'm looking to modify it a bit. Instead of having to select the specific csv file, I would like to point to a Onedrive folder and process all CSV files in it.
I'm still fairly new to the Power Automate tools, and I've been attempting to build the "Apply to each" control, including changing the 'Body' reference to body('Find_files_in_folder'), with the following Flow components (essentially, I tried to embed all the functional parts and nest them in the 'Apply to each' component):
This is my set of resulting errors:
@takolota , what approach would you suggest I take to accomplish this?
BTW @cdn4lf ,
I was getting the same results as you, which mirrors some issues I experienced in other flows. I finally took a look at the datasets I used for testing. It turns out that when I open them in TextEdit 8 (I keep forgetting to NOT use Notepad), there are double quotes being used in one dataset, and no quotes at all in the other dataset.
VS.
Maybe take a look at the Compose CSV data + Setup section. Try using only one of the functions that is suited to your data format. Mine required the DecodeUriComponent (A) function and I removed the (d) function.
Thoughts?
@Virtualizeit
I would need to see more of the flow to ensure I'm catching all the errors, but the major thing I see right now is you don't have anything getting the file content for each file.
So you can try redoing this set-up from the CSVtoJSON template, but instead of replacing the "Get file content" action, you should just add the List files in folder action above it. Then insert the "Id" dynamic content in the file picking field for the "Get file content" action. That should add the apply to each loop automatically so it can loop through each file Id & grab each file content (I know it can be a little confusing because most OneDrive actions do not actually grab the file content, just the meta-data & such, and you have to get the content in a separate action).
From there you can add the rest of the CSV parsing actions below the "Get file content" action in the apply to each loop.
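(For reference, once the loop is generated, the file-picking field usually resolves to an expression along the lines of items('Apply_to_each')?['Id'] — the loop name and property name here are assumptions that depend on how your actions are named, so picking the Id from the dynamic content menu is the safer route.)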
However, do all the CSV files have the same structure & columns? If not, then this still won't work very well as you have to manually change the column headers in the "Select CSV to JSON" action.
So what are you ultimately trying to do with this? Are you trying to enter everything into a database like SharePoint lists or SQL? Are you just trying to convert all these to Excel files?
EXCELLENT suggestion!!
So I added the components and I think I made the right adjustments.
1. I ensured there are only dataset files that have identical structures in a specific folder.
2. I changed the file dynamic content (which I believe is where my issue is)*.
3. I changed the 'Compose CSV data + Setup' dynamic component from 'Body' to 'Id'.
4. Oddly, I believe the 'value' dynamic component was set dynamically; I tested changing it, but with no luck, so it stays.
When I remove the 'Scope CSV to JSON' object, the rest iterates without issues and I see some valid output data, so when I put it back in, I get the error. As I mentioned, I believe the issue is setting the 'Get file content' action's 'File' field. I only see an 'Id' object, and NOT an 'Id dynamic content' object. Shouldn't I see one?
Nonetheless, this is the error (btw, it also doesn't want to save):
I was given a python script that I've scheduled to download (2) datasets ("applications.date.csv" and "sessions.date.csv") from AWS daily, so I now have a collection of CSV files, which have the same column structures respectively, that I need to put into Dataverse. So once I have a functional automated Flow process, I will duplicate it for a secondary dataset (i.e., sessions) that will also go into Dataverse, forming a database (each table shares a primary/secondary Idkey table). I will then need to figure out how to join the tables based on Idkey so I can do lookups that I would like to produce weekly.
So ULTIMATELY I want to automate this WHOLE thing to 1) collect data into Dataverse, 2) do a lookup of the last 7 days, and populate a few data points from that lookup into a master Excel spreadsheet (giving me 52 weekly recordings for the year), and 3) do this on an automated schedule.
As long as my dataset stays the same your CSV to JSON should work brilliantly for my project.
Clearly, I have some work ahead of me.
All your help is greatly appreciated.
... and this may seem like adding complexity, but if the issue is getting actual dynamic content from OneDrive, could I add additional steps to move the files to SharePoint and then build an 'Apply to each' from that list?
Thanks for that context.
I see in your picture that you also put the Id in the CSV_Data input on the settings compose. You should leave that as the “File content” dynamic content (dynamic content is just the name for all the items in the pop-up menu).
Also, looking at your intended use-case, you may have an easier time with the dataflows tool, as that can process & send CSV data to Dataverse pretty easily; it just currently doesn't send data anywhere else yet.
You also mentioned joining tables to do lookups. I have done some basic joining in Power Automate before but it is more complicated & tedious. Probably better to join anything by relating tables in Dataverse. Unless you’re just talking about appending, that may be easier to do.
But overall the end result of aggregating something by week for an entire year for some output table or visual sounds like a good use-case for Power BI if you’re able to use it.
Hi!
First of all, thank you for this flow. But I cannot get it to work.
My data input:
My compose reformat back to CSV looks like this:
But my Select to JSON is empty...
My settings in the Select from JSON are:
I followed up on your suggestion of the other method (and a few others). I eventually hit a stumbling block, so I revisited this method, since I feel it is not only more flexible for my other needs, but I've also invested quite a bit of time getting this Flow developed this far, and I see light at the end of the tunnel here.
That being said, I believe I fixed the errors I was receiving in the earlier post. In fact, the flow now runs successfully without errors, BUT there isn't any output. More specifically, the input of the Compose CSV data + Setup has an extra set of quotes and a set of brackets added to the 1st transformation line. As you suggested, I simply added the List files in folder action, followed by an Apply to each action. Below is a copy of the successful flow passing data properly, followed by the modified Flow in which I inserted the List files in folder action that won't pass the data properly. I've narrowed down the DIFFERENCE between the two to where this extra set of quotes and brackets show up. How do I get them removed?
Successful Flow pointing to a single source file:
Modified Flow Getting a list of files, and Apply to each:
The Inputs for the Compose CSV data + Setup scope are just slightly different, and I verified where the difference is by looking at the details of the input.
I think if I can get those extra quotes and brackets removed, this Flow will complete my automation task.
How do I accomplish this?
@Virtualizeit
Good to hear you have more of it working for you.
For this problem, if all the files have those extra characters at the front, you should be able to use a skip( ) expression to skip the first 2-3 characters on the input in that CSV_Data piece.
But I can guess that if those are at the front, then they are also likely at the back. For that you may need to do a
take(InputData, sub(length(InputData), Num#CharRemoveFromEnd))
Then add in the skip to remove the first few characters on that...
skip(take(InputData, sub(length(InputData), Num#CharRemoveFromEnd)), Num#CharRemoveFromStart)
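If the skip( ) piece gives you trouble on plain text, an equivalent sketch using substring( ) would be (assuming the file content comes from an action named Get file content and there are 2 extra characters at each end; adjust the numbers to match your data):
substring(outputs('Get_file_content')?['body'], 2, sub(length(outputs('Get_file_content')?['body']), 4))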
You are correct, the extra characters show up at the end as well. I've tried to use the same datasets between tests for consistency, so I can't seem to find what causes the additional characters when the only change in the Flow was the additional actions (List files in folder and Apply to each). When that changes, the extra characters show up.
I'd love to try your suggestion and forgive my ignorance, but where exactly would I insert and use your suggested code...
skip(take(InputData, sub(length(InputData), Num#CharRemoveFromEnd)), Num#CharRemoveFromStart)
Anywhere in this section?