I need to import many CSV files. The problem is that each of these CSV files has exactly the same name. However, each one sits in a different parent folder with a unique ID. How can I import these CSVs under a unique name so that the data stays separated?

Daniel ADams 2 · February 28, 2023

Fredrik Rosell · March 1, 2023

Hello,

This question is tagged (topics) both as Spotfire and Jaspersoft Studio.

To ensure that any responses are relevant for the particular product that you are using, please clarify what that product is.

Thanks!

/Fredrik

Daniel ADams 2 · March 1, 2023

Spotfire. I'm not sure how Jaspersoft got added. But perhaps there is a solution outside of Spotfire as well.

If there were only a few files, I would just do this manually. But I need to open thousands of files and import the data.

Fredrik Rosell · March 1, 2023

Thank you for the clarification! As you mentioned "I need to open thousands of files and import the data.", are you looking to import them all

a) into the same analysis, as separate data tables

b) into the same analysis, as one data table where you add rows (keeping the origin identifiable)

c) into separate analysis files

d) something else?

Daniel ADams 2 · March 1, 2023

Thank you for responding.

Each file will add rows to the same analysis. The problem is how to label the origin not by the file name (the names are all the same) but by the parent folder or even the file path. Is this possible?

Fredrik Rosell · March 6, 2023

Hello,

First of all, and this matters because you are dealing with so many files: I will assume for now that you have already concluded that it makes the most sense for your particular use case to merge the data inside Spotfire (rather than merging all of it BEFORE importing, using your text-file batch-handling solution of choice).
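For completeness, the merge-before-import route could look roughly like the sketch below. It runs in plain Python outside Spotfire and uses only the standard library. The function name `merge_csvs`, the column name `SourceFolder`, and the assumption that every file has the same header row are illustrative choices, not something taken from the thread.

```python
import csv
import os


def merge_csvs(root_folder, output_path, origin_column="SourceFolder"):
    """Merge every .csv under root_folder into one file.

    Each data row gets one extra column holding the name of the folder
    its file was found in. Assumes all files share the same header row.
    Returns the number of data rows written.
    """
    header_written = False
    rows_written = 0
    out_abs = os.path.abspath(output_path)
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for folder, dirs, files in os.walk(root_folder):
            dirs.sort()  # visit subfolders in a stable, alphabetical order
            for name in sorted(files):
                if not name.lower().endswith(".csv"):
                    continue
                path = os.path.join(folder, name)
                if os.path.abspath(path) == out_abs:
                    continue  # never read the merged file back in
                with open(path, newline="") as in_file:
                    reader = csv.reader(in_file)
                    header = next(reader, None)
                    if header is None:
                        continue  # skip empty files
                    if not header_written:
                        writer.writerow(header + [origin_column])
                        header_written = True
                    origin = os.path.basename(folder)
                    for row in reader:
                        writer.writerow(row + [origin])
                        rows_written += 1
    return rows_written
```

The merged file can then be imported as a single data table, with the extra column identifying where each row came from. Writing the output outside the root folder avoids any chance of the script picking up its own result on a later run.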

With the previous disclaimer in place, maybe one way to do this in Spotfire would be using an IronPython script. Here's something I tested to list all the csv files (and their parent folders) in subdirectories of a root folder.

    from Spotfire.Dxp.Data import *
    from Spotfire.Dxp.Data.Import import *
    from Spotfire.Dxp.Data import DataTableSaveSettings
    import clr
    import os

    # For this test, I get the root folder that contains all the files from a document property
    rootFolderPathProperty = "propCSVParentPath"
    rootFolderPath = Document.Properties[rootFolderPathProperty]
    # The new data table is called CSVData
    csvDataTableName = "CSVData"

    # Find the full path and immediate folder name of each of the CSV files
    filesFullPath = []
    parentDirs = []
    # os.walk() yields the subdirectories and files of the current directory,
    # then descends recursively into each subdirectory until the last one
    for root, dirs, files in os.walk(rootFolderPath):
        for file in files:
            if file.endswith(".csv"):
                filePath = os.path.join(root, file)
                filesFullPath.append(filePath)
                dirPath = os.path.dirname(filePath)
                parentDirs.append(os.path.basename(dirPath))
    print(filesFullPath)
    print(parentDirs)

    # Now add the files. Add the first file as a new data table,
    # and add rows to that one for all the other files.
    filePath = filesFullPath[0]
    # Specify any settings for the file
    settings = TextDataReaderSettings()
    settings.Separator = ","
    dataSource = TextFileDataSource(filePath, settings)
    dataManager = Document.Data
    # Data table name and data source for the new table
    dataManager.Tables.Add(csvDataTableName, dataSource)

In my example, having created a few CSV files in subdirectories of C:\Temp\csvimport, I get this from

>print(filesFullPath)

['C:\Temp\csvimport\folder1\data.csv', 'C:\Temp\csvimport\folder2\data.csv', 'C:\Temp\csvimport\folder3\data.csv']

>print(parentDirs)

['folder1', 'folder2', 'folder3']
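To make the folder-name step explicit, here is a small standalone sketch of how the two path calls turn a full path into its parent folder name. It uses `ntpath` (the Windows flavour of `os.path`) so that it behaves the same on any operating system. The helper name `parent_folder_name` and the dictionary `origin_by_path` are only illustrative.

```python
import ntpath  # Windows-style path handling; on Windows, os.path is this module

paths = [
    r"C:\Temp\csvimport\folder1\data.csv",
    r"C:\Temp\csvimport\folder2\data.csv",
    r"C:\Temp\csvimport\folder3\data.csv",
]


def parent_folder_name(file_path):
    # dirname() strips the file name; basename() keeps only the last folder
    return ntpath.basename(ntpath.dirname(file_path))


# Map each full path to the label that can identify its rows later
origin_by_path = dict((p, parent_folder_name(p)) for p in paths)
```

Keeping the path and its label together in one mapping avoids relying on two separate lists staying in the same order.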

I stopped there as I noticed that there is already an existing example that shows you how to do the rest - see "Appending Rows to a Table and Creating a New Column to Keep Track of the Origin of the Data in TIBCO Spotfire® Using IronPython Scripting"

https://community.spotfire.com/s/article/Appending-Rows-to-a-Table-and-Creating-a-New-Column-to-Keep-Track-of-the-Origin-of-the-Data-in-TIBCO-Spotfire-Using-IronPython-Scripting
