
I need to import many CSV files. The problem is that all of these CSV files have the exact same name. However, each one has a different parent folder with a unique ID. How can I import these CSVs under unique names to keep the data separated?


Daniel ADams 2


Thank you for the clarification! Since you mentioned "I need to open thousands of files and import the data", are you looking to import them all:

a) into the same analysis, as separate data tables

b) into the same analysis, as one data table where you add rows (keeping the origin identifiable)

c) into separate analysis files

d) something else?


Hello,

First, an important caveat, since you are dealing with so many files: I will assume for now that you have already concluded that, for your particular use case, it makes most sense to merge the data inside Spotfire (rather than merging all that data BEFORE importing, using your text-file batch-handling solution of choice).
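For comparison, the "merge before importing" alternative could look roughly like the sketch below. It is plain Python 3 run outside Spotfire, not IronPython. The function name `merge_csvs_with_origin` and the `SourceFolder` column name are made up for illustration, and it assumes every file has a header row with identical columns.

```python
import csv
import os


def merge_csvs_with_origin(root_folder, output_path, origin_column="SourceFolder"):
    """Merge every .csv under root_folder into one file.

    Each data row gets an extra column holding the name of the folder
    that directly contains the source file. Returns the number of data
    rows written. Assumes all files share the same header row.
    """
    header = None
    rows_written = 0
    output_abs = os.path.abspath(output_path)
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for root, dirs, files in os.walk(root_folder):
            dirs.sort()  # visit subfolders in a predictable order
            for name in sorted(files):
                if not name.lower().endswith(".csv"):
                    continue
                path = os.path.join(root, name)
                if os.path.abspath(path) == output_abs:
                    continue  # never read the file we are writing
                with open(path, newline="") as in_file:
                    reader = csv.reader(in_file)
                    file_header = next(reader, None)
                    if file_header is None:
                        continue  # empty file, nothing to merge
                    if header is None:
                        # First file seen: write the shared header once
                        header = file_header
                        writer.writerow(header + [origin_column])
                    elif file_header != header:
                        raise ValueError("Header mismatch in " + path)
                    origin = os.path.basename(root)
                    for row in reader:
                        writer.writerow(row + [origin])
                        rows_written += 1
    return rows_written
```

The merged file can then be imported as a single data table, with the extra column keeping each row traceable to its folder ID.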

With the caveat above in place, one way to do this in Spotfire would be an IronPython script. Here's something I tested that lists all the CSV files (and their parent folders) in the subdirectories of a root folder, and then adds the first file as a new data table.

from Spotfire.Dxp.Data import *
from Spotfire.Dxp.Data.Import import *
from Spotfire.Dxp.Data import DataTableSaveSettings
import clr
import os

# For this test, I get the root folder that contains all the files from a document property
rootFolderPathProperty = "propCSVParentPath"
rootFolderPath = Document.Properties[rootFolderPathProperty]

# The new data table is called CSVData
csvDataTableName = "CSVData"

# Find the full path and immediate folder name of each of the CSV files
filesFullPath = []
parentDirs = []

# os.walk() yields (root, dirs, files) for the root folder and then,
# recursively, for every subdirectory below it
for root, dirs, files in os.walk(rootFolderPath):
    for file in files:
        if file.endswith(".csv"):
            filePath = os.path.join(root, file)
            filesFullPath.append(filePath)
            dirPath = os.path.dirname(filePath)
            parentDirs.append(os.path.basename(dirPath))

print(filesFullPath)
print(parentDirs)

# Now add the files. Add the first file as a new data table, and add rows to that one for all the other files.
filePath = filesFullPath[0]

# Specify any settings for the file
settings = TextDataReaderSettings()
settings.Separator = ","
dataSource = TextFileDataSource(filePath, settings)
dataManager = Document.Data

# Data table name and data source used to add the table
dataManager.Tables.Add(csvDataTableName, dataSource)

In my example, having created a few CSV files in subdirectories of C:\Temp\csvimport, I get this from

>print(filesFullPath)

['C:\Temp\csvimport\folder1\data.csv', 'C:\Temp\csvimport\folder2\data.csv', 'C:\Temp\csvimport\folder3\data.csv']

>print(parentDirs)

['folder1', 'folder2', 'folder3']
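If the goal is separate data tables instead (option a), each table needs its own name. A minimal sketch of deriving one from each file's parent folder is below. The helper `table_names_by_folder` and the `CSVData_` prefix are hypothetical, and it assumes the immediate parent folder names really are unique.

```python
import os


def table_names_by_folder(csv_paths, prefix="CSVData_"):
    """Map a table name built from each file's parent folder to the file path.

    Raises ValueError if two files would produce the same table name.
    """
    names = {}
    for path in csv_paths:
        folder = os.path.basename(os.path.dirname(path))
        name = prefix + folder
        if name in names:
            raise ValueError("Duplicate parent folder name: " + folder)
        names[name] = path
    return names
```

Each name and path pair could then be used in the same way as csvDataTableName and filePath in the script above, with one Tables.Add call per file.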

I stopped there as I noticed that there is already an existing example that shows you how to do the rest - see "Appending Rows to a Table and Creating a New Column to Keep Track of the Origin of the Data in TIBCO Spotfire® Using IronPython Scripting"

https://community.spotfire.com/s/article/Appending-Rows-to-a-Table-and-Creating-a-New-Column-to-Keep-Track-of-the-Origin-of-the-Data-in-TIBCO-Spotfire-Using-IronPython-Scripting
