Adee Ran Posted April 26, 2021 Share Posted April 26, 2021 New Spotfire user, so this may be a dumb question. I need to analyze data in CSV files that are generated programmatically. The files contain thousands of columns and often fewer rows than columns. Many of the columns contain only one distinct value per column, so they don't really contain information, but they consume a lot of space in the list of columns and make it hard to get around - so I'd like to hide them, remove them, convert them to table property, etc. - but with the huge number of columns, doing that manually is not practical. I cannot affect the creation of the input files. These "single-value" columns are typically the same ones for all input files so a one-time analysis to create a program is possible. However, I would prefer a more generic solution that goes over all columns, identifies the single-value ones, and handles them automatically. Is there a simple built-in way to do that in Spotify (Desktop 11.3) If not, I assume this can be done with some script, but I'm not at that level yet, so would welcome a solution. Link to comment Share on other sites More sharing options...
Fabian Duerr Posted April 27, 2021 Share Posted April 27, 2021 I'm still using 7.11 and I'm not aware of a built-in method to do this. Yes, you could do this with a data function where your full table is the input. Then you would check every column if you only have one unique value. And then remove all columns where this is the case. This gets very simple when you use dplyr. Your data function would look like this, where source.table is the input and new.table is the output. Both variable are of type TABLE: library(dplyr) new.table % select_if(~n_distinct(.)>1)If you have never used a data function before, you can check a little tutorial that I posted here: https://community.spotfire.com/questions/self-referential-calculated-column Link to comment Share on other sites More sharing options...
Gaia Paolini Posted April 27, 2021 Share Posted April 27, 2021 You could try also an Iron Python script (check which solution works better for you): check indentation when you copy and paste substitute 'yourtablename' with the name of your table limitation: if a column has nulls, these are counted as separate values in this script and the column will not be removed #modified from #https://community.spotfire.com/questions/how-delete-columns-effciently-spotfire-iron-python-script from System.Collections.Generic import List from Spotfire.Dxp.Data import * #the data table object from name provided dt = Document.Data.Tables['yourtablename'] rowCount = dt.RowCount rowsToInclude = IndexSet(rowCount,True) #start with empty list columnsToDelete = List[DataColumn]() for column in dt.Columns: values = column.RowValues.ValidRows cursor = DataValueCursor.CreateFormatted(column) values=list() for row in dt.GetRows(rowsToInclude,cursor): value = cursor.CurrentValue values.append(value) uniqueValues = set(values) #print uniqueValues.Count if (uniqueValues.Count==1): #print column.Name + ' ' + str(uniqueValues.Count) columnsToDelete.Add(column) #print 'there are ' + str(columnsToDelete.Count) + ' columns to delete' if columnsToDelete.Count != 0: dt.Columns.Remove(columnsToDelete) #print ('columns removed') Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now