Jump to content

Filter rows with delimiter-separated values based on a "Contains" logic


RobotZebra
Go to solution Solved by barchiel33,

Recommended Posts

Hello,

I have several columns that contain delimiter-separated values (in this case, a tab, but I can change this to a comma or another delimiter) and I would like to allow users to filter based on the presence of any item within the delimiter-separated list. For example, my table looks like this:
 

Column A Column B
Banana, Apple, Orange Bread, Rice, Pasta
Apple, Banana Pasta

A listbox filter would look like this for Column A:

Banana, Apple, Orange
Apple, Banana

The user would have to select both the first and second filter options to display the two rows, but I would like to have them select only "Banana" or "Apple" to show both rows since they both contain those words. I have a document property which is a list that contains all the unique values from the column, but I'm not sure if it's possible to create a filter using this document property. Is there any way to create a filter using the list of unique values?

Link to comment
Share on other sites

This is obviously a simplified version of the actual problem you are trying to solve, but hopefully it won't take too many iterations to get to a solution.

Using this example, I used a case() statement and the find() function to create a new column, which you can use as a filter:

case  when Find(DocumentProperty("your document property"),[ColA])>0 then "Yes" else "No" end as [Found]

So if you select 'Banana' or 'Apple' in the document property then you'll have a new column called 'Found' that will show "Yes" or "No", then you can filter to only show "Yes" values.

Link to comment
Share on other sites

Hi Kirsten, thank you for your quick reply. Your suggestion is a good idea, but the document property can be quite lengthy in the actual visualization (between 15-20 distinct values, potentially more in the future) and I would like for the property to be populated upon dashboard initialization so it's always up to date. Is there a way to populate a drop down or listbox filter, ideally a multiple select with the items in the document property, then use this to filter the data table? Or if you have any other suggestions, that would be appreciated! Thank you.

-Tiffany

Link to comment
Share on other sites

Hi David, thanks for your reply. I'm not sure this would work since it seems the listbox would still display the values as they are listed in the column. For example, row A would have 'Banana, Apple, Orange' and row B would have 'Apple, Banana', so a user cannot see all rows where 'Banana' or 'Apple' exists in the column without selecting the listbox option that shows the exact string in the column. In this simplified scenario, they would have to select both listbox options, which for a large data table with many different combinations of "fruits' could be quite inconvenient. 

Is there any other way of doing a 'Contains' type search where the user can select from a list of distinct values?

Link to comment
Share on other sites

Hi David, thanks for your other suggestion. Unfortunately it won't work since the 'search' functionality I am looking for is a bit more complicated. I am using a different approach but would like to ask for help in resolving the last step (#5). This is my workflow:

1. On dashboard launch, an IronPython script runs which iterates through every row in 3 different columns to create 3 lists that contain the unique values for those columns. E.g. List 1 would be 'x, y, z' and List 2 could be 'a,b,c'. Those lists are assigned to document properties. These document properties contain all the possible unique values that user could filter by.

2. I use a data function to write those document properties to three different data tables.

3. I insert three multi-select property controls, the values for which are set through the 'Unique values in column' option which come from the new data tables. Essentially, List 1 becomes Table 1, which allows me to set the list box property control with values 'a,b,c'.

4. When the user selects value(s) from the list box controls, this selection will be stored in a different set of document properties, e.g. List1Selected.

5. When the value of List1Selected is updated so that its value becomes ['x', 'y'] for example, I would like to execute the following IronPython script which iterates over every row and checks if the list values are a subset of the values in that row. For example, if the row contains the values "y, x, a", then I want to return a value of 'True' in a calculated column called 'IsFound' since x and y both appeared in the row value, regardless of the order in which they appear.

from Spotfire.Dxp.Data import *

#I convert the list to a set because I want to compare each item in the set, regardless of order.
values = set(Document.Properties["List1Selected"])
cursor1 = DataValueCursor.CreateFormatted(dataTable.Columns["Criteria"])
cc = Document.Data.Tables["myTable"].Columns["CriteriaFound"]

for row in dataTable.GetRows(cursor1):
    list = cursor1.CurrentValue.split("\t") #I have to do this since my Criteria column is tab-delimited
    values_set = set(list)
    result = values.issubset(values_set)
    cc = result

The script returns the correct values in the output window, but the values do not appear in the calculated column. How do I write those returned values to the corresponding row in the calculated column? I'd like my final result to be:

If List1Selected is ['x', 'y']:

Criteria Criteria Found
y, x, a True
x False

 

Link to comment
Share on other sites

Hi Tiffany,

Not sure if I understand what you mean by "The script returns the correct values in the output window, but the values do not appear in the calculated column."

What are the values that your script returns? And what output do you want to appear in a calculated column (I see you are using the cc variable twice in your script, is that intentional)?

I would think that your script output still needs to be send to a(nother) Document Property, so your calculated column can pick it up. And I can't see that in your script. Is that the correct? If so, you probably want to add a line saying Document.Properties("Final Output")='script output'.

Kind regards,

David

Link to comment
Share on other sites

Hi David, 

Thanks for your reply. The screenshot below shows the output of the script which is either True or False for every row that is evaluated. This value is the output that I would like to appear in the calculated column. Essentially, in the script, I am trying to iterate through every row and compare the values in a document property to the values in each row of a specific column, and if the values in the Document Property are a subset of the value in the column (for that row), it will return True. So rather than trying output the whole list of True/False values to a single document property, I need to write the output of each iteration back to the calculated column (so the calculated column will be either True or False for that row). Hope this helps to clarify. 

image.png.f0d62e9a48d76817bfd22fc5d192ba86.png

Kind regards,

Tiffany

Link to comment
Share on other sites

  • Solution

I'm thinking you'd have an easier time using a Python data function for what you're trying to do. I would think something like:

 li = list(search_list.split(', '))
output = text_column.apply(lambda row: all(x in row for x in li))

where search_list is a string document property containing a comma-space separated list of the terms to seach for, text_column is a column from the data table that you want to search for the terms, and output is a column output which will contain true if all terms in search_list are in the corresponding row from the text column and false if any term isn't.

Link to comment
Share on other sites

  • 2 weeks later...

Thank you @barchiel33! I'm posting my full solution here in case this can help others in the future. The following is a bit of a workaround to create a filter that will return rows which contain keywords in any order selected in a multi-select list box. I am using this to filter on columns which contain tags which may appear in any order, and which are likely to change. For example, the tags could be "Apple, Banana, Orange" or "Banana, Apple, Pineapple" and I want to present the user with a list of distinct fruits. The user would want to find all rows where "Banana" appears in the list of tags - regardless of where it appears in the order or if there are other tags present. 

1. On dashboard launch, an IronPython script runs which iterates through every row in 3 different columns to create 3 lists that contain the unique values for those columns. E.g. List 1 would be 'x, y, z' and List 2 could be 'a,b,c'. Those lists are assigned to document properties. These document properties contain all the possible unique values that user could filter by. To do this, create a custom date time function and write the output to a document property as described here. 

 

2. Create the following script that will execute every time the document property changes:

Script that loops through all the columns containing the values you want to filter by and stores those values in separate document properties

from System import Array

#create a list of columns to loop through
columns = ['ColumnNames']
#Create a dictionary to store values for each column so they don't get overwritten
vals_dict = {}
#Loop through each list item and create a list called 'vals' for each item. This will go in the dictionary.
for i in columns:
    nodes=Document.Data.Tables['TableName'].Columns[i].Hierarchy.Levels.LeafLevel.TryGetNodes(int.MaxValue)
    vals = []
#Loop through each node, create an array of the values
    for nval in nodes[1]:
        #removes the 'tab' spacing between the different options in the column
        array = []
        array.Add(nval.FormattedValue.split("\t"))
        #Create a flat list as the above code will generate lists within a list
        flatList = [element for innerList in array for element in innerList]
        #Use the set function to create a list containing only the unique value from the list generated above in the previous step.
        vals.extend(flatList)
        #Append a static string "Show All" to the list (removing this for now)
        #vals.append("-Show All-")
#Store the values for the current column in the dictionary
    vals_dict[i] = list(set(vals))

#Assign the list values to document properties
delimiter = ","
for name, lst in vals_dict.items():
    listname = name+"List"
    Document.Properties[listname] = delimiter.join(lst)

3.  Use a data function to write those document properties to three different data tables.

Data Function to write document property values to data table rows, where the input parameters are the document properties and your output parameters are new tables.

a<- strsplit(x, ",")
OutputTable1 <- as.data.frame(a, col.names = "Col1")

#repeat for as many columns as you need

4. I insert the multi-select list box property controls, the values for which are set through the 'Unique values in column' option which come from the new data tables. Essentially, List 1 becomes Table 1, which allows me to set the list box property control with values 'a,b,c'.

5. Create a data function that will compare the values selected in the multi-select list boxes, iterate over each row in the corresponding column and check if any of the values in the document property match any of the values present in the column. Please note - an error will be thrown if you try to use a string list document property in a data function. A workaround is documented here: https://support.tibco.com/s/article/Tibco-KnowledgeArticle-Article-43929

Script

#This searches your column(s) for any matches
li =  list(search_list.split(','))

#Use fillna() to account for columns where no tags exist.
output = column.fillna('').apply(lambda row: "True" if "-Show All-" in li else any(x in row for x in li))

6. Finally, use the limit Data Using Expression in the table to filter out the rows where the column = True:
Case when "${MultipleSelectDocumentProperty}" = "" then 
true else
[ColumntoFilter] = "True"
end
 

  • Like 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...