Jump to content

Sum data in one column depending on distinct other column


Hauke Verrept

Recommended Posts

Hi All,

I am trying to sum one column for all distinct values in another column. This to avoid counting duplicates twice. I have been experimenting with different forms of sum(if() and Distinct sumI9if( etc but can't seem to get it to work. The lines are not 100% identical due to some contaminated data, so just removing all exactly equal lines only solves part of the problem.

 

 

Example of dataset

 

 

RecID

Datetimes of event

timespan

 

 

1

15/01/2020 0900

1:30

 

 

2

15/01/2020 0900

1:30

 

 

3

15/01/2020 1000

00:45

 

 

4

16/01/2020 1500

2:30

 

 

 

 

 

 

 

 

 

 

 

 

 

I would need to sum the timespans for unique datetimes of event. The correct solution in this case would be: 1:30+00:45+2:30 = 4:45

If i count the duplicate my result becomes 6:15

Thanks!

Hauke

Link to comment
Share on other sites

Starting in Spotfire version 10.1, there is a "Filter Rows" transformation which allows you to remove any duplicates from the data table completely and only retain one row per defined group. This can be accomplished via:

 

Data > Transform data > Transformation: Filter Rows > Insert

 

The following expression can be used for it:

Rank(baserowid(),"asc",[Datetimes of event],[timespan])=1

Link to comment
Share on other sites

What is your expected output A calculated column or a calculated value Or are you working with a cross table

 

Most importanty, what makes your column unique How is your data contaminated Please show an example.

 

One approach could be to create a calculated columns that returns the "First Hit" for each unique row. Here, I use a boolean expression to get True/False:

 

If(Rank([Datetimes],"ties.method=first",[Datetimes],[Timespan])=1,TRUE,FALSE)

 

And then you could sum if TRUE 

 

If([First Hit],Sum([Timespan]))

 

If you are working with a cross table you could simply filter out FALSE in the "First Hit" column

 

As suggested by shaggarw I would recommend to get your data clean before you continue with further analysis. The only thing you need to know: What makes your data row unique

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...