Jump to content

Cumulative count between dates (Calculated column)


Go to solution Solved by Gaia Paolini,

Recommended Posts

Hi there

I'm looking to count the number of doses within a 2 day period like below (# doses within 2 days of initial is the calculated column)

Patient Dose Dose DateTime # doses within 2 days of initial
1111 Initial 01Jan2024 10:00 3
1111 Initial 03Jan2024 10:00 1
1111 Initial 10Jan2024 10:00 1
1112 Initial 12Jan2024 10:00 1
1112 Initial 15Jan2024 10:00 2
1112 Initial 18Jan2024 10:00 1
1111 Additional 01Jan2024 18:00 3
1111 Additional 01Jan2024 19:00 3
1112 Additional 15Jan2024 12:00 2

I'm able to calculate the time since prior dose per patient but am now stuck adding a calculated column to count the additional doses:

Dose Ranked: Rank([Dose DateTime],[Patient])

TimeSinceDose: DateDiff("day", First([Dose DateTime]) OVER (Intersect([Patient],Previous([Dose Ranked]))), [Dose DateTime])

 

Thanks!

Stéphane

Link to comment
Share on other sites

In your data, there is no order of events, so no way to look for 'previous' or 'next' events, or assign the Additional doses to its own Initial dose.
Is there any way such assignment can be provided in the data? Otherwise, would a data function work for you?

Link to comment
Share on other sites

Thanks Gaia, I'm not able to do a sort in the data source as "Initial" and "Additional" are different flat files that have been joined in Spotfire. 

In the past I have sorted similar data using the below data function although I was curious to see if I could make it more efficient by using an expression.

 

Link to comment
Share on other sites

try and add an index column, then reference the index rather than datetime. So patient 111 indexed lines are 1,23, and the next is 4,5,6 etc..?  It shouldn't matter when you add new data lines because your next index number is always greater than the prior. I'm a rookie so I'm spitballing there.  

  • Like 1
Link to comment
Share on other sites

  • Solution

I could think of a way without data functions, but it is a bit involved, and the final data structure is not exactly what you wanted. 
Maybe there is an easier way?  Otherwise I would do it in a data function.


You start with one 'initial' dataset and one 'additional' dataset.
Read them in as separate data tables, with DoseDateTime read of type Datetime.
Change column names of Dose DateTime to 'Initial' and 'Additional' respectively.

Add a new table with the + symbol on the left, choose 'Other' then select the 'initial' table.
Rename it to e.g. 'combined'
In data canvas, add transformation 'add columns' and bring in the 'additional' dataset from the existing table in the analysis
In settings for added columns: join by Patient only, and select Full Outer Join as join type. In 'columns from new data' only import the column you named 'Additional'.

This will do a cartesian join within each [Patient], now you have to work out which rows make sense.

on the 'combined' table create a calculated column: [Delta], as
    DateDiff('day',[Initial],[Additional])
    
Now create your target column [Goal] as:
   1 + sum(Integer(([Delta]>=0) and ([Delta]<=2))) OVER ([Patient],[Initial])
   
 the logical clause is:  the time difference is more than zero and less than  two days .
 Integer() turns True/False into 1/0.
 So when you sum, you ignore the zeros.
 You add 1 as you need to count the initial dose itself
 You group by Patient and initial dose.

image.thumb.png.fa1d07725ac06ba150e6060dc0d6f4c8.png

 

  • Like 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...