Cumulative count between dates (Calculated column)

Stephane Forrer · March 4

Hi there

I'm looking to count the number of doses within a 2 day period like below (# doses within 2 days of initial is the calculated column)

Patient	Dose	Dose DateTime	# doses within 2 days of initial
1111	Initial	01Jan2024 10:00	3
1111	Initial	03Jan2024 10:00	1
1111	Initial	10Jan2024 10:00	1
1112	Initial	12Jan2024 10:00	1
1112	Initial	15Jan2024 10:00	2
1112	Initial	18Jan2024 10:00	1
1111	Additional	01Jan2024 18:00	3
1111	Additional	01Jan2024 19:00	3
1112	Additional	15Jan2024 12:00	2

I'm able to calculate the time since prior dose per patient but am now stuck adding a calculated column to count the additional doses:

Dose Ranked: Rank([Dose DateTime],[Patient])

TimeSinceDose: DateDiff("day", First([Dose DateTime]) OVER (Intersect([Patient],Previous([Dose Ranked]))), [Dose DateTime])

Thanks!

Stéphane

Gaia Paolini · March 6

In your data, there is no order of events, so no way to look for 'previous' or 'next' events, or assign the Additional doses to its own Initial dose.
Is there any way such assignment can be provided in the data? Otherwise, would a data function work for you?

Stephane Forrer · March 11

Thanks Gaia, I'm not able to do a sort in the data source as "Initial" and "Additional" are different flat files that have been joined in Spotfire.

In the past I have sorted similar data using the below data function although I was curious to see if I could make it more efficient by using an expression.

Robert Hood · March 11

try and add an index column, then reference the index rather than datetime. So patient 111 indexed lines are 1,23, and the next is 4,5,6 etc..? It shouldn't matter when you add new data lines because your next index number is always greater than the prior. I'm a rookie so I'm spitballing there.

Gaia Paolini · March 12

I could think of a way without data functions, but it is a bit involved, and the final data structure is not exactly what you wanted.
Maybe there is an easier way? Otherwise I would do it in a data function.

You start with one 'initial' dataset and one 'additional' dataset.
Read them in as separate data tables, with DoseDateTime read of type Datetime.
Change column names of Dose DateTime to 'Initial' and 'Additional' respectively.

Add a new table with the + symbol on the left, choose 'Other' then select the 'initial' table.
Rename it to e.g. 'combined'
In data canvas, add transformation 'add columns' and bring in the 'additional' dataset from the existing table in the analysis
In settings for added columns: join by Patient only, and select Full Outer Join as join type. In 'columns from new data' only import the column you named 'Additional'.

This will do a cartesian join within each [Patient], now you have to work out which rows make sense.

on the 'combined' table create a calculated column: [Delta], as
DateDiff('day',[Initial],[Additional])

Now create your target column [Goal] as:
1 + sum(Integer(([Delta]>=0) and ([Delta]<=2))) OVER ([Patient],[Initial])

the logical clause is: the time difference is more than zero and less than two days .
Integer() turns True/False into 1/0.
So when you sum, you ignore the zeros.
You add 1 as you need to count the initial dose itself
You group by Patient and initial dose.

Sign In

Cumulative count between dates (Calculated column)

Recommended Posts

Stephane Forrer

Link to comment

Share on other sites

Gaia Paolini

Link to comment

Share on other sites

Stephane Forrer

Link to comment

Share on other sites

Robert Hood

Link to comment

Share on other sites

Gaia Paolini

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Industries