Stephane Forrer Posted March 4 Share Posted March 4 Hi there I'm looking to count the number of doses within a 2 day period like below (# doses within 2 days of initial is the calculated column) Patient Dose Dose DateTime # doses within 2 days of initial 1111 Initial 01Jan2024 10:00 3 1111 Initial 03Jan2024 10:00 1 1111 Initial 10Jan2024 10:00 1 1112 Initial 12Jan2024 10:00 1 1112 Initial 15Jan2024 10:00 2 1112 Initial 18Jan2024 10:00 1 1111 Additional 01Jan2024 18:00 3 1111 Additional 01Jan2024 19:00 3 1112 Additional 15Jan2024 12:00 2 I'm able to calculate the time since prior dose per patient but am now stuck adding a calculated column to count the additional doses: Dose Ranked: Rank([Dose DateTime],[Patient]) TimeSinceDose: DateDiff("day", First([Dose DateTime]) OVER (Intersect([Patient],Previous([Dose Ranked]))), [Dose DateTime]) Thanks! Stéphane Link to comment Share on other sites More sharing options...
Gaia Paolini Posted March 6 Share Posted March 6 In your data, there is no order of events, so no way to look for 'previous' or 'next' events, or assign the Additional doses to its own Initial dose. Is there any way such assignment can be provided in the data? Otherwise, would a data function work for you? Link to comment Share on other sites More sharing options...
Stephane Forrer Posted March 11 Author Share Posted March 11 Thanks Gaia, I'm not able to do a sort in the data source as "Initial" and "Additional" are different flat files that have been joined in Spotfire. In the past I have sorted similar data using the below data function although I was curious to see if I could make it more efficient by using an expression. Link to comment Share on other sites More sharing options...
Robert Hood Posted March 11 Share Posted March 11 try and add an index column, then reference the index rather than datetime. So patient 111 indexed lines are 1,23, and the next is 4,5,6 etc..? It shouldn't matter when you add new data lines because your next index number is always greater than the prior. I'm a rookie so I'm spitballing there. 1 Link to comment Share on other sites More sharing options...
Solution Gaia Paolini Posted March 12 Solution Share Posted March 12 I could think of a way without data functions, but it is a bit involved, and the final data structure is not exactly what you wanted. Maybe there is an easier way? Otherwise I would do it in a data function. You start with one 'initial' dataset and one 'additional' dataset. Read them in as separate data tables, with DoseDateTime read of type Datetime. Change column names of Dose DateTime to 'Initial' and 'Additional' respectively. Add a new table with the + symbol on the left, choose 'Other' then select the 'initial' table. Rename it to e.g. 'combined' In data canvas, add transformation 'add columns' and bring in the 'additional' dataset from the existing table in the analysis In settings for added columns: join by Patient only, and select Full Outer Join as join type. In 'columns from new data' only import the column you named 'Additional'. This will do a cartesian join within each [Patient], now you have to work out which rows make sense. on the 'combined' table create a calculated column: [Delta], as DateDiff('day',[Initial],[Additional]) Now create your target column [Goal] as: 1 + sum(Integer(([Delta]>=0) and ([Delta]<=2))) OVER ([Patient],[Initial]) the logical clause is: the time difference is more than zero and less than two days . Integer() turns True/False into 1/0. So when you sum, you ignore the zeros. You add 1 as you need to count the initial dose itself You group by Patient and initial dose. 1 Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now