Jump to content

how to create a calculated column from another data table using script


Ahmed Ali Shah

Recommended Posts

I have 2 Data Tables.

Table 1: Table 2:

Days StartTime EndTime value

9/2/2020 00:00:00 9/2/2020 00:01:00 9/2/2020 00:03:00 123

9/2/2020 00:01:00 9/2/2020 00:04:00 9/2/2020 00:09:00 456

9/2/2020 00:02:00 9/2/2020 02:01:00 9/2/2020 03:03:00 789

9/2/2020 00:03:00 9/2/2020 11:33:00 9/2/2020 23:59:00 987

9/2/2020 00:04:00

9/2/2020 00:05:00

.

.

.

9/2/2020 23:59:00

 

I want to create another column 'Val1' in Table 1 based on condition"StartTime = EndTime"and if condition is satisfied then fill with Column "Values".

 

Expected Answer is

Table1:

Days Val1

9/2/2020 00:00:00 123

9/2/2020 00:01:00 123

9/2/2020 00:02:00 123

9/2/2020 00:03:00 123

9/2/2020 00:04:00 456

9/2/2020 00:05:00 456

.

.

.

9/2/2020 23:59:00 987

Link to comment
Share on other sites

Be careful with your condition: "StartTime <= Days >= EndTime" should be changed to "StartTime < Days >= EndTime" or to "StartTime <= Days > EndTime". Furthermore you should not get any match for your first data 9/2/2020 00:00:00 because it doesn't fulfill the condition.

 

Do you need an entire script Or is it sufficient if I explain the approach

Link to comment
Share on other sites

Have you worked with R data functions so far If yes then you can use the following approach:

 

Inputs: your two tables

 

Output: a new table

 

Here's an example that you can run with RStudio

 

#input: table1
df1 <- data.frame (ColA = seq(10))

#input: table2
df2 <- data.frame (Start = c(1,5,9), End = c(4,8,10), Val = c(11,13,15))

#output: table3. Initialized with column from table one and dummy entry 1 
df3 <- data.frame (ColA = df1$ColA, Val = c(1))

#lenght of table1
cnt <- nrow(df1)

library(dplyr)

#loop thru all values of table1 and use them to filter table2. store output to table3
for (i in seq(cnt)){
 #get value of 1st column
 x <- df1[i,1]
 
 #filter table2 to this value with your condition
 temp <- df2 %>% filter(x >= Start & x <= End)
 
 #make sure that condition returns some value. Else return NA
 if (nrow(temp) > 0){
   Val <- temp %>% select(Val)
 } else {
   Val <- NA
 }
 
 #store values to table3
 df3[i,1] <- x
 df3[i,2] <- Val
}
# end of script

Link to comment
Share on other sites

Is there any R script for this Or any other way also works

Date table 1

Col1

 

11/14/2018 5:00:00 AM

 

11/14/2018 5:01:00 AM

 

11/14/2018 5:02:00 AM

 

11/14/2018 5:03:00 AM

 

11/14/2018 5:04:00 AM

 

11/14/2018 5:05:00 AM

 

11/14/2018 5:06:00 AM

 

11/14/2018 5:07:00 AM

 

11/14/2018 5:08:00 AM

 

11/14/2018 5:09:00 AM

 

11/14/2018 5:10:00 AM

 

11/14/2018 5:11:00 AM

 

11/14/2018 5:12:00 AM

 

11/14/2018 5:13:00 AM

 

11/14/2018 5:14:00 AM

 

11/14/2018 5:15:00 AM

 

11/14/2018 5:16:00 AM

 

11/14/2018 5:17:00 AM

 

11/14/2018 5:18:00 AM

 

11/14/2018 5:19:00 AM

 

11/14/2018 5:20:00 AM

 

11/14/2018 5:21:00 AM

 

11/14/2018 5:22:00 AM

 

11/14/2018 5:23:00 AM

 

11/14/2018 5:24:00 AM

 

11/14/2018 5:25:00 AM

 

11/14/2018 5:26:00 AM

 

11/14/2018 5:27:00 AM

 

11/14/2018 5:28:00 AM

 

11/14/2018 5:29:00 AM

 

11/14/2018 5:30:00 AM

 

.

 

.

 

.

 

11/15/2018 4:59:00 AMDataTable2

 

StartTime                               EndTime                       Value

 

11/14/2018 5:00:00 AM 11/14/2018 5:00:20 AM 3218.248

 

11/14/2018 5:00:22 AM 11/14/2018 5:39:30 AM 3257.966

 

11/14/2018 5:39:32 AM 11/14/2018 5:39:58 AM 3257.973

 

11/14/2018 5:40:00 AM 11/14/2018 5:49:28 AM 3257.973

 

11/14/2018 5:49:30 AM 11/14/2018 5:50:28 AM 3257.973

 

11/14/2018 5:50:30 AM 11/14/2018 5:52:58 AM 3193.992

 

11/14/2018 5:53:00 AM 11/14/2018 5:53:36 AM 3198.542

 

11/14/2018 5:53:38 AM 11/14/2018 6:02:52 AM 3198.558

 

11/14/2018 6:02:54 AM 11/14/2018 6:04:58 AM 3101.826

 

11/14/2018 6:05:00 AM 11/14/2018 6:10:22 AM 3104.928

 

11/14/2018 6:10:24 AM 11/14/2018 6:12:24 AM 3009.105

 

11/14/2018 6:12:26 AM 11/14/2018 6:12:54 AM 3011.392

 

11/14/2018 6:12:56 AM 11/14/2018 6:17:52 AM 3011.878

 

11/14/2018 6:17:54 AM 11/14/2018 6:21:36 AM 2915.363

 

11/14/2018 6:21:38 AM 11/14/2018 6:22:54 AM 2925.756

 

11/14/2018 6:22:56 AM 11/14/2018 6:24:20 AM 2915.662

 

11/14/2018 6:24:34 AM 11/14/2018 6:27:32 AM 2917.891

 

.

 

.

 

.Expected Output:Hour Min    Value

 

5        0    3260.347

 

5       1     3260.347

 

5       2    3260.347

 

5       3   3260.347

 

5       4   3260.347

 

5       5   3260.347

 

5       6   3260.347

 

5       7   3260.347

 

5       8   3260.347

 

5       9    3260.347

 

5      10   3260.347

 

5       11     3260.347

 

5       12    3260.347

 

5       13 3260.347

 

5           14 3260.347

 

5           15 3260.347

 

5      16 3260.347

 

5      17 3260.347 

Link to comment
Share on other sites

When you work with dates you might have to use the POSIXct format for all date columns to get the comparison work correctly. 

 

You can set the format like this: data.frame$column <- as.POSIXct(data.frame$column, format = ...)

 

Example: as.POSIXct("11/14/2018 6:10:24 AM", format = "%m/%d/%Y %I:%M:%S %p")

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...