Jump to content

Find value in column when specific criteria are met


Bernard Clark

Recommended Posts

My database is organized with columns [Project], [DG] (Decision Gate/Formal Meeting Level), [section] (class of variable), [Variable] (tracked data), [base] (value of tracked data), [Date] (value of tracked decision gate meetings).[Date] and [Value] are seperate columns to deal with excel/spotfire formatting issues between dates and real numbers. I've attached a sample dataset.

I have calculated a [Data Accuracy] column to track how accurate the assessments were at each stage (DG), and I am looking for a way to plot that accuracy over time with respect to multiple projects. The simplest way that I can see is to create a new calculated column [DG Date] that will distribute the dates of the decision gates across all variables.

This is how I assume th sampel code would work, but I can't get the over fuctions to find specific variables. What I'd like to do, effectively, is have a vLookUp or Index search for three variables ([DG] = "Actual", [Variable] = "DG1 Date", [Project] = current project)and then retun the value in the [Date] column

 

New column [DG Date]

CASE

WHEN [DG] = "DG1" THEN return value of[Date] where [DG] = "Actual" and [Variable] = "DG1 Date" OVER [Project]

WHEN [DG] = "DG2" THEN return value of[Date] where [DG] = "Actual" and [Variable] = "DG2Date" OVER [Project]

...etc...

ELSE NULL END

 

 

I can't figure out how to use the OVER functions (inresect, etc) to return only for specific values of the columns. I've been able to jerry-rig nexted sum/if/over statements for real values, but I can't get that to work with date values.

Thanks.

 

 

 

Edit 2:

I am able to do the following, which returns what I need. When I try to combint it into one calculated column instead of 4, it returns null values. How can I combine this to decrease the number of extraneous columns in the project

[test1] =Max(If(([DG]="Actual") and ([Variable]="DG1 Date"),[Date])) over ([Project])

[test2] =Max(If(([DG]="Actual") and ([Variable]="DG2Date"),[Date])) over ([Project])

[test3] =Max(If(([DG]="Actual") and ([Variable]="FIDDate"),[Date])) over ([Project])

[test4] =case

WHEN [DG]="DG1" then Max(If(([DG]="Actual") and ([Variable]="DG1 Date"),[Date])) over ([Well Name])

when [DG]="DG2" then [test2]

when [DG]="FID" then [test3]

else null END

Link to comment
Share on other sites

Sorry, I don't fully understand what you want to do. Maybe because you changed your column names in the description and the example... Can you please fix that

 

The OVER statement is always doing an aggregation. You can see it as a grouping. And a aggregation can only return one value for this group. So you need something like Min, Max, Average, First, Last,....

 

So let's say you do something like Max([Value]). This would only return the maximum value for the entire Value column. But if you do Max([Value]) OVER [Project] it will return the max val for every project. And Max([Value]) OVER ([Project],[Date]) will return the max val for every unique project and day combination.

 

Sometime, in particular as a beginner, it might be easier to first do the grouping and then continue with the CASE statement. It's just more readable. You could for example use a column that just concetenates all columns that you need for your grouping (make sure you use strings only). Or use a CASE statement when you only have a few groups. Then you can use this column in your calculated column with the CASE statement.

 

At the end you most likely will have to include a IF statement within your CASE statement. But this is just a guess. If you could share some rows of your data table or some dummy data set you can get further help.

Link to comment
Share on other sites

Okay, 1st thing you should do is to get your data correct. Because you are mixing the DG and Variable column. You have a expected date and an actual date for each gate. So please get it to this format and then I'm sure you'll be able to finish your task:

 

actual.png

 

You could pivot this kind of data so that you get two columns with expected and actual date and then merge it to your other data set. Where your identifier would be the project and the ID. (I assume/hope you have at least two data sources, since the data in the excel sheet is kind of a mess ;). PS I don't understand what the 'start date' is good for... )

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...