
Discrepancy between TERR and R output


Sam Waterworth


I have come across a baffling problem. I wrote and tested a script in R which (in part) computes a statistic for a model fit of data. This statistic is then added to a row of mixed data formats and appended to the end of a dataframe. When a statistic cannot be computed, a NULL value is assigned.

There are 175 NULL values in the dataset (manually confirmed by inspecting data graphs and running analyses for each individually). This runs just fine in R (v 4.0.3). However, when I upload the exact same script as a TERR data function in Spotfire, a total of 437 NAs are introduced by coercion. There doesn't seem to be any rhyme or reason as to which values are getting axed (please see screenshots below of direct comparison).

Snippet of the pertinent code (ellipses indicate where additional code has been omitted):

...
for (i in group_list) {
  ...
  rss = "NULL"
  skip_to_next <- FALSE
  tryCatch({
    # Calculate initial values for the model fit to start at
    i1 = getInitial(pred_coords$y ~ SSfpl(pred_coords$x, A, B, xmid, scal), data = pred_coords)
    # Run the model fit, then extract, square, and sum the residuals
    rss = sum(residuals(nls(pred_coords$y ~ B + ((A - B) / (1 + exp((pred_coords$x - xmid) / scal))), data = pred_coords, start = i1))^2)
  }, error = function(e) {
    skip_to_next <<- TRUE
  })
  # Append new row with RSS
  Tm_df[nrow(Tm_df) + 1, ] = list(unique(sub_df$Well), unique(sub_df$Name), as.numeric(rss))
  if (skip_to_next) { next }
  ...
}
...

The rss value, when calculated, is numeric, and I added the as.numeric call in an effort to bypass this weird problem (to no avail, obviously). When I run this in R, the final table, Tm_df, has RSS listed as numeric.

Here are de-identified screenshots of the output table (Tm_df) as run in R (v 4.3.0) and TERR (v R_64_6.0.0.69). You will note that several of the values in R are blank in the Spotfire output. Can someone please explain the cause of this discrepancy and what I can do to fix it?

Spotfire output:

[Screenshot: Tm_df as output by TERR in Spotfire]

R output (of exactly the same wells):

[Screenshot: Tm_df as output by R]


I have tried to run your script both in R and TERR. Both generate some warnings, but TERR generates more warnings than R, as it seems to have more than one source of error.

It looks like the TERR implementation of some stats functions is different.

For instance, smooth.spline gives slightly different results, but that looks rather benign. In fact, the pred_coords look virtually identical in R and TERR.

What seems to differ is the calculation of

i1 = getInitial(pred_coords$y ~ SSfpl(pred_coords$x, A, B, xmid, scal), data = pred_coords)

This is what throws the errors, which are also the source of the warnings you get when you assign to Tm_df. Your code, however, catches the errors without reporting them, so you never see them.

In R, the error (e.g. for Plate3_I19) is:

Error in nls(y ~ cbind(1, 1/(1 + exp((xmid - x)/exp(lscal)))), data = xy, :   number of iterations exceeded maximum of 50

  

whereas in TERR the error for the same Plate is:

Error in nlsfit.plinear(start, RHS, response, mEnv, weights = weights : number of iterations exceeded 'maxiter'

These could perhaps be easily sorted by specifying more iterations.

However, TERR throws a different error for Plate3_A3:

Error in nlsfit.plinear(start, RHS, response, mEnv, weights = weights : step factor reduced below 'minFactor'

I don't know what causes this. There are multiple opinions on Stack Overflow, but I did not manage to find one that fits (pardon the pun).

My tentative suggestion would be to add defensive code, and if you want to know more about the TERR implementation, to open a support case.
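As a rough sketch of what such defensive code could look like: the helper name fit_rss and the shape of its return value are hypothetical, and pred_coords is assumed to have numeric columns x and y as in the question. It returns NA_real_ rather than the string "NULL" (calling as.numeric("NULL") is what emits the "NAs introduced by coercion" warning), and it keeps the error message instead of discarding it, so the failure for each well can be inspected afterwards.

```r
# Hypothetical helper (not from the original script): fit one group and
# return the RSS together with the error message, if any.
fit_rss <- function(pred_coords) {
  tryCatch({
    # Starting values from the self-starting four-parameter logistic model
    i1 <- getInitial(y ~ SSfpl(x, A, B, xmid, scal), data = pred_coords)
    # Same model formula as in the question
    fit <- nls(y ~ B + (A - B) / (1 + exp((x - xmid) / scal)),
               data = pred_coords, start = i1)
    list(rss = sum(residuals(fit)^2), error = NA_character_)
  }, error = function(e) {
    # Keep the message instead of silently dropping it
    list(rss = NA_real_, error = conditionMessage(e))
  })
}

# Three points are too few to fit four parameters, so this takes the error branch
res <- fit_rss(data.frame(x = c(1, 2, 3), y = c(0.1, 0.5, 0.9)))
```

Inside the loop, res$rss and res$error could then be written to separate columns of the output table (which would need an extra column for the message), giving a per-well list of which error occurred in R and in TERR.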


Hi Gaia

Thanks for getting back to me and having a good go at this issue. What I don't understand is how TERR is generating errors (and therefore resulting in the NULL RSS values) when R analyzes the data without issue. How do I run TERR interactively? I'd like to get a list of what wells are throwing what errors in TERR and to see if I can identify a common theme in the "additional" errors that TERR finds.


R generates errors too; see the case of Plate3_I19. It just generates errors of a different kind, and not as many in your case.

Some stats functions are implemented differently in TERR. I have noticed that they sometimes have different input parameters.

The overall idea, I believe, was to make them more robust (sometimes R functions cut a few corners).

To debug TERR code you need to do a few things:

1) add a line at the start of your R script:

save.image('C:/<your folder>/Debug.RData')

The path must point to an existing folder where you want the RData file written; the file name itself does not matter.

Then run your R script again in Spotfire to generate the file.

2) open RStudio pointing to the TERR interpreter. You can check which version of R/TERR RStudio is using by typing the 'version' command. (This also works for testing the code in open-source R.)

The easy way to do this is from Spotfire: go to (top menu) Tools > TERR Tools and click on Launch RStudio IDE.

This works for me, but not for everyone.

In the latter case, you can still open RStudio and point it manually to TERR:

- still in Spotfire, Tools > TERR Tools > Copy TERR engine path to clipboard. This will tell you where TERR is.

In my case, it is:

C:\Users\blahblah\TIBCO Enterprise Runtime for R_64_6.0.1.13\engine

- in RStudio: go to Tools > Global Options > General. You will see an R Version. Click 'Change', then 'Browse', then paste the engine path above into the file explorer.

Click on 'select folder' then 'OK'. You may need to close and re-open RStudio.

3) Note that there might be issues with compatibility between TERR and newer RStudio versions. I have RStudio 1.3.1093. 

You might need to install an older version if that is a problem for you.

4) copy and paste your R code into RStudio. 

5) change 'save.image' to 'load'. This enables you to load exactly the inputs that Spotfire is sending.

6) run your code. It helps to add the line: options(warn=1)

on top, to see the warnings as they are generated.
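The save/load round trip from steps 1, 5 and 6 can be sketched as follows. The file path and the variable input_table are placeholders, not part of the original script; a temporary folder is used here only so that the snippet runs anywhere.

```r
options(warn = 1)  # print each warning as it occurs instead of at the end

# Placeholder path; in Spotfire this would be an existing folder of your choice
debug_file <- file.path(tempdir(), "Debug.RData")

# Stand-in for an input that Spotfire would pass to the data function
input_table <- data.frame(Well = c("A1", "A2"), y = c(0.2, 0.8))

# Step 1 (inside Spotfire): snapshot the workspace, including the inputs
save.image(debug_file)

# Step 5 (inside RStudio): replace the save.image line with load to
# restore the same objects before running the rest of the script
load(debug_file)
```

Because the snapshot is taken at the start of the script, it contains the inputs exactly as Spotfire passed them, which should make it easier to tell differences in the data from differences in the interpreter.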

