Dominik, posted September 4:
I have a dataset from two production batches, for which I ran a linear regression in Python and exported the fitted values to a data table. For the regression model I fixed the intercept at 0, so only the slope coefficient was fitted. Now I want to show a scatter plot with the fitted lines in a trellis visualization (one trellis panel per production batch). However, when I choose the option "curve from data table" to draw the line, I get both lines in each trellis panel, where only one fitted curve is the correct one (see image "Plot"). Is there a way to designate which curve shows up in which trellis panel?
The other potential solution is simply choosing a straight line fit from the "Lines and Curves" menu. However, I found no way there to force the intercept to be 0. Are there any pointers or solutions to this problem?
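For readers following along: a least-squares line forced through the origin has the closed-form slope sum(x*y) / sum(x*x). The sketch below shows that calculation in plain Python. It is not the script used in the question; the function name `fit_through_origin` and the sample numbers are made up for illustration.

```python
def fit_through_origin(x, y):
    """Return the least-squares slope b for the model y = b * x (intercept fixed at 0)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("cannot fit a slope when all x values are 0")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx


# Illustrative data only: y is exactly 2 * x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
slope = fit_through_origin(x, y)

# Fitted values that could be exported alongside the raw points.
fitted = [slope * xi for xi in x]
```

Running the fit once per batch gives one slope per batch, which is the quantity the rest of the thread refers to as the coefficient.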
David Boot-Olazabal, posted September 4:
Hi Dominik,
I am not sure the trellis will work the way you would like. Trellising basically 'breaks' the visualization into mini visualizations, based on your trellis variable. But each of these mini visualizations includes all configurations from your original visualization (the color would be different of course, if it is set on batch as in your example). To better answer your last question, could you share an example dxp file?
Kind regards,
David
Dominik (author), posted September 4:
Hi David,
I attached the dxp file. Thanks for your help.
Attachment: reduced_spotfire_community.dxp
Rae Chen, posted September 5:
Hi Dominik,
As my colleague David said, trellising applies every configuration to every panel. As for the straight line fit, it generates new linear regressions for your data. You may achieve your goal by doing this:
1. Add two calculated columns, one per batch, each holding that batch's coefficient, and use an IF condition to set the value to NULL for rows of the other batch. (Since I don't have the numbers, I assumed 8.6 for Batch A and 5.75 for Batch B.) Your data will then look like this: [image]
2. Go back to the canvas and add a curve from data table, setting the expression to [cof_batch_a]*[x]. You'll need to add another curve for Batch B: [cof_batch_b]*[x]. Make sure to check the boxes under "One per" so you can see different colours.
The final diagram will look like this: [image]
The lines will show your fitted slopes (still with an intercept of 0) once you update the coefficients with the correct values.
Attachment: reduced_spotfire_community_RaeEdit.dxp
Dominik (author), posted September 5:
Hi Rae,
Thanks for the suggestion. For a static file that would be a solid solution. What I forgot to mention in the original request is that the file will be used to load any batches the user wants, which could also be more than two. So, to make your solution work, I guess I would need to write an IronPython script that dynamically adds these new columns and draws the curves based on the selected batches. If there is any way to make the "straight line fit" go through 0, I think that would be the easier option.
Rae Chen, posted September 5 (marked as solution):
Hi Dominik,
If your data function can store the coefficients for all batches in a single column (e.g., [_coef]), you don't need the IF function and can simply add a curve from data table using the expression [_coef]*[x]. You'll only need to do it once, no matter how many batches there are. I'm afraid it's not possible to set the intercept of the straight line fit to 0.
Tested on 3 batches: [image]
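One way a Python data function could produce such a single coefficient column is sketched below with pandas. This is only an illustration under assumptions: the column names `batch`, `x` and `y` and the helper `add_coef_column` are hypothetical (only `_coef` and `x` appear in the thread), and the sample values are invented. It works for any number of batches because the per-batch sums are broadcast back to the rows.

```python
import pandas as pd


def add_coef_column(df, batch_col="batch", x_col="x", y_col="y", coef_col="_coef"):
    """Return a copy of df with each batch's zero-intercept slope repeated on its rows."""
    out = df.copy()
    xy = out[x_col] * out[y_col]
    xx = out[x_col] ** 2
    # Per-batch sums, broadcast back to every row of that batch.
    sxy = xy.groupby(out[batch_col]).transform("sum")
    sxx = xx.groupby(out[batch_col]).transform("sum")
    # Slope of y = b * x for each batch: sum(x*y) / sum(x*x).
    out[coef_col] = sxy / sxx
    return out


# Invented sample data with three batches.
data = pd.DataFrame({
    "batch": ["A", "A", "B", "B", "C", "C"],
    "x": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
    "y": [2.0, 4.0, 3.0, 6.0, 0.5, 1.0],
})
result = add_coef_column(data)
```

Because the coefficient sits on the same rows as the points, a trellis by batch splits the curve expression the same way it splits the markers.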
Olivier Keugue Tadaa, posted Friday at 04:13 PM (edited Friday at 04:16 PM):
Hi Dominik,
I looked at your data set, and the following two columns are missing: [fit inter] and [fit coeff]. Could you provide a dataset that includes them? It should not draw two lines per trellis panel if the values differ across the batch_red groups. I suspect that the [fit inter] and [fit coeff] values overlap between the batch_red groups. Here is what it should look like: [image]
Dominik (author), posted Tuesday at 07:36 AM:
Thanks Rae and Chen for your input. I found out why all fitted lines appeared in each trellis panel in the beginning. I had a separate data table with the coefficients from the fit that were assigned to each batch. So, when I plotted the points from one data table and split them into trellis panels by batch, this split did not carry over to the curves, which came from a different data table. I added the column with the coefficients to the original data table, and now I have one line per trellis panel. Thanks a lot for your help!
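The fix described above, bringing the coefficients into the same table as the points, can be sketched as a join on the batch column. The pandas example below is an assumption about how this might be done inside a Python data function; in Spotfire itself the same result can be reached by adding the column to the original table, as the post says. The table and column names are hypothetical, and the slopes 8.6 and 5.75 are the placeholder values assumed earlier in the thread.

```python
import pandas as pd

# Hypothetical measurement table (the one the scatter plot is built on).
points = pd.DataFrame({
    "batch": ["A", "A", "B", "B"],
    "x": [1.0, 2.0, 1.0, 2.0],
    "y": [8.5, 17.3, 5.8, 11.4],
})

# Hypothetical separate table holding one fitted slope per batch.
coefs = pd.DataFrame({"batch": ["A", "B"], "_coef": [8.6, 5.75]})

# Left join keeps every measurement row; validate raises an error if a
# batch has more than one coefficient row.
merged = points.merge(coefs, on="batch", how="left", validate="many_to_one")

# Fitted value per row, matching the curve expression [_coef]*[x].
merged["fit"] = merged["_coef"] * merged["x"]
```

A batch with no entry in the coefficient table ends up with a missing `_coef`, which makes a forgotten fit easy to spot.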