Funny Cat Posted August 23, 2023

Hi,

I've got a source dataset containing the columns "Parameter" and "Value". The value ranges of the parameters vary widely (approx. -1E9 ... +1E9 for the largest vs. -0.25 ... +0.25 for the smallest range).

I'm creating three visualizations:
- a cross table with the parameters on the vertical axis and some statistical values in the cells
- a box plot in which the data is limited to the marking of the cross table
- a histogram in which the data is limited to the marking of the cross table

In the histogram I'd like to compare two subsets:
- data in the marking of the box plot
- data not in the marking of the box plot (but only data marked in the cross table)

The problem is that the histogram x-axis range evidently covers the entire range of the Value column of the source data. Even if I set a filter to the single selected parameter and set "Evaluate axis expression on: Current filtering only", the range does not change.

Of course I can set the range I need manually or programmatically. The problem is that for the small ranges there are not enough histogram bins (max is 1000000000), so I only get one large bar in the region of interest.

This behavior only occurs when I use a subset other than "Current filtering". When I set the subset to "Current filtering", the proper range is shown.

Help is highly appreciated.
Thank you!

Julieta Diaz Posted August 24, 2023

Hi,

Could you please include a sample dxp file that illustrates your issue?

Best regards,
Julieta

Funny Cat Posted August 24, 2023 (Author)

Hi Julieta,

here's a sample file. The range of Parameter40 is huge.

1. In the parameter table, select a parameter (one that is not red), let's say Parameter91. In the table you see that the minimum is 3.66 and the maximum is 4.07.
2. The box plot is updated. Its range is adjusted to fit the parameter range (plus the red limit lines).
3. Mark some boxes in the box plot.
4. The histogram is updated. However, the x-axis range is still huge. If you set the category axis range to Min 3.4 / Max 4.4, you get 2 histogram bars in the range of interest. Unfortunately, more bars are not possible because the maximum number of bins is already reached.
5. Set the filter on Parameter to Parameter91 --> the behavior does not change.
6. Remove the filter.
7. Set the category axis range to automatic again.
8. Change the subset to "Current filtering" only.
9. The histogram range is automatically adjusted to 3.4 ... 4.4 and one can set a sensible number of bins (e.g. 10) --> that's how it should also behave with a custom subset.

I guess there's a bug in how the marking subset determines its data range. Instead of using the data range from the currently applied filter or marking, it uses the data range of the entire source dataset.

Thank you very much!

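
The bin arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not how Spotfire computes its bins: it assumes equal-width bins, takes the full axis range to be exactly -1E9 ... +1E9 (the thread only says "approx."), and uses the bin limit of 1000000000 quoted above. The function names are made up for the example.

```python
import math


def bin_width(axis_min, axis_max, n_bins):
    """Width of one bin when n_bins equal-width bins span the axis range."""
    return (axis_max - axis_min) / n_bins


def bins_touching(axis_min, axis_max, n_bins, lo, hi):
    """Count the bins that overlap the window [lo, hi] on the axis."""
    width = bin_width(axis_min, axis_max, n_bins)
    first = max(math.floor((lo - axis_min) / width), 0)
    # Clamp so a window ending exactly on the axis maximum stays in the last bin.
    last = min(math.floor((hi - axis_min) / width), n_bins - 1)
    return last - first + 1


# Assumed full-data axis range with the maximum number of bins.
full = bins_touching(-1e9, 1e9, 1_000_000_000, 3.4, 4.4)

# Axis range fitted to Parameter91 with a modest number of bins.
fitted = bins_touching(3.4, 4.4, 10, 3.4, 4.4)

print(full, fitted)
```

Under these assumptions each bin is 2 units wide, so the window 3.4 ... 4.4 is covered by only 2 bins, which would be consistent with the two bars described above. With the axis fitted to the parameter, the same window gets all 10 bins.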
Julieta Diaz Posted August 30, 2023

Hi,

Have you tried normalizing the data? For example, scaling the data from its original values to a range between 0 and 1. You can do that by adding a transformation in the Data Canvas before unpivoting your data table (see the screenshot attached).

Hope this helps.
Regards

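
As a rough illustration of what scaling to a range between 0 and 1 means, here is a minimal min-max sketch in Python. It is not the Data Canvas transformation itself; the function name and the sample values are assumptions made for the example.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values for a single parameter.
scaled = min_max_scale([3.66, 3.8, 4.07])
print(scaled)
```

Applied per parameter, this puts every parameter on the same 0 ... 1 axis, at the cost that the scaled values no longer show the original units.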
Funny Cat Posted August 30, 2023 (Author)

Hi Julieta,

no, I haven't. But this is not an option. The users who work with the template have to be able to see the parametric values and limits, not some normalized data.

I'd rather clean the dataset of these heavy outliers. But I'd prefer to keep the dataset untouched, if there's a way.

Gaia Paolini Posted August 31, 2023

I think your problem in visualization terms are the strong outliers. However, the Marking subsets seem to ignore filtering. So I cannot filter them out and use the marking to subset.

I would try a different approach than subsets. I experimented with this; maybe it helps.

0. Remove the binning of [Value]. This seems to create problems and was set to a really high number of bins when I opened the file.

1. Create a column [is_Outlier] that tells you whether the Value is an outlier or not, for each Parameter:

    SN(([Value]>Avg(UIF([Value])) over ([Parameter])) or ([Value]<Avg(LIF([Value])) over ([Parameter])),True)

   where UIF(..) and LIF(..) are equivalent to the box plot limits, and SN(..,True) sets any empty values (there is one for Parameter40) to True. So firstly you can filter on it using a filter in a Text Area.

2. Create the subsets 'programmatically' by having a multiple-select List box where you can select the Numbers you want to assign to the selected group (the ones you would have originally marked in red). The associated document property group1 is declared as an Integer (for some reason, even though it will contain a comma-separated list of integer values) and is filled with the selected values of the [Number] column.

3. Create a second calculated column [selected_Group] that tells you whether a row belongs to group1 or not (so it is a boolean):

    Find(String([Number]),'$map("${group1}", " ")')>0

   or a nicer, alternative version as a string that gives you the group members explicitly:

    case when Find(String([Number]),'$map("${group1}", " ")')>0 THEN '$map("${group1}", ",")' else 'Rest' end

   So now rows can be grouped by the values of [selected_Group].

4. Remove the previous bar chart and create one that is limited by the blue marking and is trellised by [selected_Group]. This will also react to the filtering in/out of the outliers.

Now the box plot is not really doing anything, but it is there to guide the selection of the groups.

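
For readers who want to see the idea behind [is_Outlier] outside Spotfire, here is a small Python sketch. It assumes the box plot limits are the usual inner fences (Q1 - 1.5*IQR and Q3 + 1.5*IQR) and uses the quartile method of Python's `statistics.quantiles`, which may not match Spotfire's quartile calculation exactly. The function name and sample rows are invented for the example.

```python
import statistics
from collections import defaultdict


def flag_outliers(rows):
    """Return one boolean per (parameter, value) row: True if the value lies
    outside the inner fences of its parameter, or if the value is empty."""
    by_param = defaultdict(list)
    for param, value in rows:
        if value is not None:
            by_param[param].append(value)

    fences = {}
    for param, values in by_param.items():
        if len(values) < 2:
            # quantiles() needs at least two points; a single value is its own fence.
            q1 = q3 = values[0]
        else:
            q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        fences[param] = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    flags = []
    for param, value in rows:
        if value is None:
            flags.append(True)  # mirrors SN(.., True): empty values count as outliers
        else:
            lower, upper = fences[param]
            flags.append(value < lower or value > upper)
    return flags


# Hypothetical rows: one parameter with a single extreme value, one empty value.
rows = [
    ("Parameter91", 3.66), ("Parameter91", 3.8), ("Parameter91", 3.9),
    ("Parameter91", 4.0), ("Parameter91", 4.07), ("Parameter91", 1e9),
    ("Parameter40", None),
]
flags = flag_outliers(rows)
print(flags)
```

The fences are computed per parameter, which corresponds to the `over ([Parameter])` part of the expression, so a parameter with a small range is not judged against the range of a large one.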
Funny Cat Posted August 31, 2023 (Author)

Hi Gaia,

thanks for the input.

"I think your problem in visualization terms are the strong outliers."
Indeed, that's what I figured.

"However, the Marking subsets seem to ignore filtering. So I cannot filter them out and use the marking to subset."
Wouldn't you call that a bug? In my opinion, the data should be limited to what is set in the Data section of the chart settings. Marking and filtering should always be the basis for which data appears in the subsets. Do you agree?

In case I have to filter the outliers, I'm thinking about an easy solution:
1. Calculate the outlier column.
2. Make another dataset as a linked copy of my previous dataset.
3. Filter the outliers with a "Filter rows" transformation.
4. Use the new dataset as input.
5. Create the Marking/Not in Marking subsets.

--> works just fine

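
The effect of these steps can be sketched in plain Python. This is not Spotfire code: the function name, the boolean input lists, and the equal-width binning are assumptions for illustration. The point is that both subsets share bin edges computed from the outlier-free data only.

```python
def histogram_by_subset(values, is_outlier, is_marked, n_bins=10):
    """Drop outlier rows, then count marked and unmarked values per bin,
    using bin edges derived from the remaining data only."""
    kept = [(v, m) for v, o, m in zip(values, is_outlier, is_marked) if not o]
    if not kept:
        return [], [], []

    lo = min(v for v, _ in kept)
    hi = max(v for v, _ in kept)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal

    marked = [0] * n_bins
    unmarked = [0] * n_bins
    for v, m in kept:
        # The maximum value would fall on the upper edge; keep it in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        if m:
            marked[idx] += 1
        else:
            unmarked[idx] += 1

    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, marked, unmarked


# Hypothetical data: the 1e9 row is flagged as an outlier and removed first.
values = [3.66, 3.8, 3.9, 4.0, 4.07, 1e9]
is_outlier = [False, False, False, False, False, True]
is_marked = [True, True, False, False, False, False]

edges, marked, unmarked = histogram_by_subset(values, is_outlier, is_marked, n_bins=4)
print(edges, marked, unmarked)
```

Because the outlier row never reaches the binning step, the axis range follows the remaining values, and a small number of bins is enough to resolve both subsets.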