Joel Dror 2 Posted November 25

Hi,

I created the following data function script (with the help of some AI chat). It's supposed to take grouped sets containing both simulation data (IsSim=1) and measured data (IsSim=0). It works well in a pure Python environment on example data I created, but not within Spotfire.

In the Python environment the code started like this to simulate the typical case (see the full code further down):

    FreqSim = pd.Series([1,2,3,4,5,6,7,8,9,10]*3)
    GainSim = pd.Series([10,20,30,40,50,60,70,80,90,100,
                         -10,-20,-30,-40,-50,-60,-70,-80,-90,-100,
                         210,220,230,240,250,260,270,280,290,300])
    IsSimSim = pd.Series([True]*len(FreqSim))
    ColumnForGroupSim = pd.Series(['A']*10 + ['B']*10 + ['C']*10)

    FreqMeas = pd.Series([1.1,2.2,3.3,4.9,
                          1.2,2.3,3.5,4.7,4.9,
                          1.3,3.6])
    IsSimMeas = pd.Series([False]*len(FreqMeas))
    ColumnForGroupMeas = pd.Series(['A','A','A','A',
                                    'B','B','B','B','B',
                                    'C','C'])

    data = pd.DataFrame({
        'Freq': pd.concat([FreqSim, FreqMeas]).reset_index(drop=True),
        'Gain': pd.concat([GainSim, pd.Series([np.nan] * len(FreqMeas))]).reset_index(drop=True),
        'IsSim': pd.concat([IsSimSim, IsSimMeas]).reset_index(drop=True),
        'ColumnForGroup': pd.concat([ColumnForGroupSim, ColumnForGroupMeas]).reset_index(drop=True)
    })

The resulting interpolated series was all OK, as expected:

        Freq   Gain  IsSim ColumnForGroup  InterpGainFromSim
    0    1.0   10.0   True              A                NaN
    1    2.0   20.0   True              A                NaN
    2    3.0   30.0   True              A                NaN
    3    4.0   40.0   True              A                NaN
    4    5.0   50.0   True              A                NaN
    5    6.0   60.0   True              A                NaN
    6    7.0   70.0   True              A                NaN
    7    8.0   80.0   True              A                NaN
    8    9.0   90.0   True              A                NaN
    9   10.0  100.0   True              A                NaN
    10   1.0  -10.0   True              B                NaN
    11   2.0  -20.0   True              B                NaN
    12   3.0  -30.0   True              B                NaN
    13   4.0  -40.0   True              B                NaN
    14   5.0  -50.0   True              B                NaN
    15   6.0  -60.0   True              B                NaN
    16   7.0  -70.0   True              B                NaN
    17   8.0  -80.0   True              B                NaN
    18   9.0  -90.0   True              B                NaN
    19  10.0 -100.0   True              B                NaN
    20   1.0  210.0   True              C                NaN
    21   2.0  220.0   True              C                NaN
    22   3.0  230.0   True              C                NaN
    23   4.0  240.0   True              C                NaN
    24   5.0  250.0   True              C                NaN
    25   6.0  260.0   True              C                NaN
    26   7.0  270.0   True              C                NaN
    27   8.0  280.0   True              C                NaN
    28   9.0  290.0   True              C                NaN
    29  10.0  300.0   True              C                NaN
    30   1.1    NaN  False              A               11.0
    31   2.2    NaN  False              A               22.0
    32   3.3    NaN  False              A               33.0
    33   4.9    NaN  False              A               49.0
    34   1.2    NaN  False              B              -12.0
    35   2.3    NaN  False              B              -23.0
    36   3.5    NaN  False              B              -35.0
    37   4.7    NaN  False              B              -47.0
    38   4.9    NaN  False              B              -49.0
    39   1.3    NaN  False              C              213.0
    40   3.6    NaN  False              C              236.0

The full code in the Spotfire data function:

    import numpy as np
    import pandas as pd
    from scipy.interpolate import interp1d

    Freq.name = 'Freq'
    Gain.name = 'Gain'
    IsSim.name = 'IsSim'
    ColumnForGroup.name = 'ColumnForGroup'

    data = pd.DataFrame({
        'Freq': Freq.reset_index(drop=True),
        'Gain': Gain.reset_index(drop=True),
        'IsSim': IsSim.reset_index(drop=True),
        'ColumnForGroup': ColumnForGroup.reset_index(drop=True)
    })
    print(data[0:100])

    # Add a column to store the interpolated values
    data['InterpGainFromSim'] = np.nan

    # Function to perform interpolation for each group
    def interpolate_group(group):
        # Reset index to avoid any NA issues
        group = group.reset_index(drop=True)

        # Separate the known and unknown data
        known = group[group['IsSim']]
        unknown = group[~group['IsSim']]

        if not known.empty and not unknown.empty:
            # Create interpolation function
            interpolation_function = interp1d(known['Freq'], known['Gain'], kind='linear', fill_value='extrapolate')

            # Apply interpolation to the unknown data and store the result in the new column
            # Interpolate and update the new column in the group
            interpolated_values = interpolation_function(unknown['Freq'])
            group.loc[unknown.index, 'InterpGainFromSim'] = interpolated_values

        return group

    # Apply interpolation to each group
    # interpolated_data = data.groupby('ColumnForGroup').apply(interpolate_group)
    interpolated_data = data.groupby('ColumnForGroup', group_keys=False).apply(interpolate_group).reset_index(drop=True)

    # Extract the 'InterpGainFromSim' column as a Series
    interp_gain_from_sim_series = interpolated_data['InterpGainFromSim']

When the data function is run I get the following error:

    Could not execute function call 'CreateGainDeltaMeasSim' (2)
    Error executing Python script:
    KeyError: "None of [Index([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n dtype='Int32')] are in the [columns]"
    Traceback (most recent call last):
      File "data_function.py", line 364, in _execute_script
        exec(compiled_script, self.globals)
      File "<data_function>", line 45, in <module>
      File "groupby.py", line 1824, in apply
        result = self._python_apply_general(f, self._selected_obj)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "groupby.py", line 1885, in _python_apply_general
        values, mutated = self._grouper.apply_groupwise(f, data, self.axis)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "ops.py", line 919, in apply_groupwise
        res = f(group)
              ^^^^^^^^
      File "<data_function>", line 29, in interpolate_group
      File "frame.py", line 4108, in __getitem__
        indexer = self.columns._get_indexer_strict(key, "columns")[1]
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "base.py", line 6200, in _get_indexer_strict
        self._raise_if_missing(keyarr, indexer, axis_name)
      File "base.py", line 6249, in _raise_if_missing
        raise KeyError(f"None of [{key}] are in the [{axis_name}]")

I think all my data is intact, and I don't know where these ...0, 0,\n come from. Any ideas?

Thanks in advance,
Joel

Gaia Paolini Posted November 25

Can you share how you set up your input parameters in Spotfire?

Gaia Paolini Posted November 26 (Solution)

Your IsSim column is read as an "object" into Spotfire. Some data types need to be recast in Python data functions (notably dates; I did not know about booleans).

If I add the line below, which recasts the column to a bool, just before defining data, the code runs without error in Spotfire (using your simulated data):

    IsSim = IsSim.astype(bool)
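
For readers hitting the same KeyError, here is a minimal sketch of why the cast matters. It assumes the column arrives in pandas as a nullable integer type, which is what the dtype='Int32' in the traceback above suggests; the small DataFrame is invented for illustration and is not the poster's data. When the object passed to df[...] is not boolean, pandas treats its values as column labels rather than as a row mask, which is where the run of zeros in the error message comes from.

```python
import pandas as pd

# Hypothetical stand-in for a Spotfire input column that arrives as nullable Int32
is_sim = pd.Series([1, 1, 0, 0], dtype="Int32")
df = pd.DataFrame({"Freq": [1.0, 2.0, 1.1, 1.9], "IsSim": is_sim})

# With an integer column, df[...] looks the values up as column labels
try:
    df[df["IsSim"]]
    mask_failed = False
except KeyError:
    mask_failed = True

# After the cast, the same expression is a boolean row mask
df["IsSim"] = df["IsSim"].astype(bool)
known = df[df["IsSim"]]      # rows flagged as simulation
unknown = df[~df["IsSim"]]   # rows flagged as measurement
```

This sketch contains no missing values in IsSim. If the real column can contain empty cells, the cast may need extra handling first.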

Joel Dror 2 (Author) Posted November 26

Hi Gaia,

Thanks for your help, this line indeed eliminated the error. Now values appear in the output column, but the output column is not in sync with the original Freq and Gain columns.

I would appreciate any help.

Thanks,
Joel

Gaia Paolini Posted November 26

Can you recreate this problem with the simulated data? Can you give an example of what is out of sync?

Joel Dror 2 (Author) Posted November 26

I think I managed. Thanks. It was probably some inconsistency in my data: I removed a lot of old data sources that were part of the eventual data table, and now it seems to work OK.
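
A note for readers who see a similar mismatch: the thread attributes it to inconsistent source data, but another possible cause, not confirmed here, is row order. The posted function resets the index inside each group and again after groupby, so the output rows come back in group order, which differs from the input order whenever the input rows are not already sorted by group. The sketch below is one way to keep the output aligned with the input rows. The function name interp_by_group and the sample data are hypothetical, and np.interp is swapped in for interp1d; unlike interp1d with fill_value='extrapolate', np.interp holds the end values constant outside the simulated frequency range.

```python
import numpy as np
import pandas as pd

def interp_by_group(data):
    """Return interpolated gains on the same index, and in the same order, as data."""
    out = pd.Series(np.nan, index=data.index, name="InterpGainFromSim")
    for _, group in data.groupby("ColumnForGroup"):
        # np.interp needs increasing x values, so sort the simulated points
        known = group[group["IsSim"]].sort_values("Freq")
        unknown = group[~group["IsSim"]]
        if known.empty or unknown.empty:
            continue
        # Write results back through the original row labels
        out.loc[unknown.index] = np.interp(unknown["Freq"], known["Freq"], known["Gain"])
    return out

# Invented sample data with the groups interleaved on purpose
data = pd.DataFrame({
    "Freq": [1.0, 1.0, 2.0, 2.0, 1.5, 1.5],
    "Gain": [10.0, -10.0, 20.0, -20.0, np.nan, np.nan],
    "IsSim": [True, True, True, True, False, False],
    "ColumnForGroup": ["A", "B", "A", "B", "A", "B"],
})
result = interp_by_group(data)
```

Because the result is built on data.index and filled by label, it can be returned as an output column without any reordering step.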