Spotfire Chat GPT Mod/Iron Python (or similar)


apreble

Does anyone know if Spotfire currently has a Mod or script that could summarize one column across multiple rows? I have a data table of daily activity, and I'd like to summarize everything in the "Comments" column for a specific day into a paragraph, if that's possible. Thank you for the help!

hi apreble,

there are a couple of options for this, depending on how you want to display the summary. in my examples below, i'm using `Concatenate()` but if you expect to have duplicates you may want to use `UniqueConcatenate()` -- just be aware that the latter option can be a bit slower depending on how many rows you're concatenating.

if you want the summary in a column as part of the Data Table, you could use a Calculated Column with the expression:

Concatenate([comment column]) OVER [another column]

more likely, you want to display this in a Text Area, in which case a Calculated Value can be added with a similar expression. this screenshot shows where the Calculated Value option is:

[screenshot: where to find the Calculated Value option]

and the expression:

Concatenate([comment column])
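if it helps to see what the two functions do, here's a rough plain-Python illustration of `Concatenate()` vs `UniqueConcatenate()` grouped by day. this is not Spotfire code: the sample rows and the `concatenate_over` helper are made up, and the ", " separator and the sorting of unique values are my assumptions about how Spotfire behaves.

```python
from collections import OrderedDict

# Hypothetical sample rows: (day, comment)
rows = [
    ("2024-05-01", "pump started"),
    ("2024-05-01", "pressure stable"),
    ("2024-05-01", "pump started"),
    ("2024-05-02", "shutdown for maintenance"),
]

def concatenate_over(rows, unique=False, sep=", "):
    """Join comments per day; with unique=True, drop duplicates and sort."""
    groups = OrderedDict()
    for day, comment in rows:
        groups.setdefault(day, []).append(comment)
    result = {}
    for day, comments in groups.items():
        if unique:
            # assumption: unique values come back sorted
            comments = sorted(set(comments))
        result[day] = sep.join(comments)
    return result

all_comments = concatenate_over(rows)
unique_comments = concatenate_over(rows, unique=True)
print(all_comments["2024-05-01"])
print(unique_comments["2024-05-01"])
```

the first print keeps the repeated "pump started" comment; the second drops it.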

hopefully that sorts you out. if not, please share more details about how you'd like to display the comments and we can go from there :)

although, i'm just re-reading your post and i think i've misunderstood your question :) an action mod miiiight be able to handle this for you, but i think mods are sandboxed from the outside world, meaning you wouldn't be able to submit a block of text to an LLM on another network. i'll play around with this and get back to you if i learn anything new!

Hi @Niko Maresco,

Thanks for the reply! Yes, I am only looking for a summarized version of some commentary fields. I have a free-text comments column, and each day is broken down into rows of one-hour or multi-hour intervals. I'd like to summarize the activity comments across all of a day's rows. I'm not trying to concatenate them, just summarize them.

hi apreble

so indeed, action mods are sandboxed, so that won't work. but you can use IronPython instead! buckle up, this is gonna be a bit of a ride :)

assuming you'll want to send this to ChatGPT you could use the sample below as a starting point (i have not tested this code!). you can find other examples of using REST APIs with IronPython in our community articles.

if you're familiar with "regular" Python, know that within Spotfire's IronPython context, the requests module is not available, so we need to use the .NET equivalent and it's a little, hm, wordy? BUT if you are comfortable deploying the requests module to your Spotfire Server, you can write this all as a Data Function and it'll be much cleaner and won't require user interaction.
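if deploying extra packages is a hurdle, a Data Function could probably make the same call with Python's built-in `urllib` instead of requests. here's a minimal sketch of that idea. i haven't run it against the API, the function names `build_chat_request` and `summarize` are made up for illustration, and it assumes your Python service is allowed to reach api.openai.com. the instructions go in the system message and the comments in the user message.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(api_key, instructions, comments, model="gpt-3.5-turbo"):
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": instructions},
            {"role": "user", "content": comments},
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )

def summarize(api_key, instructions, comments):
    """Send the request and return the reply text (needs network access and credits)."""
    request = build_chat_request(api_key, instructions, comments)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]

# Building the request is safe to try offline; nothing is sent here.
req = build_chat_request("APIKEY", "Summarize the comments.", "2024-05-01 08:00;pump started|")
print(req.get_method(), req.full_url)
```

in a Data Function you would call `summarize(...)` with your real key and assign the result to the output parameter.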

for ChatGPT, i wrote a completely arbitrary prompt in the code in case you need a starting point. you might want to try a few times with some samples to see what Mr GPT comes up with and tweak accordingly. i was not able to test this end to end as i'm out of OpenAI credits, so you may need to play with the result handling.

lastly, the prompt requests separate daily summaries plus one total summary as one JSON object which you would need to parse. i didn't put much effort into parsing it since i wasn't sure if that was what you wanted -- if you want something simpler you could change the prompt to request the summaries in some other format/structure.
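for the parsing part, here's a small sketch in regular Python of turning that JSON reply into plain text for a label. the sample reply is invented, and `format_summary` is a hypothetical helper. it assumes the model returns a `daily` list and a `total` object, which it may not always do, so it falls back to the raw text when the reply doesn't parse.

```python
import json

# Hypothetical reply text in the shape the prompt asks for
reply = """
{
  "daily": [
    {"day": "2024-05-01", "summary": "Pump started; pressure stable."},
    {"day": "2024-05-02", "summary": "Shutdown for maintenance."}
  ],
  "total": {"date_range": "2024-05-01 to 2024-05-02",
            "summary": "Normal run, then planned shutdown."}
}
"""

def format_summary(reply_text):
    """Turn the JSON reply into text lines; return the raw text if it isn't a JSON object."""
    try:
        parsed = json.loads(reply_text)
    except ValueError:
        return reply_text.strip()
    if not isinstance(parsed, dict):
        return reply_text.strip()
    # one line per day, then one line for the overall summary
    lines = ["{}: {}".format(d["day"], d["summary"]) for d in parsed.get("daily", [])]
    total = parsed.get("total") or {}
    if total:
        lines.append("{}: {}".format(total.get("date_range", "overall"),
                                     total.get("summary", "")))
    return "\n".join(lines)

summary_text = format_summary(reply)
print(summary_text)
```

inside the IronPython script, the `JavaScriptSerializer` would play the role of `json.loads` here.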

i put together a little sample DXP (Spotfire 14.4) that you can open up and insert your API key into the script. here are the steps i took to build it. i am assuming you have an intermediate-advanced level of Spotfire expertise, but of course if you'd like clarification on anything let me know :)

  1. create two new Document Properties: commentContext and commentSummary 
  2. create a new Python Data Function with an input of type Table named input_table and an output of type Value named output_context. use the following code in the function (this assumes your columns are [datetime] and [comments] -- change these values if that's not the case):
    output_context = "".join(f"{row.datetime};{row.comments}|" for index, row in input_table.iterrows())
  3. save the Data Function and then configure the parameters: input_table gets the relevant columns from your Data Table; output_context gets assigned to the commentContext Document Property. i ticked Refresh function automatically so that if the Data Table changes, the context will be updated automatically, but if you prefer to have manual control over this, leave that box unticked.
  4. save and close all the Data Function dialogs.
  5. create a Text Area and add an Action Control containing a new IronPython script.
  6. add two String type parameters to the script: context_input and summary_output. assign these to the commentContext and commentSummary Document Properties, respectively.
  7. use the IronPython script below, making sure to update your OpenAI API key:
    import clr
    clr.AddReference("System")
    clr.AddReference("System.Web.Extensions")
    
    from System import Uri
    from System.IO import StreamReader, StreamWriter
    from System.Net import HttpWebRequest
    from System.Text import Encoding
    from System.Web.Script.Serialization import JavaScriptSerializer
    
    # Define the API endpoint and API key (replace with your actual API key)
    api_url = "https://api.openai.com/v1/chat/completions"
    api_key = "APIKEY"
    
    # Define the prompt and context
    # change prompt to suit your use case
    prompt = """
    You are an expert at summarizing status report data.
    Each comment below is in the format timestamp;comment and delimited by the | character.
    Summarize the comments by day, then generate a larger summary for the entire set of comments.
    Produce a JSON output like: { daily: [ { day: [day], summary: [summary] }, ... ], total: { date_range: [range], summary: [summary] } }.
    All dates should be in YYYY-mm-dd format.
    Do not include any information not represented by the comments.
    Do not produce any output or commentary other than the JSON.
    """
    context = context_input
    
    # Data payload for the request
    data = {
        "model": "gpt-3.5-turbo",  # change model here as needed
        "messages": [
            {"role": "system", "content": prompt},   # instructions for the model
            {"role": "user", "content": context}     # the delimited comments to summarize
        ]
    }
    
    # Serialize the data to JSON
    json_serializer = JavaScriptSerializer()
    json_data = json_serializer.Serialize(data)
    
    # Create the HTTP request
    request = HttpWebRequest.Create(Uri(api_url))
    request.Method = "POST"
    request.ContentType = "application/json"
    # plain concatenation, since f-strings are not available in IronPython 2.7
    request.Headers.Add("Authorization", "Bearer " + api_key)
    
    # Write the JSON data to the request stream
    request_stream = request.GetRequestStream()
    writer = StreamWriter(request_stream, Encoding.UTF8)
    writer.Write(json_data)
    writer.Close()
    
    # Get the response from the server
    response = request.GetResponse()
    response_stream = response.GetResponseStream()
    
    # Read and deserialize the response
    reader = StreamReader(response_stream, Encoding.UTF8)
    response_text = reader.ReadToEnd()
    
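    # Deserialize the JSON response
    result = json_serializer.DeserializeObject(response_text)
    summary_output = result['choices'][0]['message']['content']
    # As far as I know, assigning to a script parameter does not update the
    # Document Property behind it, so write the result back explicitly
    Document.Properties["commentSummary"] = summary_output
    print("Generated Text:", summary_output)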
    
    # Clean up
    reader.Close()
    response_stream.Close()
    response.Close()
  8. either in a new Text Area, or using the one created earlier, add a Property Control of type Label and set it to the commentSummary Document Property.
  9. click the button and see your summary populate in the Text Area.
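one thing to watch in step 2: the context string uses `;` and `|` as delimiters, so a comment that contains either character would confuse the format. here's a hedged plain-Python sketch of a safer builder. `build_context` and the sample rows are hypothetical, and swapping the delimiter characters for `,` and `/` is just one arbitrary way to handle it.

```python
def build_context(rows):
    """rows: iterable of (timestamp, comment) pairs -> 'timestamp;comment|' string."""
    parts = []
    for timestamp, comment in rows:
        # replace the delimiter characters so a comment can't break the format
        clean = str(comment).replace("|", "/").replace(";", ",")
        # collapse newlines and repeated whitespace into single spaces
        clean = " ".join(clean.split())
        parts.append("{};{}|".format(timestamp, clean))
    return "".join(parts)

# Hypothetical sample rows
sample = [
    ("2024-05-01 08:00", "pump started; pressure low"),
    ("2024-05-01 10:00", "pressure stable | no\nalarms"),
]
context = build_context(sample)
print(context)
```

in the Data Function this could be called as `output_context = build_context(zip(input_table["datetime"], input_table["comments"]))`, assuming those column names.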

gpt-summarize.dxp

Thanks for the reply! I'm not the most savvy with Python or IronPython, but I can work with this and give it a try. I might have the same problem as you with GPT: it's not free to use, so this might backfire on me completely. I also tried to open your DXP file and for some reason received an error when opening it.
