
NLP and LLMs in Spotfire

Quick links to all NLP and LLM tools


    This article collects all topics and resources related to natural language processing (NLP) and large language model (LLM) tools within Spotfire. It is meant to be a concise table of contents that links to more in-depth resources.

     

    Introduction

    Spotfire is a visual data science platform that enables users to make data-driven decisions. One of its most exciting features is its native integration with natural language processing (NLP) and large language model (LLM) tools. This article points to all the content we have in the rapidly evolving Generative AI (GenAI) space, with an emphasis on NLP and LLMs.

     

    For further questions or feedback, you can leave comments on this post or email us at datascience@spotfire.com.

     

    Spotfire Copilot

    Copilot Main Page

    Spotfire Copilot™ is a free, natural language extension to the Spotfire® platform. It leverages LLMs to help Spotfire users be significantly more efficient in generating charts, data views, data functions, and reports. It also allows companies to add the context of their proprietary documents to Spotfire privately. Copilot is geared toward everyone from novice to seasoned users, and it is meant to assist the human user of Spotfire rather than act on its own. The team has tested Copilot extensively using OpenAI's GPT-3.5 and GPT-4 models.

    Copilot Exchange Download

    Here is the direct download for Copilot; it's one of our most popular Exchange items!

    White Paper

    This white paper contains an advanced overview of the NLP and LLM landscape, specifically the evolution, advantages, and limitations of using LLMs. You can download the white paper here: NLP and LLMs in Spotfire - Whitepaper.pdf. Make sure you are logged in for access.

    Copilot Development Blog

    This community article, Interact with Spotfire in human language, details how Copilot has evolved in the ever-changing GenAI landscape, especially since late 2022.

    Glossary

    This community article explains the most commonly used terminology in NLP and LLMs.

     

    Ready-to-Use Dashboards

    NLP Module in spotfire-dsml Python Library

    spotfire-dsml is our custom Spotfire Python library, introduced in 2023, with modules for NLP, machine learning, time series, geospatial analysis, and more. For information specifically on the NLP module within the library, check out this article: Analyze any Text Data in Spotfire. To learn more about the library and the rest of the modules, check out the spotfire-dsml community blog and Exchange item.

    NLP Python Toolkit

    The NLP Python Toolkit on the Exchange enables users to perform several exploratory data analysis and statistical modeling techniques using the Python libraries NLTK and spaCy. It includes one data function primarily for preprocessing and N-gram modeling and another for tagging with pre-trained models. To understand how to use the toolkit's full range of functionality, you can review the detailed article Analyze any Text Data in Spotfire or watch the accompanying video on the Exchange.
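
    To make the N-gram idea above concrete, here is a minimal sketch in plain Python. It is not the toolkit's data function and does not use NLTK or spaCy; the function names `tokenize` and `ngram_counts` and the sample log lines are assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(texts, n=2):
    """Count how often each run of n consecutive tokens appears across texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Slide a window of length n over the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample data, not taken from any Spotfire demo.
logs = [
    "Login failed for user admin",
    "Login failed for user guest",
    "Connection reset by peer",
]
bigrams = ngram_counts(logs, n=2)
print(bigrams.most_common(3))
```

    A table of N-grams and counts like this is the kind of output that could then feed a word cloud visualization.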

    Natural Language Generation in R

    This Natural Language Generation (NLG) example on the Exchange is implemented through a TERR data function. It uses a template to prepare the language, which makes it possible to specify the verbosity, level of detail, and tone of the generated text. An end user without R programming skills can change the generated language to fit any application.
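
    The template approach described above can be sketched as follows. Note that this is a Python illustration, not the R/TERR code from the Exchange item, and the template wording, the `TEMPLATES` dictionary, and the `narrate` function are all hypothetical.

```python
# Hypothetical sentence templates keyed by verbosity level.
# Editing these strings changes the generated language without touching the logic.
TEMPLATES = {
    "brief": "{metric} {direction} {change:.1f}%.",
    "detailed": (
        "{metric} {direction} by {change:.1f}% "
        "from {start:,.0f} to {end:,.0f} over {period}."
    ),
}


def narrate(metric, start, end, period, verbosity="brief"):
    """Fill a template with a percentage change computed from two values."""
    if start == 0:
        raise ValueError("start must be non-zero to compute a percentage change")
    change = (end - start) / start * 100
    if change == 0:
        return f"{metric} was unchanged over {period}."
    direction = "rose" if change > 0 else "fell"
    return TEMPLATES[verbosity].format(
        metric=metric,
        direction=direction,
        change=abs(change),
        start=start,
        end=end,
        period=period,
    )


print(narrate("Sales", 1000, 1250, "Q1"))
print(narrate("Sales", 1000, 1250, "Q1", verbosity="detailed"))
```

    Keeping the wording in a separate table of templates is what lets a non-programmer adjust tone and level of detail by editing text only.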

    Sentiment Analysis using Cloud Services

    This sentiment analysis and topic modeling example from a Spotfire meetup uses the NLTK and Gensim Python libraries. Cloud services can also be used to perform such analytics, as shown in this blog. Note that these functionalities are now in the NLP Python Toolkit and the spotfire-dsml NLP module listed above, where cloud services are not needed because they use open-source Python libraries.

     

    Individual Data Functions

    NLP Tagging 

    NLP Tagging Data Function for Spotfire® is a pre-built data function that allows users to perform sentiment analysis, named-entity recognition (NER), and part-of-speech (POS) tagging on any text data using the popular open-source spaCy library. Because it is compatible with any spaCy model, it can be applied across industry verticals and languages. The NLP Python Toolkit above includes this data function.

    Fuzzy String Match 

    The Fuzzy String Match Data Function for Spotfire is a data cleansing function. It uses the Levenshtein distance algorithm to search for and match similar strings, even when they have small differences in spelling or formatting.
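
    For readers unfamiliar with Levenshtein distance, the following is a minimal pure-Python sketch of the idea. It is not the code of the Exchange data function; the `similarity` normalization, the `best_match` helper, and the 0.8 threshold are assumptions chosen for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to a 0..1 score (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(query, candidates, threshold=0.8):
    """Return the closest candidate, or None if nothing clears the threshold.
    The threshold value is an arbitrary illustrative choice."""
    if not candidates:
        return None
    score, match = max(
        (similarity(query.lower(), c.lower()), c) for c in candidates
    )
    return match if score >= threshold else None


print(levenshtein("kitten", "sitting"))
print(best_match("Spotfyre", ["Spotfire", "Spotify"]))
```

    Normalizing by the longer string's length makes scores comparable between short and long strings, which is one common way to pick a single matching threshold.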

    General Library

    The data functions above are also available in our Spotfire Data Function Library, which contains free-standing Python and R data function scripts, and in the Data Functions section of the Exchange. Some of these data functions also overlap with those in the Dashboards section.

     

    Text-Related Mods

    Word Cloud

    The word cloud mod depicts the most prominent words or phrases and their occurrences. We suggest using it to visualize N-grams after preprocessing and modeling with spotfire-dsml or the NLP Python Toolkit.

    Text Card

    The text card mod makes it easy to visualize long texts of varying length, such as logs or paragraphs. We suggest applying the preprocessing (data cleaning) functions in spotfire-dsml or the NLP Python Toolkit and then visualizing the text before and after.

    Intelligent Narration

    The intelligent narrative mod was created with our partner Arria to assist with NLG. Use of this mod requires an Arria account.

    General Library

    The aforementioned mods are also on our Spotfire Mods main page. 

     

    NLP Highlights on the Demo Gallery

    Root Cause Analysis in Logs

    The Spotfire Demo Gallery houses some great examples of visual data science in practice. This cybersecurity anomaly detection dashboard uses the NLP Python Toolkit and the spotfire-dsml library to perform preprocessing and N-gram modeling on the logs over time. The results are then visualized using the word cloud mod and the text card mod.

    Explainability (NLG) for visuals

    The COVID dashboard contains explanations of visuals generated with Arria NLG. This feature is now also built into Spotfire Copilot.
