Introduction
Spotfire is a powerful visual data science platform that enables users to make data-driven decisions. One of its most exciting features is its native integration with natural language processing (NLP) and large language model (LLM) tools. This article points to all the content we have in the rapidly evolving Generative AI (GenAI) space, with an emphasis on NLP and LLMs.
For further questions or feedback you can leave comments on this post or email us at datascience@spotfire.com.
Spotfire Copilot
Copilot Main Page
Spotfire Copilot™ is a free, natural language extension to the Spotfire® platform. It leverages LLMs to help Spotfire users be significantly more efficient in generating charts, data views, data functions, and reports. It also allows companies to add the context of their proprietary documents to Spotfire privately. Copilot is geared toward everyone from novice to seasoned users. It should be treated as exactly that: a copilot that assists the human user of Spotfire. The team has heavily tested Copilot using OpenAI’s GPT-3.5 and GPT-4 models.
Copilot Exchange Download
Here is the direct download for Copilot; it's one of our most popular Exchange items!
White Paper
This white paper contains an advanced overview of the NLP and LLM landscape, specifically the evolution, advantages, and limitations of using LLMs. You can download the white paper here: NLP and LLMs in Spotfire - Whitepaper.pdf. Make sure you are logged in for access.
Copilot Development Blog
This community article, Interact with Spotfire in human language, details how Copilot has evolved in the ever-changing GenAI landscape, especially since late 2022.
Glossary
This community article explains the most commonly used terminology in NLP and LLMs.
Ready-to-Use Dashboards
NLP Module in spotfire-dsml Python Library
spotfire-dsml is our custom Spotfire Python library, introduced in 2023, with modules for NLP, machine learning, time series, geospatial analysis, and more! For information specifically on the NLP module within the library, check out this article: Analyze any Text Data in Spotfire. To learn more about the library and the rest of the modules, check out the spotfire-dsml community blog and Exchange item.
NLP Python Toolkit
The NLP Python Toolkit on the Exchange enables users to perform several exploratory data analysis and statistical modeling techniques using the Python libraries NLTK and spaCy. It includes one data function primarily for preprocessing and N-gram modeling and another for tagging with pre-trained models. You can review the detailed article Analyze any Text Data in Spotfire or watch the accompanying video on the Exchange to understand how to use the toolkit's full range of functionality.
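To make the idea of N-gram modeling concrete, here is a minimal sketch in plain Python. It is not the toolkit's implementation, which relies on NLTK and spaCy; the function names `tokenize` and `ngram_counts` and the simple regex tokenizer are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Illustrative stand-in for real preprocessing (stop-word removal,
    lemmatization, etc.), which the toolkit handles with NLTK and spaCy.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(texts, n=2):
    """Count how often each run of n consecutive tokens occurs across texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Slide a window of width n over the tokens of each text.
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


if __name__ == "__main__":
    logs = ["Login failed for user", "login failed again"]
    for phrase, count in ngram_counts(logs, n=2).most_common(3):
        print(phrase, count)
```

A table of phrases and counts like this is the kind of output that can then be visualized in Spotfire, for example with a word cloud.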
Natural Language Generation in R
This Natural Language Generation (NLG) example on the Exchange is implemented through a TERR data function. It makes use of a template for language preparation so that it is possible to specify the verbosity, level of detail, and tone of the generated language. An end user without R programming skills can change the generated language to fit any application.
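The Exchange example itself is written in R for TERR. Purely as an illustration of the template idea, the following Python sketch shows how a verbosity setting can select between sentence templates. The template wording, the `TEMPLATES` dictionary, and the `describe_change` function are hypothetical and are not taken from the Exchange item.

```python
# Hypothetical sentence templates keyed by verbosity level.
TEMPLATES = {
    "brief": "{metric} {direction} {change:.1f}%.",
    "detailed": (
        "{metric} {direction} by {change:.1f}%, "
        "from {previous:,.0f} to {current:,.0f}."
    ),
}


def describe_change(metric, previous, current, verbosity="brief"):
    """Render a one-sentence description of how a metric changed."""
    if previous == 0:
        raise ValueError("previous must be non-zero to compute a percentage")
    change = (current - previous) / previous * 100
    if change == 0:
        return f"{metric} was unchanged."
    direction = "rose" if change > 0 else "fell"
    # Report the magnitude; the verb already carries the sign.
    return TEMPLATES[verbosity].format(
        metric=metric,
        direction=direction,
        change=abs(change),
        previous=previous,
        current=current,
    )


if __name__ == "__main__":
    print(describe_change("Sales", 200, 250))
    print(describe_change("Sales", 250, 200, verbosity="detailed"))
```

Because the wording lives in the templates rather than in the logic, someone without programming skills can adjust tone or level of detail by editing the template strings alone.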
Sentiment Analysis using Cloud Services
This sentiment analysis and topic modeling example from a Spotfire meetup uses the NLTK and Gensim Python libraries. Cloud services can also be used to perform such analytics, as shown in this blog. Note that these functionalities are now available in the NLP Python Toolkit and the spotfire-dsml NLP module listed above; cloud services are not needed there because both use open-source Python libraries.
Individual Data Functions
NLP Tagging
NLP Tagging Data Function for Spotfire® is a pre-built data function that allows users to perform sentiment analysis, named-entity recognition (NER), and part-of-speech (POS) tagging on any text data using the popular open-source spaCy library. Because it works with any spaCy model, it can be adapted to different industry verticals and languages. The NLP Python Toolkit above includes this data function.
Fuzzy String Match
The Fuzzy String Match Data Function for Spotfire is a data cleansing tool. It searches for and matches similar strings, even when they differ slightly in spelling or formatting, using the Levenshtein distance algorithm.
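For readers unfamiliar with Levenshtein distance, it is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a standard dynamic-programming implementation in plain Python and is not the data function's own code. The `best_match` helper, its case-insensitive comparison, and its `max_distance` threshold are assumptions added for illustration.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def best_match(query: str, candidates: List[str],
               max_distance: int = 2) -> Optional[str]:
    """Return the closest candidate within max_distance edits, else None."""
    if not candidates:
        return None
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    distance, match = min(scored)
    return match if distance <= max_distance else None


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(best_match("Spotfyre", ["Spotfire", "Spitfire"]))
```

A threshold such as `max_distance` keeps unrelated strings from being matched; a suitable value depends on how long and how noisy the strings are.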
General Library
The aforementioned data functions are also available in our Spotfire Data Function Library, which contains free-standing Python and R data function scripts, as well as in the Exchange Data Functions section. Some of these data functions also overlap with those in the Dashboards section.
Text-Related Mods
Word Cloud
The word cloud mod depicts the most prominent words or phrases and their occurrences. We suggest using it to visualize N-grams after preprocessing and modeling with spotfire-dsml or the NLP Python Toolkit.
Text Card
The text card mod makes it easy to visualize long texts of varying length, such as logs or paragraphs. We suggest applying the preprocessing (data cleaning) functions from spotfire-dsml or the NLP Python Toolkit and then visualizing the text before and after.
Intelligent Narration
The intelligent narrative mod was created with our partner Arria to assist with NLG. Use of this mod requires an Arria account.
General Library
The aforementioned mods are also on our Spotfire Mods main page.
NLP Highlights on the Demo Gallery
Root Cause Analysis in Logs
The Spotfire Demo Gallery houses some great examples of visual data science in practice! This cybersecurity anomaly detection dashboard uses the NLP Python Toolkit and spotfire-dsml library to perform preprocessing and N-gram modeling on the logs over time. The results are then visualized using the word cloud mod and text card mod.
Explainability (NLG) for visuals
The COVID dashboard contains explanations of visuals generated with Arria NLG. This feature is now also built into Spotfire Copilot.