  • NLP and LLMs in Spotfire


    This article is the home for topics and resources on integrating natural language processing (NLP) and large language model (LLM) tools with Spotfire. It covers some of the NLP and LLM tools available for Spotfire, such as the NLP Python Toolkit, the NLP Tagging Data Function, the Fuzzy String Match Data Function, and Spotfire Copilot.

    NLP and LLMs - Glossary

    Introduction

    Spotfire is a data visualization and analysis platform that helps users make informed decisions based on their data. One of its most exciting features is its integration with natural language processing (NLP) and large language model (LLM) tools. In this article, we explore some of the NLP and LLM tools available for Spotfire, including the NLP Python Toolkit, the NLP Tagging Data Function, and the Fuzzy String Match Data Function.

    You can also review our whitepaper, which covers the evolution, advantages, and limitations of LLMs, their use within your industry either as a managed API service or as self-managed open-source models, and the NLP and LLM tools available for Spotfire. You can download the whitepaper here: NLP and LLMs in Spotfire - Whitepaper.pdf. Make sure you are logged in for access.


    NLP and LLM tools available for Spotfire

    Spotfire provides several built-in tools and functionalities that use NLP and LLMs for analyzing and visualizing unstructured data. Here are some examples of NLP and LLM tools available for you:

    Spotfire Data Functions


    NLP Python Toolkit for Spotfire

    Spotfire provides an NLP Python Toolkit that enables users to preprocess and analyze text data using Python NLTK or spaCy. Users can perform NLP tasks such as tokenization, stemming, and sentiment analysis on their text data and then visualize the results within their Spotfire analyses.

    Spotfire's NLP Python Toolkit provides two essential data functions that enable users to perform a wide range of text analytics on their data. The first data function is the Preprocess Text and Extract N-gram Features function, which cleans and preprocesses text by removing stop words and performing stemming or lemmatization. It then calculates the most frequent n-grams across all text and extracts key n-gram phrases per document, which can be useful for identifying common patterns and themes in large sets of text data. The second data function is Named Entity Recognition (NER), Part of Speech (POS), and Sentiment Tagging, which tags pre-defined entities, such as people or organizations, in text. It also tags each word with its part of speech, such as noun or adjective, and assigns a polarity and subjectivity metric to each document. 
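
    The first data function's core steps (clean the text, drop stop words, count the most frequent n-grams) can be sketched in plain Python. This is a minimal standard-library illustration, not the toolkit's own code: the stop-word list, the function names, and the sample documents are assumptions made for the example, and stemming or lemmatization is left out.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real libraries ship far larger lists).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "was", "of", "to", "in", "on", "for", "it"}

def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Return the n-grams of one token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(documents, n=2, k=3):
    """Count n-grams across all documents and return the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(preprocess(doc), n))
    return counts.most_common(k)

# Hypothetical sample documents.
docs = [
    "The pump failure was caused by a worn seal.",
    "A worn seal is the usual cause of pump failure.",
    "Pump failure in the field is expensive.",
]
print(top_ngrams(docs, n=2, k=2))  # [('pump failure', 3), ('worn seal', 2)]
```

    In a Spotfire data function, the list of documents would typically come from a text column and the resulting counts would be returned as a table for visualization.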

    This toolkit is designed for data scientists and analysts with some knowledge of Python programming and NLP concepts. 

    To learn how to use the NLP Python Toolkit for Spotfire, see the detailed article Analyse any Text Data in Spotfire, or watch this video for a presentation on NLP and a demo:


    NLP Tagging Data Function for Spotfire

    The NLP Tagging Data Function for Spotfire® is a pre-built data function that performs part-of-speech (POS) tagging on text data within the Spotfire platform. POS tagging labels each word in a text document with its part of speech, such as noun, verb, adjective, or adverb. It is a fundamental step in many NLP tasks, such as text classification, named entity recognition, and sentiment analysis.

    The NLP Tagging Data Function for Spotfire uses the Natural Language Toolkit (NLTK), a popular open-source library for NLP in Python, to perform POS tagging on text data. It can handle multiple languages and allows users to customize the tagging algorithm by selecting different tagsets or adjusting the threshold for unknown words.

    By using the NLP Tagging Data Function, users can easily preprocess their text data and extract valuable insights from it using other Spotfire functionalities or custom Python scripts. For example, they can filter the text data by specific parts of speech, such as only analyzing the nouns or adjectives, or use the tagged data as input for further NLP analyses.
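
    Filtering by part of speech, as described above, can be sketched as follows. The tagged tokens are hypothetical sample data written by hand in the (token, tag) shape that POS taggers commonly produce, using Penn Treebank tags; the helper name is an assumption and not part of the data function.

```python
# Hypothetical tagger output: (token, tag) pairs with Penn Treebank tags.
tagged = [
    ("The", "DT"), ("new", "JJ"), ("pump", "NN"), ("runs", "VBZ"),
    ("quietly", "RB"), ("in", "IN"), ("cold", "JJ"), ("weather", "NN"),
]

def filter_by_pos(tagged_tokens, prefixes):
    """Keep tokens whose tag starts with any of the given prefixes.

    Prefix matching lets "NN" cover NN, NNS, NNP, and NNPS,
    and "JJ" cover JJ, JJR, and JJS.
    """
    return [tok for tok, tag in tagged_tokens if tag.startswith(tuple(prefixes))]

nouns = filter_by_pos(tagged, ["NN"])
adjectives = filter_by_pos(tagged, ["JJ"])
print(nouns)       # ['pump', 'weather']
print(adjectives)  # ['new', 'cold']
```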

    Fuzzy String Match Data Function for Spotfire

    The Fuzzy String Match Data Function for Spotfire searches for and matches similar strings, even when they differ slightly in spelling or formatting. It calculates a similarity score between strings using the Levenshtein distance, which counts the minimum number of single-character insertions, deletions, and substitutions needed to transform one string into another. The function is useful for data cleansing and data matching tasks in many industries, including finance, healthcare, and retail.
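
    A minimal sketch of the Levenshtein distance and one way to turn it into a similarity score is shown here. The normalization by the longer string's length, the function names, and the sample strings are assumptions for illustration; the source does not state how the data function scales its score.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

def similarity(a, b):
    """Score in [0, 1]; 1.0 means identical (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

def best_match(query, candidates):
    """Return the candidate most similar to the query."""
    return max(candidates, key=lambda c: similarity(query, c))

print(levenshtein("kitten", "sitting"))  # 3
print(best_match("Acme Corp.", ["Acme Corp", "Apex Group", "Acorn Ltd"]))  # Acme Corp
```

    For data matching, a threshold on the similarity score (for example, accepting only matches above some chosen cutoff) is a common way to separate likely duplicates from unrelated strings.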

    Natural Language Generation with Spotfire 

    This article describes an example Spotfire dashboard that includes natural language generation (NLG) implemented through a TERR data function. The example uses a template for language preparation, which makes it possible to specify the verbosity, level of detail, and tone of the generated language. An end user without R programming skills can change the generated language to fit the application. You can try this yourself by downloading the data function from the community exchange.
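
    The template idea can be illustrated with a short sketch. Note that the original example is a TERR (R) data function; the Python below is only an analogy, and the template texts, the function name, and the sales figures are all hypothetical. It assumes the previous value is non-zero.

```python
# Hypothetical templates keyed by verbosity; editing these strings changes
# the generated language without touching the logic below.
TEMPLATES = {
    "brief": "{region} sales {direction} {change:.1f}%.",
    "detailed": (
        "Sales in {region} {direction} {change:.1f}% compared with "
        "the previous quarter, reaching {total:,.0f}."
    ),
}

def describe_sales(region, previous, current, verbosity="brief"):
    """Fill the chosen template with values computed from the data."""
    change = (current - previous) / previous * 100  # assumes previous != 0
    direction = "rose" if change >= 0 else "fell"
    return TEMPLATES[verbosity].format(
        region=region, direction=direction, change=abs(change), total=current
    )

print(describe_sales("EMEA", 1000, 1100))
# EMEA sales rose 10.0%.
print(describe_sales("EMEA", 1000, 1100, verbosity="detailed"))
# Sales in EMEA rose 10.0% compared with the previous quarter, reaching 1,100.
```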

     

    LLMs - Spotfire Copilot

    What if you could ask Spotfire "What is the best-selling product and its quarterly revenue?" or "How do I add a new Python Library?" right inside Spotfire? Learn about the development of the Spotfire Copilot made possible with LLMs in this detailed community article: Spotfire Copilot: Interact with Spotfire in human language!
     

    Spotfire Partners specialized in NLP techniques

    Arria (NLG), an example of COVID Dashboard using Arria

    Wordsmith (NLG)
     

    NLP & LLM Glossary

    This community article explains the most commonly used terminology in NLP and LLM.

     

