Summary
Overview
This release includes data functions that preprocess or clean text, extract n-gram features, and tag entities in any English text using a combination of NLP methods and algorithms. The preprocessing steps include removing stop words, special characters, and numbers, and normalizing text through stemming or lemmatization. The n-gram features show the most frequent n-grams and the top keywords per document. The tagging functions perform named entity recognition, part-of-speech tagging, and sentiment analysis.
For a detailed guide to using this NLP Python toolkit for Spotfire, see the article "Analyse any Text Data in Spotfire", or watch the accompanying video for a presentation on NLP and a demo.
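To make the preprocessing and n-gram steps described above concrete, here is a minimal sketch in plain Python. It is not the code shipped in the data functions: the function names, the tiny stop-word list, and the sample documents are all assumptions made for illustration, and stemming or lemmatization is left out to keep the example dependency-free.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "for", "on"}


def preprocess(text):
    """Lowercase, strip numbers and special characters, drop stop words."""
    text = text.lower()
    # Replace anything that is not a lowercase letter or whitespace
    # (digits, punctuation, symbols) with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(documents, n=2, k=3):
    """Count n-grams across all documents and return the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(preprocess(doc), n))
    return counts.most_common(k)


# Made-up sample documents.
docs = [
    "The pump failed in 2021 and the pump seal was replaced.",
    "Pump seal leaks: the pump seal is worn!",
]
print(top_ngrams(docs, n=2, k=1))  # [('pump seal', 3)]
```

Because this sketch drops stop words before forming n-grams, it can produce a bigram such as "failed pump" whose words are not adjacent in the original sentence. A production pipeline would typically check candidate n-grams against the source text.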
Installing the data function
Follow the online guide available here to register a data function in Spotfire.
Configuring the data function
Each data function may require inputs from the Spotfire analysis and returns outputs to it. These inputs and outputs must be configured for each data function once it is registered. To learn how to configure data functions in Spotfire, please view the accompanying video.
For more information on Spotfire, visit the Spotfire training page.
Data function library
A large number of data functions covering various features are available. Feel free to browse them in the Data Function Library.
Release 1.1.0
Published: May 2022
Changes Include:
- Fixes to minor bugs
- Remove N-gram Keywords that don't appear in original text
- Options to Choose any Language Model
- Make more Parameters Optional
Release Includes:
- Data function
- Dxp with example usage
- Documentation
- License information
Initial Release (version 1.0.0)
Published: September 2021
Initial release includes:
- Dxp with example usage
- Two data functions ("Features" and "Entities and Sentiment")
- Documentation
- License information