  • Customer Churn


    This article demonstrates how to build a classification model for predicting customer churn in Spotfire Data Science - Team Studio. By modeling churn, companies can better understand the root causes of customer attrition and the impact of marketing efforts.

    Use Case Overview

    Churn models serve two purposes here. They help companies understand the root causes of customer attrition and the impact of marketing efforts, and they identify customers for whom proactive engagement can improve retention.

    Data Requirements

    We use customer history data from a B2B transportation company, available both globally and broken down by business unit. The history includes annual revenue, volume, growth, and usage information by account and service type. A smaller dataset contains customer scorecards, which measure the quality of the customer contract terms, average days to payment, and profitability of the contract.

    Data Exploration and Feature Creation

    First template workflow for exploratory data visualization and feature creation. 

    Our first workflow contains exploratory data visualizations that summarize churn frequency, the number of services customers typically consume by industry, and the overall distribution of customers by industry. Later in the flow we create binary indicator variables for churn, both global and per business unit, and join the customer history data with the contract scorecards. In the final section we move the joined data into Hadoop and apply windowing functions that, for each year, aggregate the prior three years of data and add lag columns from one year prior. We then remove samples that have no historical data.
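
    The windowing step can be sketched outside Team Studio in plain Python. This is only an illustration under stated assumptions: the record layout, the field names (`revenue_prior_3y_sum`, `revenue_lag_1y`), the helper `add_history_features`, and the sample values are all hypothetical and are not taken from the Playbook, which runs the equivalent logic as windowing functions in Hadoop.

```python
from collections import defaultdict

# Hypothetical yearly records: (account_id, year, revenue). Values are made up.
records = [
    ("A", 2013, 100.0), ("A", 2014, 120.0), ("A", 2015, 90.0), ("A", 2016, 110.0),
    ("B", 2015, 50.0), ("B", 2016, 70.0),
]

def add_history_features(rows, window=3):
    """For each (account, year), sum revenue over the prior `window` years,
    attach the prior year's revenue as a lag column, and skip rows that
    have no historical data at all."""
    by_account = defaultdict(dict)
    for account, year, revenue in rows:
        by_account[account][year] = revenue

    features = []
    for account, by_year in by_account.items():
        for year in sorted(by_year):
            # Prior years only: the current year is excluded from the window.
            prior = [by_year[y] for y in range(year - window, year) if y in by_year]
            if not prior:
                continue  # no history for this sample, so drop it
            features.append({
                "account": account,
                "year": year,
                "revenue": by_year[year],
                "revenue_prior_3y_sum": sum(prior),
                "revenue_lag_1y": by_year.get(year - 1),  # None if that year is missing
            })
    return features

features = add_history_features(records)
for row in features:
    print(row)
```

    In this sketch the first year on record for each account is dropped because it has no prior data, which mirrors the removal of samples without history described above.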

    Modeling Global Churn

    Second template for modeling global churn, both with and without the contract scorecard data. 

    The second step is to create models that predict which customers will churn. We create two modeling branches from the prepared (ETL) output of the first workflow. The first branch uses only the customer history data, which covers more customers because scorecard data do not exist for every customer. The second branch is restricted to those customers with scorecard data.

    In both cases we train a range of classification models (logistic regression, decision tree, random forest, and gradient boosted trees), using a 75% sample as the training set and the remaining 25% as a holdout validation set. Random forest is the best-performing algorithm, with a churn prediction accuracy of 85% when scorecard data are present. Without scorecard data, accuracy drops to 80%.
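
    As a rough illustration of the 75/25 split, the sketch below partitions a list of samples with a seeded shuffle. The helper name `train_holdout_split`, the seed, and the synthetic rows are assumptions made for this example. The article does not say whether the Team Studio sampling is stratified, so this sketch uses a simple random split.

```python
import random

def train_holdout_split(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of `rows` with a fixed seed and cut it into a
    training set and a holdout validation set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Synthetic placeholder samples; real rows would carry the engineered features.
rows = [{"account": i, "churned": i % 5 == 0} for i in range(200)]
train, holdout = train_holdout_split(rows)
print(len(train), len(holdout))
```

    Each candidate model would be fit on `train` only and scored on `holdout`, so that the reported accuracy reflects data the model has not seen.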

    Key Technique - Model Evaluation

    We use model evaluation operators to measure each model's performance on the holdout data. The ROC Curve operator shows the tradeoff between the true positive rate and the false positive rate as the classification threshold changes. The Confusion Matrix operator shows the incidence of correct classifications and of the different error types (false positives and false negatives), as well as the overall classification accuracy percentage.
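
    The quantities these operators report can be reproduced by hand. The following is a minimal sketch, not the operators' implementation: the function names, the 0.5 threshold, and the toy labels and scores are assumptions chosen for illustration.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }

def accuracy(cm):
    """Share of all samples that were classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

def roc_points(y_true, scores):
    """Sweep the threshold over each distinct score, highest first, and
    return (false positive rate, true positive rate) pairs."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy holdout labels and model scores (made up for this example).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
area = auc(roc_points(y_true, scores))
print(cm, acc, area)
```

    Accuracy depends on the chosen threshold, while the ROC curve summarizes behavior across all thresholds, which is why the two views are useful together when comparing models.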

    Other Playbook Assets

    Touchpoint for generating lists of likely-to-churn customers by time period and industry.

    This Playbook includes several other assets, including flows for modeling churn on a per-business-unit basis and prediction flows for driving self-service Touchpoints. The Touchpoints allow business users to independently generate reports on customers who are likely to churn.

    Check It Out!

    For access to this Playbook, including its workflows, sample data, a PowerPoint summary, and expert support from Spotfire® Data Science data scientists, contact your Spotfire® Data Science sales representative.

