Market Basket Analysis - Data Function Help


Tyler Kendle
Solved by Gaia Paolini

Hi everyone,

I am trying to run the MBA data function from the Customer Analytics template I downloaded from the Exchange page. Unfortunately I keep getting an error. The error appears to be something with line 27 and has to do with "consequents" but I can't figure out what I'm missing.

I attached a DXP with a similar data structure as the template and the data function embedded in the analysis. My goal is to produce the AssociationRules table that I can't get.

Thank you for any suggestions you might have.


  • Solution

It appears to be a lack of defensive coding in the data function. Your input parameters (I think primarily the minimum support) do not return any rules, and the rules data frame should have been returned empty instead of throwing an error.
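To illustrate why a high minimum support can leave nothing to build rules from, here is a minimal sketch in plain Python. It counts support by brute force rather than using mlxtend's Apriori implementation, and the transactions, item names, and the `frequent_itemsets` helper are all made up for the example.

```python
from itertools import combinations

# Hypothetical toy transactions; the item names are invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]


def frequent_itemsets(transactions, min_support, max_len=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of transactions that contain every item
    in the itemset. This is a brute-force count, not Apriori.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_len + 1):
        for combo in combinations(items, size):
            support = sum(set(combo) <= t for t in transactions) / n
            if support >= min_support:
                result[frozenset(combo)] = support
    return result


# A moderate threshold keeps several itemsets, including pairs.
low = frequent_itemsets(transactions, min_support=0.5)

# A very strict threshold keeps nothing, so no rules can be derived.
high = frequent_itemsets(transactions, min_support=0.9)
```

With the strict threshold the result is empty, which is the situation the data function needs to handle without raising an error.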

I fixed it so that if no rules are found, it returns a data frame with one placeholder row, which should help a bit. See below:

    # Import pandas
    import pandas as pd
    # MBA packages
    # Import the transaction encoder function from mlxtend
    from mlxtend.preprocessing import TransactionEncoder
    # Import Apriori algorithm
    from mlxtend.frequent_patterns import apriori
    # Import the association rule function from mlxtend
    from mlxtend.frequent_patterns import association_rules

    # List transactions by group
    trans_df = customer_df.groupby(['Invoice', 'Customer_ID'])['CategoryGroup'].apply(list).reset_index(name='Transaction')
    # Remove duplicate items within each transaction
    trans_df['Transaction'] = trans_df['Transaction'].apply(lambda x: list(set(x)))
    # Data preprocessing
    trans_list = trans_df['Transaction'].to_list()
    encoder = TransactionEncoder()
    encode_arr = encoder.fit_transform(trans_list)
    # Converting to dataframe
    encode_df = pd.DataFrame(encode_arr, columns=encoder.columns_)
    # Compute frequent itemsets using the Apriori algorithm
    frequent_itemsets = apriori(encode_df, min_support=min_support, max_len=max_len, use_colnames=True)
    # Compute all association rules for frequent_itemsets
    rules = association_rules(frequent_itemsets, metric="lift", min_threshold=min_lift)
    if rules.shape[0] > 0:
        # Clean rules: keep single-item consequents and join item names with '|'
        rules['antecedents'] = rules['antecedents'].apply(lambda x: ','.join(list(x))).apply(lambda x: x.replace(',','|'))
        rules = rules[(rules['consequents'].apply(lambda x: len(x)==1))]
        rules['consequents'] = rules['consequents'].apply(lambda x: ','.join(list(x))).apply(lambda x: x.replace(',','|'))
    else:
        # No rules found: return a single placeholder row instead of raising an error
        rules = pd.DataFrame(index=range(1), columns=rules.columns)
        rules['antecedents'] = 'None found'
        rules['consequents'] = 'None found'
        rules.fillna(0, inplace=True)

    # Copyright (c) 2024. TIBCO Software Inc.
    # This file is subject to the license terms contained in the license file that is distributed with this file.

I also found a different Exchange item which contains a more robust MBA data function

https://community.spotfire.com/s/exchange/aCv4z000000kPeQCAU/market-basket-analysis-python-data-function-for-tibco-spotfire

This one has an extra output parameter called 'message', which is a simple status report. It also has a slightly different set of input parameters.


Gaia,

Thank you so much for your help with updating that code. Much appreciated!

I did still get an error even after implementing your new code but it was a Pandas issue. I was getting an error that said "AttributeError: 'DataFrame' object has no attribute 'iteritems'". When I searched the internet for that error I found an article that talked about how it's related to iteritems being removed from newer versions of Pandas.

A solution that someone in the thread recommended was to add this line in:

"pd.DataFrame.iteritems = pd.DataFrame.items"

I added it in and I was able to create the new data table Association Rules.
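For anyone copying this, a slightly more guarded version of that line might look like the sketch below. It assumes the error comes from code that still calls `iteritems`, which was removed in pandas 2.0, and it only adds the alias when the attribute is missing. The small `df` frame is made up purely to show the alias working.

```python
import pandas as pd

# pandas 2.0 removed DataFrame.iteritems; older code may still call it.
# Restore the old name as an alias of items() only when it is missing,
# so that older pandas versions are left untouched.
if not hasattr(pd.DataFrame, "iteritems"):
    pd.DataFrame.iteritems = pd.DataFrame.items

# Hypothetical frame, used only to check that the alias behaves like items().
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
column_names = [name for name, _ in df.iteritems()]
```

Placing the guard near the top of the data function script, right after the pandas import, should make the alias available before anything tries to use it.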

Thank you for all your help. Posting the article in case anyone else runs into a similar issue.

https://stackoverflow.com/questions/76404811/attributeerror-dataframe-object-has-no-attribute-iteritems
