Project: Marketing Campaign Analysis¶
In this project, my aim is to analyze marketing data and address some important business problems/questions.
Context¶
Marketing Analytics broadly refers to the practice of using analytical methods and techniques to measure the effectiveness of marketing activities and make data-driven decisions that improve conversion rates and return on investment (ROI). It typically involves analyzing metrics around customer engagement with various marketing activities, including but not limited to ATL (above-the-line) marketing, BTL (below-the-line) campaigns, and targeted personalized offers. The variables of interest are usually customer profiles, campaign conversion rates, and the costs associated with various marketing channels. These can generate valuable insights that help an organization form better marketing strategies, optimize or innovate on delivery, and achieve overall growth.
Problem Statement¶
The Chief Marketing Officer has shared that the recent marketing campaigns have not been as effective as expected and that the conversion rate is very low. My task is to analyze the related data, understand the problem, and identify key insights and recommendations that the CMO can potentially implement.
The dataset marketing_data.csv contains 2,240 customers of the Atqlik company, with data on:
- Campaign successes/failures
- Product preferences
- Channel performances
- Customer profiles based on spending habits
Data Dictionary¶
- ID : Unique ID of each customer
- Year_Birth : Year of birth of the customer
- Education : Customer's level of education
- Marital_Status : Customer's marital status
- Kidhome : Number of small children in customer's household
- Teenhome : Number of teenagers in customer's household
- Income : Customer's yearly household income
- Recency : Number of days since the last purchase
- MntFishProducts : The amount spent on fish products in the last 2 years
- MntMeatProducts : The amount spent on meat products in the last 2 years
- MntFruits : The amount spent on fruit products in the last 2 years
- MntSweetProducts : Amount spent on sweet products in the last 2 years
- MntWines : The amount spent on wine products in the last 2 years
- MntGoldProds : The amount spent on gold products in the last 2 years
- NumDealsPurchases : Number of purchases made with discount
- NumCatalogPurchases : Number of purchases made using catalog (buying goods to be shipped through the mail)
- NumStorePurchases : Number of purchases made directly in stores
- NumWebPurchases : Number of purchases made through the company's website
- NumWebVisitsMonth : Number of visits to company's website in the last month
- AcceptedCmp1 : 1 if customer accepted the offer in the first campaign, 0 otherwise
- AcceptedCmp2 : 1 if customer accepted the offer in the second campaign, 0 otherwise
- AcceptedCmp3 : 1 if customer accepted the offer in the third campaign, 0 otherwise
- AcceptedCmp4 : 1 if customer accepted the offer in the fourth campaign, 0 otherwise
- AcceptedCmp5 : 1 if customer accepted the offer in the fifth campaign, 0 otherwise
- AcceptedCmp6 : 1 if customer accepted the offer in the last campaign, 0 otherwise
- Complain : 1 If the customer complained in the last 2 years, 0 otherwise
- Country: Country customer belongs to
Importing libraries and overview of the dataset¶
# Library to suppress warnings and deprecation notices
import warnings
warnings.filterwarnings('ignore')
# Libraries to help with reading and manipulating data
import numpy as np
import pandas as pd
# Libraries to help with data visualization
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
# from google.colab import files
# uploaded = files.upload()
Load the dataset¶
# loading the dataset
df = pd.read_csv("C:/Users/ND/Downloads/Marketing+data (1).csv")
df.head()
| ID | Year_Birth | Education | Marital_Status | Income | Kidhome | Teenhome | Recency | MntWines | MntFruits | ... | NumStorePurchases | NumWebVisitsMonth | AcceptedCmp1 | AcceptedCmp2 | AcceptedCmp3 | AcceptedCmp4 | AcceptedCmp5 | AcceptedCmp6 | Complain | Country |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1826 | 1970 | Graduation | Divorced | 84835.0 | 0 | 0 | 0 | 189 | 104 | ... | 6 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | SP |
1 | 1 | 1961 | Graduation | Single | 57091.0 | 0 | 0 | 0 | 464 | 5 | ... | 7 | 5 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | CA |
2 | 10476 | 1958 | Graduation | Married | 67267.0 | 0 | 1 | 0 | 134 | 11 | ... | 5 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | US |
3 | 1386 | 1967 | Graduation | Together | 32474.0 | 1 | 1 | 0 | 10 | 0 | ... | 2 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | AUS |
4 | 5371 | 1989 | Graduation | Single | 21474.0 | 1 | 0 | 0 | 6 | 16 | ... | 2 | 7 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | SP |
5 rows × 27 columns
Check info of the dataset¶
#Checking the info
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2240 entries, 0 to 2239
Data columns (total 27 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   ID                   2240 non-null   int64
 1   Year_Birth           2240 non-null   int64
 2   Education            2240 non-null   object
 3   Marital_Status       2240 non-null   object
 4   Income               2216 non-null   float64
 5   Kidhome              2240 non-null   int64
 6   Teenhome             2240 non-null   int64
 7   Recency              2240 non-null   int64
 8   MntWines             2240 non-null   int64
 9   MntFruits            2240 non-null   int64
 10  MntMeatProducts      2240 non-null   int64
 11  MntFishProducts      2240 non-null   int64
 12  MntSweetProducts     2240 non-null   int64
 13  MntGoldProds         2240 non-null   int64
 14  NumDealsPurchases    2240 non-null   int64
 15  NumWebPurchases      2240 non-null   int64
 16  NumCatalogPurchases  2240 non-null   int64
 17  NumStorePurchases    2240 non-null   int64
 18  NumWebVisitsMonth    2240 non-null   int64
 19  AcceptedCmp1         2240 non-null   int64
 20  AcceptedCmp2         2240 non-null   int64
 21  AcceptedCmp3         2240 non-null   int64
 22  AcceptedCmp4         2240 non-null   int64
 23  AcceptedCmp5         2240 non-null   int64
 24  AcceptedCmp6         2240 non-null   int64
 25  Complain             2240 non-null   int64
 26  Country              2240 non-null   object
dtypes: float64(1), int64(23), object(3)
memory usage: 472.6+ KB
Observations:
- There are a total of 27 columns and 2,240 observations in the dataset
- The Income column has fewer than 2,240 non-null values, i.e., it has missing values. We'll explore this further.
Check the percentage of missing values for the Income column.¶
# % Null values in the Income column
(df.isnull().sum()/df.shape[0]*100)['Income']
1.0714285714285714
Observations:
- Income has ~1.07% missing values.
Create a list for numerical columns in the dataset and check the summary statistics¶
Summary statistics for numerical columns¶
# num_cols contains the numerical variables
num_cols=['Year_Birth','Income','Recency', 'MntWines', 'MntFruits',
'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts',
'MntGoldProds', 'NumDealsPurchases', 'NumWebPurchases',
'NumCatalogPurchases', 'NumStorePurchases', 'NumWebVisitsMonth','Kidhome',
'Teenhome']
# printing descriptive statistics of numerical columns
df[num_cols].describe().T
| count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
Year_Birth | 2240.0 | 1968.805804 | 11.984069 | 1893.0 | 1959.00 | 1970.0 | 1977.00 | 1996.0 |
Income | 2216.0 | 52247.251354 | 25173.076661 | 1730.0 | 35303.00 | 51381.5 | 68522.00 | 666666.0 |
Recency | 2240.0 | 49.109375 | 28.962453 | 0.0 | 24.00 | 49.0 | 74.00 | 99.0 |
MntWines | 2240.0 | 303.935714 | 336.597393 | 0.0 | 23.75 | 173.5 | 504.25 | 1493.0 |
MntFruits | 2240.0 | 26.302232 | 39.773434 | 0.0 | 1.00 | 8.0 | 33.00 | 199.0 |
MntMeatProducts | 2240.0 | 166.950000 | 225.715373 | 0.0 | 16.00 | 67.0 | 232.00 | 1725.0 |
MntFishProducts | 2240.0 | 37.525446 | 54.628979 | 0.0 | 3.00 | 12.0 | 50.00 | 259.0 |
MntSweetProducts | 2240.0 | 27.062946 | 41.280498 | 0.0 | 1.00 | 8.0 | 33.00 | 263.0 |
MntGoldProds | 2240.0 | 44.021875 | 52.167439 | 0.0 | 9.00 | 24.0 | 56.00 | 362.0 |
NumDealsPurchases | 2240.0 | 2.325000 | 1.932238 | 0.0 | 1.00 | 2.0 | 3.00 | 15.0 |
NumWebPurchases | 2240.0 | 4.084821 | 2.778714 | 0.0 | 2.00 | 4.0 | 6.00 | 27.0 |
NumCatalogPurchases | 2240.0 | 2.662054 | 2.923101 | 0.0 | 0.00 | 2.0 | 4.00 | 28.0 |
NumStorePurchases | 2240.0 | 5.790179 | 3.250958 | 0.0 | 3.00 | 5.0 | 8.00 | 13.0 |
NumWebVisitsMonth | 2240.0 | 5.316518 | 2.426645 | 0.0 | 3.00 | 6.0 | 7.00 | 20.0 |
Kidhome | 2240.0 | 0.444196 | 0.538398 | 0.0 | 0.00 | 0.0 | 1.00 | 2.0 |
Teenhome | 2240.0 | 0.506250 | 0.544538 | 0.0 | 0.00 | 0.0 | 1.00 | 2.0 |
Observations:¶
1️⃣ Potential Outliers in Product Purchases: A large gap between the third quartile and the maximum for every product-spending column suggests right-skewed distributions with outliers (a quick skewness check is sketched below).
2️⃣ Customer Household Structure: Customers have at most two small children (Kidhome) and at most two teenagers (Teenhome) at home.
3️⃣ Spending Behavior on Wine: Wine has by far the highest average spend among the product categories, making it the top-selling product over the last two years.
4️⃣ Income Distribution Anomaly: The maximum income is far above the third quartile, indicating a few customers with exceptionally high earnings compared to the rest of the dataset.
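To make the skew observation concrete, here is a minimal sketch (an addition, not part of the original analysis) that computes the skewness of the spending columns and Income; values well above 0 confirm long right tails.
# Quick skewness check for the spending columns and Income
# (large positive values indicate a long right tail, consistent with the Q3-to-max gaps above)
skew_cols = [col for col in df.columns if 'Mnt' in col] + ['Income']
print(df[skew_cols].skew())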
Create a list for categorical columns in the dataset and check the count of each category¶
#cat_cols contain categorical variables
cat_cols=['Education', 'Marital_Status', 'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp1',
'AcceptedCmp2', 'AcceptedCmp6', 'Complain', 'Country']
# Printing the count of each unique value in each column
for column in cat_cols:
print(df[column].value_counts(normalize=True))
print("-" * 40)
Graduation    0.503125
PhD           0.216964
Master        0.165179
2n Cycle      0.090625
Basic         0.024107
Name: Education, dtype: float64
----------------------------------------
Married     0.385714
Together    0.258929
Single      0.214286
Divorced    0.103571
Widow       0.034375
Alone       0.001339
YOLO        0.000893
Absurd      0.000893
Name: Marital_Status, dtype: float64
----------------------------------------
0    0.927232
1    0.072768
Name: AcceptedCmp3, dtype: float64
----------------------------------------
0    0.935714
1    0.064286
Name: AcceptedCmp4, dtype: float64
----------------------------------------
0    0.925446
1    0.074554
Name: AcceptedCmp5, dtype: float64
----------------------------------------
0    0.927232
1    0.072768
Name: AcceptedCmp1, dtype: float64
----------------------------------------
0    0.850893
1    0.149107
Name: AcceptedCmp2, dtype: float64
----------------------------------------
0    0.986607
1    0.013393
Name: AcceptedCmp6, dtype: float64
----------------------------------------
0    0.990625
1    0.009375
Name: Complain, dtype: float64
----------------------------------------
SP     0.488839
SA     0.150446
CA     0.119643
AUS    0.071429
IND    0.066071
GER    0.053571
US     0.048661
ME     0.001339
Name: Country, dtype: float64
----------------------------------------
Observations:
1️⃣ Education Categories Simplification: "2n Cycle" and "Master" represent the same education level, so they should be combined into a single category.
2️⃣ Marital Status Consolidation: The "Alone" category can be merged with "Single" to streamline the data.
3️⃣ Unclear Categories in Marital Status: The meanings of "Absurd" and "YOLO" are not clear from the data. These can be grouped into a new category labeled "Others" for better analysis.
4️⃣ Customer Complaints: Only 21 customers have filed complaints in the last two years, indicating a low complaint rate.
5️⃣ Customer Distribution by Country: The majority of customers are from Spain, while Mexico has the lowest representation.
6️⃣ Most Common Educational Level: The most frequent education level among customers is Graduation.
7️⃣ Most Common Marital Status: The majority of customers are Married.
Data Preprocessing and Exploratory Data Analysis¶
- Fixing the categories
- Creating new columns as the total amount spent, total purchase made, total kids at home, and total accepted campaigns
- Dealing with missing values and outliers
- Extract key insights from the data
Replacing the "2n Cycle" category with "Master" in Education and "Alone" with "Single" in Marital_Status and "Absurd" and "YOLO" categories with "Others" in Marital_Status¶
# Replacing 2n Cycle with Master
df["Education"].replace("2n Cycle", "Master", inplace=True)
# Replacing Alone with Single
df["Marital_Status"].replace(["Alone",], "Single", inplace=True)
# Replacing YOLO, Absurd with Others
df['Marital_Status'].replace(["Absurd", "YOLO"], "Others", inplace=True)
We have fixed the categories in Marital_Status. Check the distribution of counts across its categories.
df.Marital_Status.value_counts()
Married     864
Together    580
Single      483
Divorced    232
Widow        77
Others        4
Name: Marital_Status, dtype: int64
Observation:
The majority of customers fall into the Married category, while the Others category has only four observations.
Creating new features from the existing features¶
# creating new features to get an overall picture of a customer: how much they have spent,
# how many children they have, total campaigns accepted, etc.
# total spending by a customer
spending_col = [col for col in df.columns if 'Mnt' in col]
df['Total_Spending'] = df[spending_col].sum(axis = 1)
#total purchases made by a customer
platform_col = [col for col in df.columns if 'Purchases' in col]
df['Total_Purchase'] = df[platform_col].sum(axis = 1)
# total no. of children
df['NumberofChildren'] = df['Kidhome'] + df['Teenhome']
# Total no. of campaign accepted by a customer
campaigns_cols = [col for col in df.columns if 'Cmp' in col]
df['TotalCampaignsAcc'] = df[campaigns_cols].sum(axis=1)
Check outliers for the new variables Total_Spending and Total_Purchase, and also analyze the Year_Birth column, which we observed above has a minimum value of 1893.¶
# Plotting boxplot for Year_Birth, Total_Spending, Total_Purchase
cols=['Year_Birth','Total_Spending','Total_Purchase']
for i in cols:
sns.boxplot(x=df[i])
plt.show()
Observations:
1️⃣ Potential Data Entry Errors in Birth Year: Some users have a birth year reported as ≤1900, while the current year is 2021. It is highly unlikely that these individuals are still alive, suggesting a possible data entry or reporting error.
2️⃣ Presence of Outliers in Total Spending and Total Purchases: The dataset contains outliers in both total spending and total purchases. However, these observations are close to the upper whisker of the distribution, and extreme values are expected for variables like total spending.
3️⃣ Decision on Outlier Treatment: Since some extreme values are naturally expected, these outliers do not require immediate removal or treatment. Keeping them in the dataset ensures that we retain high-value customers and significant purchasing patterns (a quick count of points above the upper whisker is sketched below).
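As referenced above, here is a minimal sketch (an addition, not part of the original analysis) that counts how many observations fall above the upper whisker (Q3 + 1.5*IQR) for the two new totals, to gauge how many points the boxplots flag.
# Counting observations above the upper whisker (Q3 + 1.5*IQR) for the new totals
for col in ['Total_Spending', 'Total_Purchase']:
    q1, q3 = df[col].quantile([0.25, 0.75])
    upper = q3 + 1.5 * (q3 - q1)
    print(col, ':', (df[col] > upper).sum(), 'observations above', round(upper, 1))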
Check the number of observations for which the birth year is less than 1900.
df[df['Year_Birth'] < 1900]
| ID | Year_Birth | Education | Marital_Status | Income | Kidhome | Teenhome | Recency | MntWines | MntFruits | ... | AcceptedCmp3 | AcceptedCmp4 | AcceptedCmp5 | AcceptedCmp6 | Complain | Country | Total_Spending | Total_Purchase | NumberofChildren | TotalCampaignsAcc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
513 | 11004 | 1893 | Master | Single | 60182.0 | 0 | 1 | 23 | 8 | 0 | ... | 0 | 0 | 0 | 0 | 0 | SA | 22 | 4 | 1 | 0 |
827 | 1150 | 1899 | PhD | Together | 83532.0 | 0 | 0 | 36 | 755 | 144 | ... | 1 | 0 | 0 | 0 | 0 | SP | 1853 | 15 | 0 | 1 |
2 rows × 31 columns
Observation:
- There are only 2 observations for which birth year is less than 1900. We can drop these observations.
#keeping data for customers having birth year >1900
df = df[df['Year_Birth'] > 1900]
Check the outliers and impute the missing values for the Income variable¶
#plotting Boxplot for income
plt.figure(figsize=(10,4))
sns.boxplot(x=df['Income'])
plt.title('Income boxplot', size=16)
plt.show()
Observations:
- The boxplot indicates the presence of outliers in the Income variable.
- To quantify them, we calculate the upper whisker value and determine how many observations exceed this threshold.
# Calculating the upper whisker for the Income variable
Q1 = df['Income'].quantile(0.25)   # First quartile
Q3 = df['Income'].quantile(0.75)   # Third quartile
IQR = Q3 - Q1                      # Interquartile range
upper_whisker = Q3 + 1.5 * IQR     # Upper whisker
print(upper_whisker)
118348.5
#Checking the observations marked as outliers
df[df.Income>upper_whisker]
| ID | Year_Birth | Education | Marital_Status | Income | Kidhome | Teenhome | Recency | MntWines | MntFruits | ... | AcceptedCmp3 | AcceptedCmp4 | AcceptedCmp5 | AcceptedCmp6 | Complain | Country | Total_Spending | Total_Purchase | NumberofChildren | TotalCampaignsAcc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
325 | 4931 | 1977 | Graduation | Together | 157146.0 | 0 | 0 | 13 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | SA | 1730 | 28 | 0 | 0 |
497 | 1501 | 1982 | PhD | Married | 160803.0 | 0 | 0 | 21 | 55 | 16 | ... | 0 | 0 | 0 | 0 | 0 | US | 1717 | 44 | 0 | 0 |
527 | 9432 | 1977 | Graduation | Together | 666666.0 | 1 | 0 | 23 | 9 | 14 | ... | 0 | 0 | 0 | 0 | 0 | SA | 62 | 11 | 1 | 0 |
731 | 1503 | 1976 | PhD | Together | 162397.0 | 1 | 1 | 31 | 85 | 1 | ... | 0 | 0 | 0 | 0 | 0 | SP | 107 | 1 | 2 | 0 |
853 | 5336 | 1971 | Master | Together | 157733.0 | 1 | 0 | 37 | 39 | 1 | ... | 0 | 0 | 0 | 0 | 0 | SP | 59 | 2 | 1 | 0 |
1826 | 5555 | 1975 | Graduation | Divorced | 153924.0 | 0 | 0 | 81 | 1 | 1 | ... | 0 | 0 | 0 | 0 | 0 | SP | 6 | 0 | 0 | 0 |
1925 | 11181 | 1949 | PhD | Married | 156924.0 | 0 | 0 | 85 | 2 | 1 | ... | 0 | 0 | 0 | 0 | 0 | CA | 8 | 0 | 0 | 0 |
2204 | 8475 | 1973 | PhD | Married | 157243.0 | 0 | 1 | 98 | 20 | 2 | ... | 0 | 0 | 0 | 0 | 0 | IND | 1608 | 37 | 1 | 0 |
8 rows × 31 columns
Observations:
- Only 8 customers have an income above the upper whisker (outliers).
- Of these, only 3 (ID: 4931, 1501, 8475) made more than 11 purchases in the last 2 years.
- The remaining 5 outliers have minimal total spending, indicating that higher income does not always correlate with higher purchases.
Compare the summary statistics for these observations with observations on the other side of the upper whisker.
#Checking the summary statistics for observations marked as outliers
df[df.Income>upper_whisker].describe().T
| count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
ID | 8.0 | 5989.250 | 3525.251308 | 1501.0 | 4074.00 | 5445.5 | 8714.25 | 11181.0 |
Year_Birth | 8.0 | 1972.500 | 10.028531 | 1949.0 | 1972.50 | 1975.5 | 1977.00 | 1982.0 |
Income | 8.0 | 221604.500 | 179850.404431 | 153924.0 | 157090.50 | 157488.0 | 161201.50 | 666666.0 |
Kidhome | 8.0 | 0.375 | 0.517549 | 0.0 | 0.00 | 0.0 | 1.00 | 1.0 |
Teenhome | 8.0 | 0.250 | 0.462910 | 0.0 | 0.00 | 0.0 | 0.25 | 1.0 |
Recency | 8.0 | 48.625 | 33.687376 | 13.0 | 22.50 | 34.0 | 82.00 | 98.0 |
MntWines | 8.0 | 26.500 | 30.798887 | 1.0 | 1.75 | 14.5 | 43.00 | 85.0 |
MntFruits | 8.0 | 4.500 | 6.524678 | 0.0 | 1.00 | 1.0 | 5.00 | 16.0 |
MntMeatProducts | 8.0 | 621.875 | 846.511402 | 1.0 | 7.25 | 17.0 | 1592.00 | 1725.0 |
MntFishProducts | 8.0 | 4.250 | 5.650537 | 1.0 | 1.00 | 2.0 | 3.50 | 17.0 |
MntSweetProducts | 8.0 | 1.250 | 0.886405 | 0.0 | 1.00 | 1.0 | 1.25 | 3.0 |
MntGoldProds | 8.0 | 3.750 | 4.131759 | 1.0 | 1.00 | 1.5 | 5.00 | 12.0 |
NumDealsPurchases | 8.0 | 4.250 | 6.777062 | 0.0 | 0.00 | 0.0 | 6.75 | 15.0 |
NumWebPurchases | 8.0 | 0.500 | 1.069045 | 0.0 | 0.00 | 0.0 | 0.25 | 3.0 |
NumCatalogPurchases | 8.0 | 9.875 | 13.484780 | 0.0 | 0.00 | 0.5 | 23.50 | 28.0 |
NumStorePurchases | 8.0 | 0.750 | 1.035098 | 0.0 | 0.00 | 0.5 | 1.00 | 3.0 |
NumWebVisitsMonth | 8.0 | 1.125 | 2.031010 | 0.0 | 0.00 | 0.5 | 1.00 | 6.0 |
AcceptedCmp1 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
AcceptedCmp2 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
AcceptedCmp3 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
AcceptedCmp4 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
AcceptedCmp5 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
AcceptedCmp6 | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
Complain | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
Total_Spending | 8.0 | 662.125 | 848.380884 | 6.0 | 46.25 | 84.5 | 1635.25 | 1730.0 |
Total_Purchase | 8.0 | 15.375 | 18.220377 | 0.0 | 0.75 | 6.5 | 30.25 | 44.0 |
NumberofChildren | 8.0 | 0.625 | 0.744024 | 0.0 | 0.00 | 0.5 | 1.00 | 2.0 |
TotalCampaignsAcc | 8.0 | 0.000 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
#Checking the summary statistics for observations not marked as outliers
df[df.Income<upper_whisker].describe().T
| count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
ID | 2205.0 | 5585.439456 | 3247.546423 | 0.0 | 2815.0 | 5455.0 | 8418.0 | 11191.0 |
Year_Birth | 2205.0 | 1968.904308 | 11.705801 | 1940.0 | 1959.0 | 1970.0 | 1977.0 | 1996.0 |
Income | 2205.0 | 51622.094785 | 20713.063826 | 1730.0 | 35196.0 | 51287.0 | 68281.0 | 113734.0 |
Kidhome | 2205.0 | 0.442177 | 0.537132 | 0.0 | 0.0 | 0.0 | 1.0 | 2.0 |
Teenhome | 2205.0 | 0.506576 | 0.544380 | 0.0 | 0.0 | 0.0 | 1.0 | 2.0 |
Recency | 2205.0 | 49.009070 | 28.932111 | 0.0 | 24.0 | 49.0 | 74.0 | 99.0 |
MntWines | 2205.0 | 306.164626 | 337.493839 | 0.0 | 24.0 | 178.0 | 507.0 | 1493.0 |
MntFruits | 2205.0 | 26.403175 | 39.784484 | 0.0 | 2.0 | 8.0 | 33.0 | 199.0 |
MntMeatProducts | 2205.0 | 165.312018 | 217.784507 | 0.0 | 16.0 | 68.0 | 232.0 | 1725.0 |
MntFishProducts | 2205.0 | 37.756463 | 54.824635 | 0.0 | 3.0 | 12.0 | 50.0 | 259.0 |
MntSweetProducts | 2205.0 | 27.128345 | 41.130468 | 0.0 | 1.0 | 8.0 | 34.0 | 262.0 |
MntGoldProds | 2205.0 | 44.057143 | 51.736211 | 0.0 | 9.0 | 25.0 | 56.0 | 321.0 |
NumDealsPurchases | 2205.0 | 2.318367 | 1.886107 | 0.0 | 1.0 | 2.0 | 3.0 | 15.0 |
NumWebPurchases | 2205.0 | 4.100680 | 2.737424 | 0.0 | 2.0 | 4.0 | 6.0 | 27.0 |
NumCatalogPurchases | 2205.0 | 2.645351 | 2.798647 | 0.0 | 0.0 | 2.0 | 4.0 | 28.0 |
NumStorePurchases | 2205.0 | 5.823583 | 3.241796 | 0.0 | 3.0 | 5.0 | 8.0 | 13.0 |
NumWebVisitsMonth | 2205.0 | 5.336961 | 2.413535 | 0.0 | 3.0 | 6.0 | 7.0 | 20.0 |
AcceptedCmp1 | 2205.0 | 0.073923 | 0.261705 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
AcceptedCmp2 | 2205.0 | 0.151020 | 0.358150 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
AcceptedCmp3 | 2205.0 | 0.073016 | 0.260222 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
AcceptedCmp4 | 2205.0 | 0.064399 | 0.245518 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
AcceptedCmp5 | 2205.0 | 0.074376 | 0.262442 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
AcceptedCmp6 | 2205.0 | 0.013605 | 0.115872 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
Complain | 2205.0 | 0.009070 | 0.094827 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
Total_Spending | 2205.0 | 606.821769 | 601.675284 | 5.0 | 69.0 | 397.0 | 1047.0 | 2525.0 |
Total_Purchase | 2205.0 | 14.887982 | 7.615277 | 0.0 | 8.0 | 15.0 | 21.0 | 43.0 |
NumberofChildren | 2205.0 | 0.948753 | 0.749231 | 0.0 | 0.0 | 1.0 | 1.0 | 3.0 |
TotalCampaignsAcc | 2205.0 | 0.450340 | 0.894075 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 |
Observations:
1️⃣ No Engagement with Campaigns or Complaints: Outliers did not accept any campaigns or submit complaints in the last two years.
2️⃣ Lower Mean Expenditure Across Products: Outliers spend less on all products except meat products.
3️⃣ Distinct Purchase Behavior: They prefer catalog purchases but make very few web purchases.
4️⃣ Recommendation: Remove Outliers: The five observations at indices [527, 731, 853, 1826, 1925] provide no value to the analysis and should be dropped.
#Dropping 5 observations at indices 527, 731, 853, 1826, 1925
df.drop(index=[527, 731, 853, 1826, 1925], inplace=True)
Check the distribution for Income¶
#plotting displot for income
sns.displot(df['Income'], kde=True, height=5, aspect=2)
plt.title('Income distribution', size=16, )
plt.ylabel('count');
Observations:
- After treating outliers, the income variable follows a nearly normal distribution, with only a few extreme values on the right. Given its slight right skew, missing values should be imputed using the median rather than the mean to prevent distortion from outliers.
#filling null values with median
df['Income'].fillna(df.Income.median(), inplace=True)
Analyzing all the campaigns¶
Find out the acceptance rate for each campaign¶
# Plotting the % acceptance for every campaign
Camp_cols=['AcceptedCmp1', 'AcceptedCmp2', 'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp6']
success_campaign=(df[Camp_cols].sum()/df.shape[0])*100
# plot
success_campaign.plot(kind='bar', figsize=(6,6))
plt.ylabel("Percentage")
plt.show()
Observations:¶
- Campaign 2 had the highest acceptance rate, while Campaign 6 had the lowest, indicating a significant difference in strategy effectiveness.
- Campaigns 1, 3, 4, and 5 had similar acceptance rates, suggesting a consistent but moderate impact.
- Further analysis is needed to understand why Campaign 2 was successful and why Campaign 6 underperformed, so strategies can be optimized across all campaigns.
Analyze what kind of customers are accepting campaigns¶
plt.figure(figsize=(8,8))
sns.swarmplot(x='TotalCampaignsAcc', y='Income', data=df)
plt.show()
Observations:
- The higher the income, the higher the number of campaigns accepted.
# Let's see the mean income of customers
df.Income.mean()
51762.59811827957
- The average customer income is approximately 52K. To analyze campaign effectiveness, we can segment customers into two income groups: those earning above 52K and those earning below 52K. This will help us compare campaign acceptance rates across different income levels and identify any trends or anomalies in customer behavior.
# making dataframes of customers having income <52k and >52K
df1=df[df.Income<52000]
df2=df[df.Income>52000]
Camp_cols=['AcceptedCmp1', 'AcceptedCmp2', 'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp6']
#Calculating success rate of each campaign for both segments
success_campaign1=pd.DataFrame((df1[Camp_cols].sum()/df1.shape[0])*100, columns=['Income <52K'])
success_campaign2=pd.DataFrame((df2[Camp_cols].sum()/df2.shape[0])*100, columns=['Income >52K'])
new_df=pd.concat([success_campaign1, success_campaign2], axis=1)
# plot
plt.figure(figsize=(8,8))
sns.lineplot(data=new_df)
plt.title("Percentage Acceptance of each campaign")
plt.ylabel("Percentage Acceptance of a campaign")
plt.show()
Observations:
- Campaign 1: Customers earning below 52K had a higher acceptance rate than those earning above 52K, an anomaly that requires further investigation (a quick per-quartile check is sketched below).
- Campaigns 2 to 6: Customers with higher income (above 52K) accepted more offers, while those earning below 52K had lower acceptance rates.
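As a first step toward investigating the Campaign 1 anomaly, here is a minimal sketch (an assumption on my part, not in the original notebook): splitting Income into quartiles instead of two groups shows where Campaign 1 acceptance is concentrated.
# Acceptance rate (%) of Campaign 1 per income quartile
income_quartile = pd.qcut(df['Income'], q=4, labels=['Q1 (lowest)', 'Q2', 'Q3', 'Q4 (highest)'])
print(df.groupby(income_quartile)['AcceptedCmp1'].mean() * 100)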
Find out who accepted the last campaign and what the reason could be
df[df['AcceptedCmp6']==1].shape
(30, 31)
- Only 30 customers accepted the last campaign.
- Next Step: Analyze whether these customers are new or have previously accepted other campaigns.
grouped2 = df.groupby('AcceptedCmp6')['TotalCampaignsAcc'].mean()
grouped2
AcceptedCmp6
0    0.404632
1    3.633333
Name: TotalCampaignsAcc, dtype: float64
Observations:
- Customers who accepted the last campaign have an average TotalCampaignsAcc of ~3.6 (out of six campaigns), compared to ~0.4 for those who did not. This suggests that these 30 customers are highly loyal, having engaged with most of the earlier campaigns as well (a breakdown by campaign is sketched below).
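As a quick follow-up sketch (an addition, not part of the original notebook), the counts below break down which earlier campaigns these 30 customers had accepted.
# Among the 30 customers who accepted the last campaign,
# how many also accepted each of the earlier campaigns
last_campaign_acceptors = df[df['AcceptedCmp6'] == 1]
print(last_campaign_acceptors[Camp_cols[:-1]].sum())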
It could be that different campaigns are focused on different sets of products. Check if the product preference of those who accepted a campaign differs from those who didn't - using the amount spent and the number of purchases¶
Define a function that takes the column name of a product as input and generates a plot of the average amount spent on that product, split by whether each campaign was accepted
def amount_per_campaign(columns_name):
    # Average amount spent on the given product, split by acceptance (0/1) of each campaign
    means = [df.groupby(camp)[columns_name].mean().to_frame().T for camp in Camp_cols]
    pd.concat(means, axis=0).set_index([Camp_cols]).plot(kind='line', figsize=(8,8))
    plt.ylabel('Average amount spent on ' + columns_name)
    plt.show()
Use the function defined above to generate plots for the different products¶
#here is an example showing how to use this function on the column MntWines
amount_per_campaign('MntWines')
Observations:
- Customers who accepted Campaigns 3, 4, 5, and 6 exhibited a high average spending on wine, indicating a strong preference for this product. This trend suggests that wine promotions may be a key driver of campaign success and should be leveraged in future marketing strategies.
#meat products
#call the function amount_per_campaign for MntMeatProducts
amount_per_campaign('MntMeatProducts')
# Fruit products
#call the function amount_per_campaign for MntFruits
amount_per_campaign('MntFruits')
# gold products
#call the function amount_per_campaign for MntGoldProds
amount_per_campaign('MntGoldProds')
#sweet products
#call the function amount_per_campaign for MntSweetProducts
amount_per_campaign('MntSweetProducts')
Observations¶
1️⃣ Meat Products – Customers who accepted Campaigns 2, 3, and 4 spent significantly more on meat compared to those in Campaigns 5 and 6.
2️⃣ Fruit Products – Campaign 6 had a higher average spending among non-acceptors than acceptors, while Campaigns 2, 3, and 4 drove higher fruit purchases among acceptors.
3️⃣ Gold Products – The difference in spending between acceptors and non-acceptors was high for Campaigns 1, 2, 3, 4, and 6, but low for Campaign 5.
4️⃣ Sweet Products – Campaign 1 showed minimal difference between accepted and rejected purchases. However, spending increased for acceptors in Campaigns 2, 3, and 4, while Campaign 5 saw a sharp decline in accepted purchases.
5️⃣ Overall Trend – Customers who accepted campaigns consistently spent more across all product categories compared to those who rejected them.
Check the relationship of campaigns with different purchasing channels.¶
We have defined a function that takes the column name of a purchase channel as input and generates a plot of the average number of purchases through that channel, split by whether each campaign was accepted
def Purchases_per_campaign(columns_name):
    # Average value of the given channel/metric, split by acceptance (0/1) of each campaign
    means = [df.groupby(camp)[columns_name].mean().to_frame().T for camp in Camp_cols]
    pd.concat(means, axis=0).set_index([Camp_cols]).plot(kind='line', figsize=(8,8))
    plt.ylabel('Average ' + columns_name)
    plt.show()
#here is an example showing how to use this function on the column NumDealsPurchases
Purchases_per_campaign('NumDealsPurchases')
Observations:
- For customers accepting Campaigns 3, 4, and 6, the average number of deal purchases is quite low.
# store purchases
Purchases_per_campaign('NumStorePurchases')
# catalog purchases
Purchases_per_campaign('NumCatalogPurchases')
# web purchases
Purchases_per_campaign('NumWebPurchases')
Observations:¶
1️⃣ Store Purchases: Customers who accepted Campaign 1 had a lower average number of store purchases compared to those who rejected it. This trend differs from Campaigns 2, 3, 4, 5, and 6, where accepted customers had higher store purchases.
2️⃣ Catalog Purchases: While rejected campaigns had lower catalog purchases overall, accepted Campaigns 3 and 4 had a higher average number of catalog purchases than Campaigns 1, 2, 5, and 6.
3️⃣ Web Purchases: There was a steady increase in web purchases from Campaign 1 to 4, followed by a gradual decline in Campaigns 5 and 6.
4️⃣ Spending on Products:
- Gold Products: The difference in spending between accepted and rejected campaigns was high for Campaigns 1, 2, 3, 4, and 6, but low for Campaign 5.
- Sweet Products: Campaign 1 had nearly identical spending for accepted and rejected offers. Spending increased in Campaigns 2, 3, and 4 but sharply declined in Campaign 5 despite acceptance.
- Overall Spending: Customers who rejected campaigns consistently spent less across all product categories compared to those who accepted them.
#Recency
Purchases_per_campaign('Recency')
Observations:
- Customers who accepted Campaign 2 had a low recency, meaning they had recently made a purchase before engaging with the campaign. This suggests that Campaign 2 was more effective for active, recently engaged customers, highlighting the importance of timing and customer activity in campaign success.
Check the relationship of campaigns with different categorical variables¶
Check the percentage acceptance of each campaign with respect to each category of the categorical variable. The percentage acceptance is calculated as the number of customers in that category who accepted the campaign divided by the total number of customers in that category.
def Cat_Campaign_Relation(df, column_name):
    # Acceptance rate of each campaign within each category of the given column
    df_new = df.groupby(column_name)[Camp_cols].mean().T
    plt.figure(figsize=(8,8))
    sns.lineplot(data=df_new, markers=True, linewidth=2)
    plt.ylabel('Percentage Acceptance')
    plt.show()
#here is an example showing how to use this function on the column Education
Cat_Campaign_Relation(df, 'Education')
Observations:
- Campaign 2 had strong engagement, with over 20% of Ph.D. holders accepting the offer.
- Customers with basic education only accepted Campaigns 1 and 2, showing limited engagement with other campaigns (a quick count for the Basic group is sketched below).
- All other education levels followed a consistent trend in campaign acceptance, except for those with basic education.
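To double-check the claim about customers with basic education, here is a minimal sketch (an addition, not part of the original notebook) counting raw acceptances per campaign within that group.
# Raw number of acceptances per campaign among customers with Basic education
print(df[df['Education'] == 'Basic'][Camp_cols].sum())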
#NumberofChildren
#call the function Cat_Campaign_Relation for NumberofChildren
Cat_Campaign_Relation(df, 'NumberofChildren')
# Let's filter out the observations in the 'Others' category as there are only 4 such observations
df_rest=df[df.Marital_Status!='Others']
#call the function Cat_Campaign_Relation for Marital_Status with dataframe df_rest
Cat_Campaign_Relation(df_rest, 'Marital_Status')
# Let's filter out the observations for the 'ME' country as there are only 3 such observations
df_not_mexico=df[df.Country!='ME']
# Plot: percentage acceptance of each campaign by country
plt.figure(figsize=(8,8))
country_rates = (df_not_mexico.groupby('Country')[Camp_cols].sum()
                 / df_not_mexico.groupby('Country')[Camp_cols].count()) * 100
sns.heatmap(country_rates, annot=True, fmt='0.2f', cmap="YlGnBu")
plt.show()
Observation:¶
📌 Number of Children: Over 25% of customers with no children accepted Campaign 2. For Campaign 1, acceptance rates are similar for customers with 0 to 2 children. Customers with 1 to 3 children show a consistent acceptance trend across campaigns.
📌 Marital Status: Over 20% of widowed, divorced, and single customers accepted Campaign 2. Campaign 6 had the lowest acceptance rate across all marital statuses. All campaigns follow a similar acceptance trend regardless of marital status.
📌 Country-Based Trends: Customers from all countries (except India) accepted Campaign 2 the most. Campaign 6 had the lowest acceptance percentage across all countries.
Check the product preferences by customers¶
# creating a list containing the names of all product columns
mnt_cols = [col for col in df.columns if 'Mnt' in col]
spending=df[mnt_cols].mean(axis=0)
spending.plot(kind='bar', figsize=(10,5))
plt.ylabel("Average spend by customers")
plt.show()
Observations:
📌 Spending Trends: Over the last two years, customers spent the most on wines, followed by meat products
Check if the product preferences are similar for different types of customers by calculating the share of spending on each product within each category, relative to the total spending of customers in that category.
def amount_per_category(df, column_name):
df_new1=((df.groupby([column_name]).sum()[mnt_cols].T)/df.groupby([column_name]).sum()['Total_Spending'])
plt.figure(figsize=(10,8))
sns.heatmap(df_new1.T, annot=True, cmap="YlGnBu")
plt.show()
# plot showing the percentage of total spending of different products by a group of customers having the same education level
amount_per_category(df, 'Education')
Observations:¶
📌 Higher Education → Higher Spending on Wine & Meat: PhD holders allocate ~60% of their total spending to wine and ~25% to meat. Graduates and Master's degree holders spend ~45-50% on wine and ~27-29% on meat. Customers with a Master's or PhD dedicate ~80% of their spending to wine and meat combined.
📌 Basic Education → Higher Spending on Other Products: Customers with basic education prefer fruits, fish, sweets, and gold products over wine and meat.
#call the function amount_per_category for Marital_Status with dataframe df_rest
amount_per_category( df_rest, 'Marital_Status')
#call the function amount_per_category for Country with dataframe df_not_mexico
amount_per_category( df_not_mexico, 'Country')
Observations:¶
📌 Marital Status: Across all marital statuses, customers spent the most on wine (48% to 53% of their total spending), followed by meat products. Sweet products had the lowest spending across all categories in the last two years.
📌 Country: Spanish customers devoted a larger share of their spending to wine than customers in other countries. Across all seven countries, wine had the highest spending, while sweet products consistently had the lowest.
Check different channel performances¶
Calculate the percentage of purchases for all the channels.
# list of cols for channels
channel_cols = [col for col in df.columns if 'Purchases' in col]
# making a dataframe with the share of total purchases coming from each channel
channels = pd.DataFrame(df[channel_cols].sum()/df.Total_Purchase.sum(), columns=['NumberofPurchases'])
# plot
channels.plot(kind='bar', figsize=(6,6))
plt.ylabel("Percentage Purchases")
plt.show()
Observations:
- Most purchases occur in stores, followed by web purchases.
- Deal and catalog purchases remain low, requiring strategy improvements.
Check how the number of purchases via different channels varies across income bins.
#Binning the income column
df['income_cat']=pd.qcut(df.Income, q=[0, 0.25, 0.50, 0.75, 1], labels=['low', 'medium', 'high', 'very_high'])
group = df.groupby('income_cat')[channel_cols].sum()
(group/group.sum()).plot(kind='bar', figsize=(8,8))
plt.ylabel("Percentage Purchases")
plt.show()
Customer Income & Purchase Behavior Insights¶
- High-income customers had the highest purchase percentage across all channels.
- Low-income customers had the lowest purchase percentage across all channels.
- High-income customers distributed their purchases more evenly across different channels, showing a balanced spending pattern.
Visualize the correlation between purchases from different channels and the income of the customer.¶
corr=df[['Income', 'NumDealsPurchases', 'NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases' ]].corr()
sns.heatmap(corr, annot=True)
plt.show()
Observations:¶
The plot indicates a moderate positive correlation (~0.59) between Income and NumCatalogPurchases, followed by ~0.53 between Income and NumStorePurchases. There is also a negative correlation between Income and NumDealsPurchases, suggesting that higher-income customers prefer catalog and in-store purchases over discounted deals.
From our analysis, the three key factors influencing campaign acceptance are: 1️⃣ Income Level 2️⃣ Number of Children 3️⃣ Spending on Wine
To create a new customer profile, we can segment customers based on these attributes and predict their likelihood of campaign acceptance (a simple predictive sketch follows the profile comparison below).
df3=df[df.Income>52000]
df4=df3[df3.MntWines>df3.MntWines.mean()]
new_profile=df4[df4.NumberofChildren==0]
# Calculating the acceptance rate of each campaign: overall vs. the new customer profile
success_campaign3=pd.DataFrame(success_campaign, columns=['Overall Acceptance'])
success_campaign4=pd.DataFrame((new_profile[Camp_cols].sum()/new_profile.shape[0])*100, columns=['New Customer Profile Acceptance '])
# plot
pd.concat([success_campaign3, success_campaign4], axis=1).plot(kind='bar', figsize=(8,6))
plt.title("")
plt.ylabel("")
plt.show()
Observations:
- The orange bars in the plot indicate that the acceptance rate would have been higher for the new customer profile, i.e., customers with income greater than the mean, no kids at home, and above-average spending on wine.
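As referenced earlier, here is a minimal predictive sketch (an illustration, not part of the original analysis); it assumes scikit-learn is available and treats "accepted at least one campaign" as the target.
# Illustrative only: logistic regression estimating the probability that a customer
# accepts at least one campaign, using the three attributes identified above
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X = df[['Income', 'NumberofChildren', 'MntWines']]
y = (df['TotalCampaignsAcc'] > 0).astype(int)  # 1 if the customer accepted any campaign

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
print("Test accuracy:", round(model.score(X_test, y_test), 3))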
Conclusion¶
1️⃣ Spending Trends Across Countries: Customers from different countries consistently spend more on wine than any other product.
2️⃣ Campaign Performance: Campaign 2 had the highest acceptance rate, indicating a successful strategy. Campaign 6 had the lowest acceptance rate, requiring further investigation. The effectiveness of Campaign 2's strategy should be analyzed to see if it can be applied to improve acceptance rates for Campaigns 1, 3, 4, 5, and 6. A detailed review of Campaign 6 is necessary to understand why so few customers accepted its offers.
3️⃣ Income-Based Anomalies in Campaign Acceptance: An anomaly was observed in Campaign 1, where customers earning below 52K had a higher acceptance rate than those earning above 52K. This deviation should be investigated to understand why lower-income earners responded better to Campaign 1 compared to other campaigns.
4️⃣ Campaign 5 (Low Spending Variation): For Campaign 5, the difference in average spending between accepted and rejected offers was minimal across all products. This suggests that Campaign 5 may not be significantly influencing purchasing decisions and should either be optimized or discontinued.
5️⃣ Impact of Campaign Strategies on Spending: Campaigns 3 and 4 led to high average spending across all products. The successful strategies from Campaigns 3 and 4 should be evaluated and potentially adapted to Campaigns 1, 2, 5, and 6 to drive higher spending.
6️⃣ Effect of Children on Campaign Acceptance: Customers without children accepted more offers than those with at least one child. New targeted campaign strategies should be developed to improve engagement among customers with children.
7️⃣ Campaign Effectiveness by Marital Status and Country: Campaign 2 was effective across different marital statuses and countries, while Campaign 6 was the least effective. These trends should be further analyzed to enhance future campaign personalization.
8️⃣ Customer Behavior Based on Income & Purchase Channels: A negative correlation between income and purchases made with discounts indicates that lower-income customers rely more on discounts. A positive correlation between income and catalog purchases suggests that higher-income customers prefer catalog-based orders for mail delivery.
Recommendations¶
📌 1. Investigate the Success of Campaign 2: Identify key drivers behind its high acceptance rate and apply those insights to improve Campaigns 1, 3, 4, 5, and 6.
📌 2. Address the Poor Performance of Campaign 6: Conduct a detailed analysis to understand why it had the lowest acceptance rate. Adjust marketing strategies, promotional timing, or product offerings accordingly.
📌 3. Investigate the Income-Based Anomaly in Campaign 1: Identify factors influencing higher acceptance among customers earning below 52K. Consider whether discounts, messaging, or specific product offers played a role.
📌 4. Evaluate the Viability of Campaign 5: Since spending did not significantly differ between accepted and rejected offers, determine whether Campaign 5 needs strategy improvements or discontinuation.
📌 5. Adapt Strategies from Campaigns 3 & 4: These campaigns successfully increased average spending. Apply similar pricing, promotions, or engagement techniques to Campaigns 1, 2, 5, and 6.
📌 6. Improve Engagement Among Customers with Children: Develop specialized campaigns tailored to the needs and preferences of families to boost acceptance rates.
📌 7. Align Marketing Strategies with Income-Based Purchase Preferences: Promote discount-based offers to lower-income customers to increase conversions. Enhance catalog-based promotions for higher-income customers to optimize revenue.
📌 8. Identify Factors Behind the Increasing Trend in Purchases from Campaigns 1 to 4: The CMO should analyze what drove the consistent increase in purchases across channels. This insight can be leveraged to further enhance future campaigns.
📌 9. Optimize Product-Specific Campaigns: The CMO should develop strategies to increase sales of sweet, gold, and fish products without negatively impacting other product categories. These strategies should be effective across all customer demographics (countries & marital statuses).