Total Users
671K
Jan–Mar 2017
Churners
16,115
Churn rate 2.4%
Model AUC
0.764
LightGBM · Test set
Features Used
27
12 engineered
Data Collection
1
Raw data from Kaggle — KKBox Taiwan
4 files: members · transactions · user_logs · train. Joined with DuckDB into a single table.
2
Target population: 671K users
Subscribers whose plan expired in April 2017, registered before January 2017.
3
Monthly behavioral structure
3 months of listening data (Jan, Feb, Mar 2017) per user — enabling trend analysis over time.
4
80/20 stratified split
2.4% churn rate preserved identically in train and test sets.
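The stratified split in step 4 can be sketched in plain Python as follows. This is a minimal illustration, not the project's code: the function name, the seed, and the synthetic labels are assumptions. A library routine such as scikit-learn's `train_test_split` with its `stratify` argument would normally do this job, but the source does not say which tool was used.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so each class is split at the same ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed (assumed) for a reproducible split
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels at the dashboard's 2.4% churn rate (not the real data).
labels = [1] * 240 + [0] * 9760
train_idx, test_idx = stratified_split(labels)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
```

Because each class is split separately, both `train_rate` and `test_rate` stay at the overall churn rate, which matters when the positive class is this rare.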
Variables Collected
User Profile
Age · Gender · City · Registration date
Monthly Listening (×3 months)
Active days · Avg listening seconds · Songs completed · Unique songs played
Subscription
Auto-renewal cancelled · Amount paid (TWD)
Engineered Features
Trend %: change Jan→Mar in all listening metrics
Cost/sec: amount paid ÷ total listening seconds
Tenure group: 0–12 / 13–36 / 37+ months
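The three engineered features listed above can be sketched as small Python functions. The function names are hypothetical. The handling of a zero denominator (returning `None`) is an assumption: the source does not say how users with no January activity or no listening time were treated.

```python
def trend_pct(jan, mar):
    """Percent change from January to March; None when January is 0 (assumed)."""
    if jan == 0:
        return None
    return (mar - jan) / jan * 100.0


def cost_per_sec(amount_paid, total_listen_secs):
    """Amount paid (TWD) per second listened; None when nothing was played (assumed)."""
    if total_listen_secs == 0:
        return None
    return amount_paid / total_listen_secs


def tenure_group(months):
    """Bucket tenure in months into the three groups used on the dashboard."""
    if months <= 12:
        return "0-12"
    if months <= 36:
        return "13-36"
    return ">36"


# Illustrative values only, not taken from the dataset.
example_trend = trend_pct(20, 15)          # active days fell from 20 to 15
example_cost = cost_per_sec(149, 298000)   # 149 TWD over 298,000 seconds
example_group = tenure_group(40)
```

Since missing listening data is filled with 0, zero denominators will occur in practice, so whatever rule is chosen for them should be applied consistently in training and scoring.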
Missing Value Strategy
Listening features
Fill with 0 — no data = didn't listen
Auto-renewal
Fill with 0 — no transaction = no action
Gender
Fill with "Unknown" — treated as a separate category
Age (invalid)
Fill with median + add binary flag age_valid
Churn Rate by Gender
Users with unknown gender churn less, possibly because they are passive subscribers who forgot they have a subscription.
Churn Rate by Tenure
Tenure shows minimal variation — not a strong predictor of churn.
Churn Rate by Payment (TWD)
Higher payers churn more — premium users may feel they don't get enough value.
Avg Listening Seconds — Trend Analysis
Key Findings
Surprising: Churners listen MORE than non-churners across all 3 months (6,033 vs 5,686 sec/day in January).
The trend matters: Churners' listening declines more steeply (−3%) than non-churners' (−2.3%). Declining engagement is a better churn signal than the absolute listening level.
Active users churn too: 90% of churners are active listeners who make a conscious decision to leave — not passive dropouts.
Two churner types: Active churners (90%) need a better value proposition. Passive churners (10%) need reminders they have a subscription.
Active Days per Month — Churners vs Non-Churners
Churn Rate
2.4%
16,115 churners
AUC — Test Set
0.764
LightGBM
Recall
64%
Churners caught
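Recall here means the share of actual churners that the model flags. The sketch below shows the calculation on a tiny hand-made example (the labels are invented for illustration) and then estimates what 64% means at test-set scale. The estimate assumes the test set holds 20% of the 16,115 churners and that 64% is a rounded figure, so it is approximate.

```python
def recall(y_true, y_pred):
    """Share of actual positives (churners) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels: 5 churners, 3 of them flagged by the model.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
toy_recall = recall(y_true, y_pred)

# Rough scale of the dashboard figure: 20% of 16,115 churners land in the
# test set, and about 64% of those are caught.
approx_test_churners = 16115 * 0.2
approx_caught = round(0.64 * approx_test_churners)
```

Recall says nothing about how many non-churners were flagged along the way, so it is usually read together with precision or the alert volume.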
Model Comparison — AUC (5-Fold CV)
LightGBM (winner)
0.756
XGBoost
0.731
Random Forest
0.674
Logistic Regression
0.648
All models used 5-fold stratified cross-validation on training data. Test set locked until final evaluation.
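The AUC values above can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below computes AUC directly from that definition. It is an illustration of the metric, not the evaluation code behind the dashboard, and the scores are invented.

```python
def auc(y_true, scores):
    """AUC as the fraction of (churner, non-churner) pairs ranked correctly.

    Ties count as half a win. Runs in O(n_pos * n_neg), which is fine for a
    small illustration but too slow for hundreds of thousands of users.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Two churners and three non-churners with made-up model scores.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
example_auc = auc(y_true, scores)  # 5 of 6 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the gap between 0.648 and 0.756 reflects how much more often the stronger model orders a churner above a non-churner.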
Top Feature Importance
actual_amount_paid
423
active_days_trend_pct
344
age
266
listen_trend_pct
236
songs_completed_trend
222
unique_songs_trend_pct
218
unique_songs_mar
215
cost_per_sec
185
Churn Risk Calculator
38%
Medium Risk
Monitor this user closely
1
Focus retention on high-paying subscribers
Users paying 150–199 TWD churn at 4.7%, nearly double the overall rate of 2.4%. These are the most valuable users to retain. The model ranked actual_amount_paid as its most important feature, which suggests that price is the strongest lever available to the business, although feature importance alone does not show that a lower price would reduce churn.
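Recommended action: Offer loyalty discounts, exclusive content, or priority support at renewal time for users in the 150–199 TWD tier. A 10% renewal discount can pay off when it is targeted at users the model flags as high risk; applied to the whole tier, most of the discount would go to the roughly 95% of users who would have renewed anyway.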
High Priority · Pricing
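Whether a renewal discount pays for itself can be checked with a simple break-even calculation. The sketch below is a one-renewal-cycle model and rests on assumptions not in the source: everyone who is offered the discount takes it, and revenue beyond the next renewal is ignored. The function name is hypothetical, and the 38% figure is used only as an illustrative risk score.

```python
def breakeven_churn(base_churn, discount):
    """Highest post-discount churn rate at which the discount breaks even.

    Revenue per user without the discount is price * (1 - base_churn); with it,
    price * (1 - discount) * (1 - new_churn). Setting the two equal and solving
    for new_churn gives the limit below. Returns None when even zero churn
    could not recover the discount.
    """
    limit = 1.0 - (1.0 - base_churn) / (1.0 - discount)
    return limit if limit >= 0 else None


# Blanket 10% discount for the whole 150-199 TWD tier (4.7% churn).
blanket = breakeven_churn(0.047, 0.10)

# 10% discount offered only to a user scored at 38% churn risk.
targeted = breakeven_churn(0.38, 0.10)
```

Under these assumptions `blanket` is `None`: with only 4.7% of the tier at risk, a 10% price cut for everyone cannot be recovered in one cycle. For the targeted case the discount breaks even if it lowers that user's churn probability from 38% to about 31% or less. Counting several months of retained revenue would loosen both bounds.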
2
Build an early-warning trigger on activity decline
active_days_trend_pct is the second strongest predictor in the model (importance score: 344). A user whose active days drop by more than 20% from January to March is significantly more likely to churn in April. This signal is detectable 30+ days before the subscription expires.
Recommended action: Implement a monthly scoring pipeline that flags users with declining activity. Trigger an automated push notification or email with a special offer when a user crosses the −20% threshold, before the subscription renewal date.
High Priority · Automation
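The −20% trigger described above can be sketched as a small scoring step. The record layout (`user_id`, `active_days_jan`, `active_days_mar`) and the function name are hypothetical, and skipping users with no January activity is an assumption, since their trend is undefined.

```python
def flag_declining_users(users, threshold_pct=-20.0):
    """Return (user_id, trend %) for users whose active days fell past the threshold."""
    flagged = []
    for user in users:
        jan = user["active_days_jan"]
        mar = user["active_days_mar"]
        if jan == 0:
            continue  # assumption: no January baseline, so no trend to evaluate
        trend = (mar - jan) / jan * 100.0
        if trend < threshold_pct:  # a drop of more than 20%
            flagged.append((user["user_id"], round(trend, 1)))
    return flagged


# Made-up users for illustration.
users = [
    {"user_id": "u1", "active_days_jan": 20, "active_days_mar": 15},  # -25%
    {"user_id": "u2", "active_days_jan": 10, "active_days_mar": 9},   # -10%
    {"user_id": "u3", "active_days_jan": 0, "active_days_mar": 5},    # skipped
    {"user_id": "u4", "active_days_jan": 30, "active_days_mar": 12},  # -60%
]
flagged = flag_declining_users(users)
```

In a monthly pipeline the flagged list would feed the notification or offer step. The threshold is a parameter so it can be tuned against the precision and volume of alerts the retention team can handle.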
3
Differentiate retention strategy: active vs passive churners
EDA revealed two distinct churner profiles. Roughly 90% of churners are active listeners who make a conscious decision to leave; on average they listen more than non-churners. Only 8–10% are truly inactive users. A single retention strategy cannot address both groups effectively.
For active churners: Improve the value proposition — better content, new features, or pricing flexibility.
For passive churners: Send a reminder that they have an active subscription, potentially with usage highlights to re-engage them.
Segmentation · Retention
4
Investigate pricing perception at premium tiers
The data shows a counterintuitive pattern: higher-paying users churn more. Users paying 150–199 TWD churn at 4.7% vs 0.8% for users paying 1–99 TWD. This negative correlation between price and retention suggests that premium users may have higher expectations that are not being met by the current product.
Recommended action: Conduct a satisfaction survey targeting the 150–199 TWD segment to identify specific pain points. Use the findings to prioritize product improvements or create a dedicated premium experience that justifies the higher price point.
Research · Pricing