An ML based approach to proactive advertiser churn prevention | by Pinterest Engineering | Pinterest Engineering Blog | May, 2023


Erika Sun | ML Engineer, Advertiser Growth Modeling Team; Ogheneovo Dibie | Engineering Manager, Advertiser Growth Modeling Team

Old, rustic boat sinking in ocean — Photo by Jason Blackeye on Unsplash

In this blog post, we describe a Machine Learning (ML) powered proactive churn prevention solution that was prototyped with our small & medium business (SMB) advertisers. Results from our initial experiment suggest that we can detect future churn with a high degree of predictive power and consequently empower our sales partners in mitigating churn. ML-powered proactive churn prevention can achieve better results than traditional reactive manual effort.

Like many ads-based businesses, at Pinterest, we’re closely focused on minimizing advertiser churn on our platform. Traditionally, advertiser churn is addressed reactively. Specifically, a salesperson reaches out to an advertiser only after they’ve churned. This approach is challenging because it’s extremely difficult to “resurrect” a customer once they leave the platform. To overcome the limits of handling churn reactively, we present an ML-powered proactive approach to advertiser churn reduction. Specifically, we developed a model that can predict the likelihood of advertiser churn in the near future and empowered our sales team with insights from this model to prevent at-risk accounts from churning.

In this blog post, we cover:

  • The churn prediction model’s design and implementation
  • Experimentation in the managed North America SMB segment

Our team built an ML model to predict an advertiser’s churn likelihood in the next 14 days. We use the SHapley Additive exPlanations (SHAP) package to estimate each feature’s contribution to the churn prediction. We provide the model’s churn prediction along with the top contributing features to Sales. Sales uses this information to prioritize their effort to mitigate churn for at-risk advertisers. We’ll talk about each component in more detail in the following subsections.

Model Architecture

The initial version of our model is based on a snapshot* Gradient Boosting Decision Tree (GBDT) architecture. We chose GBDT for the following reasons:

  1. GBDT is a widely used model with good performance on small to medium sized tabular data [1] (our data fits this description).
  2. SHAP works well with GBDT to estimate features’ contributions.
  3. Model feature importance is easy to generate with GBDT.
  4. It can also serve as a good baseline model for future model improvements, e.g. a sequential model.

*Snapshot means we use all the information available up to a given timestamp to predict the churn probability in the next 14 days with respect to that timestamp.

Target Variable

After thorough analysis and consultation on the business needs, we decided to use the following target variable definition (see Figure 1).

7/01 to 07/07 is 7 day spend >0. 07/07 to 07/21 is 14 days. 07/21 to 07/27 is 7 day spend >0 ? If yes, then Label 0: active. If no, then Label 1: churn.
Figure 1: Target Variable Definition

For our use case, we distinguish between an active and a churned advertiser as follows:

  • Active advertiser: spent in the last 7 days
  • Churned advertiser: no spend in the last 7 days

We only predict the churn likelihood for active advertisers. Specifically, we predict whether they will churn in the next 14 days.
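The labeling rule in Figure 1 can be sketched as follows. This is a minimal illustration, not the production definition: the function names, the dictionary-based spend input, the year in the example dates, and the exact window endpoints (a 7-day lookback ending on the snapshot date, and a 7-day label window starting 14 days after it) are assumptions based on our reading of the figure.

```python
from datetime import date, timedelta

def window_spend(daily_spend, start, days):
    """Total spend over `days` consecutive days starting at `start` (inclusive)."""
    return sum(daily_spend.get(start + timedelta(days=i), 0.0) for i in range(days))

def churn_label(daily_spend, snapshot, horizon_days=14, window_days=7):
    """Return 0 (active), 1 (churn), or None if the advertiser is not active at the snapshot.

    Assumed windows (hypothetical reading of Figure 1):
      - lookback: the `window_days` days ending on `snapshot`
      - label:    the `window_days` days starting `horizon_days` after `snapshot`
    """
    lookback_start = snapshot - timedelta(days=window_days - 1)
    if window_spend(daily_spend, lookback_start, window_days) <= 0:
        return None  # only active advertisers receive a label
    label_start = snapshot + timedelta(days=horizon_days)
    return 0 if window_spend(daily_spend, label_start, window_days) > 0 else 1

# Hypothetical advertisers, snapshot on 07/07 as in Figure 1.
snapshot = date(2023, 7, 7)
still_spending = {date(2023, 7, 3): 25.0, date(2023, 7, 24): 10.0}
stopped_spending = {date(2023, 7, 3): 25.0}
never_spent = {}
```

With these inputs, the first advertiser is labeled active, the second is labeled churn, and the third is excluded because it has no spend in the lookback window.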


There are over 200 features used in the model. These features are aggregated across different statistical measures (e.g. min, avg, max, etc.) over a range of time windows, such as the past week or month prior to the inference date. We also include week-over-week and month-over-month change features to reflect recent trends. These features can be grouped into the following categories:

  • Performance: impressions**, clicks, conversions, conversion values, spend, cost per 1000 impressions, cost per click, clickthrough rate
  • Goal: goal attainment ratio, distance to goal
  • Budget: budget and utilization
  • Ads manager actions: creates, edits, archives, custom reports
  • Property: sales channel, country, industry, tenure, size, spend history
  • Campaign configuration: targeting, bid strategy, objective type, campaign end date

**View greater than 1 second.
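As a rough sketch of the aggregation described above, the snippet below derives min/avg/max features over two trailing windows plus a week-over-week change from a single daily metric. The function names, the feature naming scheme, and the use of a 28-day window to stand in for "month" are illustrative assumptions; the post does not describe the actual feature pipeline.

```python
from statistics import mean

def window_stats(values, window):
    """min/avg/max over the last `window` entries of a daily series (oldest first)."""
    recent = values[-window:]
    return {"min": min(recent), "avg": mean(recent), "max": max(recent)}

def week_over_week_change(values):
    """Relative change of the latest 7-day total versus the 7 days before it."""
    if len(values) < 14:
        raise ValueError("need at least 14 daily values")
    this_week = sum(values[-7:])
    last_week = sum(values[-14:-7])
    if last_week == 0:
        return None  # undefined without a prior-week baseline
    return (this_week - last_week) / last_week

def build_features(daily_clicks):
    """Flatten windowed statistics and the trend feature into one feature dict."""
    feats = {}
    for name, window in (("7d", 7), ("28d", 28)):  # 28d is an assumed "month"
        for stat, value in window_stats(daily_clicks, window).items():
            feats[f"clicks_{stat}_{name}"] = value
    feats["clicks_wow_change"] = week_over_week_change(daily_clicks)
    return feats

# Hypothetical series: clicks halve in the most recent week.
features = build_features([10] * 7 + [5] * 7)
```

The same pattern would be repeated per metric (spend, impressions, and so on) to reach a feature count in the hundreds.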

Feature Contribution

We use the SHAP library to estimate each feature’s contribution to the model’s probability output. The sigmoid of the sum of the features’ SHAP contributions (together with the explainer’s base value) is equal to the model probability. From the SHAP feature contributions, we can identify the key drivers of a high churn probability. We then highlight them for the Sales team to help prevent churn.
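The additive relationship between SHAP contributions and the model probability can be illustrated with plain arithmetic. The sketch below does not call the SHAP library; it assumes the contributions are expressed in log-odds and that a base value (the explainer's expected output) is added to their sum. The feature names and numbers are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def churn_probability(base_value, shap_values):
    """Map the base value plus summed log-odds contributions to a probability."""
    return sigmoid(base_value + sum(shap_values.values()))

def top_drivers(shap_values, k=3):
    """Features pushing the prediction most strongly toward churn."""
    positive = [(name, value) for name, value in shap_values.items() if value > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:k]

# Hypothetical contributions for one advertiser (log-odds units).
contributions = {
    "spend_wow_change": 1.2,
    "budget_utilization": 0.5,
    "tenure": -0.7,
}
probability = churn_probability(base_value=-1.0, shap_values=contributions)
drivers = top_drivers(contributions, k=2)
```

Here the contributions cancel the base value, so the probability is about 0.5, and the two positive contributors are the ones that would be surfaced as churn drivers.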

We use an offline-trained model to infer active advertisers’ churn probability daily.

Churn Risk Category

To help the Sales team better understand the meaning of the model output, we classify accounts into three categories based on their churn probability: high, medium, and low churn risk. High churn risk captures the accounts that are most likely to churn, with high precision. Medium churn risk captures the accounts that have a lower likelihood of churn. Low churn risk contains the ‘healthy’ accounts that are unlikely to churn in the next 14 days. We select the thresholds that define the churn risk categories according to the Sales team’s desired precision and recall. More details can be found in the experiment results.
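The mapping from probability to risk category amounts to two cut points. A minimal sketch follows; the threshold values 0.7 and 0.4 are placeholders chosen for illustration only, since the post does not disclose the actual thresholds.

```python
def churn_risk_category(probability, high_threshold=0.7, medium_threshold=0.4):
    """Bucket a churn probability into 'high', 'medium', or 'low' risk.

    Both thresholds are hypothetical placeholders, not the production values.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability >= high_threshold:
        return "high"
    if probability >= medium_threshold:
        return "medium"
    return "low"

categories = [churn_risk_category(p) for p in (0.85, 0.55, 0.10)]
```

In practice the two thresholds would be tuned on held-out data until the resulting buckets meet the precision and recall targets agreed with Sales.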

Our first experiment focused on SMB accounts in North America that are managed by Sales Account Managers (AMs). We split the advertisers randomly into treatment and control groups within the experiment population. For the control group, we did not make any changes to the existing Sales team procedures. For the treatment group, we supported the Sales team in preventing churn with the following information:

  1. Churn Risk Category: High / medium / low churn risk
  2. Churn Reason Category: We classified the detailed churn reasons into coarse churn categories to ease understanding. The Sales team carried out investigations using the churn categories as directions.
14 Day Churn Prediction Model — Overall Churn Risk High. Churn Category is Performance and Campaign Setup / Best Practices. Absolute Change in 14d Churn Risk % D/D is -11% down.
Figure 2: Churn Information Widget
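One common way to implement a random treatment/control split like the one described above is deterministic hashing of an identifier, so that an account always lands in the same group. The sketch below uses that technique as an assumption; the post does not say how the split was implemented or at which level (account, AM, or pod) it was randomized. The function name, salt, and ID format are hypothetical.

```python
import hashlib

def assign_group(advertiser_id, salt="churn-exp-v1", treatment_share=0.5):
    """Deterministically map an ID to 'treatment' or 'control'.

    The salt and share are illustrative; changing the salt reshuffles assignments.
    """
    digest = hashlib.sha256(f"{salt}:{advertiser_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"

# Hypothetical advertiser IDs.
groups = [assign_group(f"adv-{i}") for i in range(2000)]
treatment_fraction = groups.count("treatment") / len(groups)
```

Hash-based assignment avoids storing a lookup table and keeps the split stable across daily inference runs.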

Experiment Success Metrics

Our experiment was evaluated based on the following criteria:

  1. Model predictive power, i.e. how well our model is able to identify advertisers that are likely to churn
  2. Efficacy of churn prediction in churn reduction

Model Predictive Power

To determine the model’s predictive power, we compared its online performance on the control group (i.e. AMs who did not have access to the churn predictions) to what we had observed offline during development (i.e. our out-of-sample evaluation). Specifically, we measured model performance based on:

  1. Model quality: We compared the AUC-ROC and AUC-PR observed online to those observed offline.
  2. Churn risk segmentation: In consultation with Sales, we determined thresholds for the high, medium, and low churn risk categories so that:
     • Recall in high and medium churn risk combined should be above 70%.
     • Precision in high churn risk should be around 70%.

This enables Sales to capture most accounts at risk of churning while also prioritizing how to work through them, i.e. high churn risk first (highest precision).
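The two segmentation targets above can be checked with a few lines of code once each account has an observed label and a risk category. This is a sketch under the assumption that recall is measured over the combined high and medium buckets and precision over the high bucket alone; the sample labels and categories are made up for illustration.

```python
def precision_recall(labels, flagged):
    """Precision and recall of a boolean flag against binary churn labels."""
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(labels, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def segment_metrics(labels, categories):
    """Precision of the high bucket and recall of high + medium combined."""
    high = [c == "high" for c in categories]
    high_or_medium = [c in ("high", "medium") for c in categories]
    high_precision, _ = precision_recall(labels, high)
    _, combined_recall = precision_recall(labels, high_or_medium)
    return {"high_precision": high_precision, "high_medium_recall": combined_recall}

# Hypothetical outcomes (1 = churned) and assigned risk categories.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
categories = ["high", "high", "high", "medium", "medium",
              "low", "low", "low", "low", "low"]
metrics = segment_metrics(labels, categories)
```

For this toy data, two of the three high-risk accounts churned and three of the four churners fall in the high or medium buckets.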

With respect to model quality, our results indicate that the AUC-ROC observed online is within 1% of the offline AUC-ROC and the online AUC-PR is within 3% of the offline AUC-PR. This suggests that the model’s predictive power in identifying at-risk accounts is comparable to what we observed offline.
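For readers who want to reproduce this kind of online-versus-offline comparison, the sketch below computes AUC-ROC directly from its pairwise-ranking definition and measures the gap between two values. Treating "within 1%" as a relative difference is an assumption, since the post does not say whether the gap is relative or absolute; AUC-PR is omitted for brevity, and all numbers are hypothetical.

```python
def auc_roc(labels, scores):
    """Probability that a random churner is scored above a random non-churner.

    Ties count as half a win. Quadratic in sample size, which is fine for a sketch.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def relative_gap(online, offline):
    """Relative difference of the online metric with respect to the offline one."""
    return abs(online - offline) / offline

# Hypothetical scores: three of the four churner/non-churner pairs are ordered correctly.
example_auc = auc_roc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
gap = relative_gap(online=0.792, offline=0.8)
```

A production evaluation would more likely rely on an established metrics library; the manual version is shown only to make the definition explicit.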

In terms of churn risk segmentation, our model’s precision, recall, and proportion of the population captured within the high and medium churn risk categories were consistently within 2–3% of our offline evaluation. This suggests that the segmentation of account risk based on churn likelihood was consistent with our offline evaluation and Sales expectations.

Efficacy of Churn Prediction in Advertiser Churn Reduction

We observed a 24% (statistically significant) reduction in the churn rate of high tier pods*** in our experiment treatment group compared to the control. This suggests that accounts whose churn risks were exposed to AMs were less likely to churn than those whose risks were not.

*** In high tier pods, AMs manage about 50–70 accounts on average.
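To make the headline number concrete, the sketch below shows how a relative churn reduction and its significance could be computed from group counts. The post does not state which statistical test was used, so the two-proportion z-test here is an assumption, and the counts are invented so that the relative reduction works out to 24%.

```python
import math

def relative_reduction(control_rate, treatment_rate):
    """Fractional drop in churn rate of treatment relative to control."""
    return (control_rate - treatment_rate) / control_rate

def two_proportion_z_test(churned_c, n_c, churned_t, n_t):
    """Two-sided z-test for a difference between two churn proportions."""
    p_c = churned_c / n_c
    p_t = churned_t / n_t
    pooled = (churned_c + churned_t) / (n_c + n_t)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_c - p_t) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical counts: 250 of 1000 control accounts churned vs 190 of 1000 treated.
reduction = relative_reduction(250 / 1000, 190 / 1000)
z_score, p_value = two_proportion_z_test(250, 1000, 190, 1000)
```

If randomization happened at the AM or pod level rather than per account, a test that accounts for clustering would be more appropriate than this simple version.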

In this blog post, we illustrated the development and implementation of an ML-based solution for proactive churn prevention at Pinterest. We’re also actively investigating sequential model architectures such as Long Short-Term Memory (LSTM) and Transformers, which may better capture the usage behaviors of advertisers and reduce the need for manual feature engineering, such as the week-over-week or month-over-month feature aggregation used in our current model.

Advertiser Growth Modeling Team

  • Engineering: Erika Sun, Ogheneovo Dibie, Keshava Subramanya, Mao Ye
  • Product: Shailini Pandya
  • Product Analytics/Data Science: Alex Simons

Sales Team

  • Product: Wesley Kwiecien, Grace Yun
  • Sales Managers: Abby (Fromm) Lubarsky

Salesforce Team

  • Engineering: Gayathri Varadarangan (She Her), Murthy Tumuluri, Phani Chimata, Gabriela Mihaila, Richard Wu

Optimization Workbench Team

  • Engineering: Phil Price, Jordan Boaz, Lucilla Chalmer
  • Product: Dan Marantz

[1] When and Why Tree-Based Models (Often) Outperform Neural Networks | by Andre Ye | Towards Data Science

To learn more about engineering at Pinterest, check out the rest of our Engineering Blog and visit our Pinterest Labs site. To explore life at Pinterest, visit our Careers page.