Professional Machine Learning Engineer Exam - Question 264

Question

You work for a retail company. You have been tasked with building a model to determine the probability of churn for each customer. You need the predictions to be interpretable so the results can be used to develop marketing campaigns that target at-risk customers. What should you do?

Examice · Accepted Answer

To determine the probability of churn for each customer, which is a binary classification problem, a classification model is needed. Random forest classification models are known for their interpretability since they can generate feature importances, which help in understanding the factors contributing to the model's predictions. This interpretability is crucial for developing targeted marketing campaigns aimed at at-risk customers. The use of Vertex AI Workbench for building and training the model further supports this approach due to its comprehensive tools and support for generating feature importances.

guilhermebutzke · Answer

My Answer: B

“the probability of churn for each customer”: the probability is a number, so this is a regression problem. (A, B, C)

“predictions to be interpretable”: the explanation is needed for each prediction, not only for the model as a whole. (B, C)

Choosing between “Build an AutoML tabular regression model” and “Build a custom TensorFlow neural network by using Vertex AI custom training”, I think B could be the most relevant for the problem. However, I also think there is not enough information in the text to choose between the two.

pikachu007 · Answer

Option A: The random forest model here is a regression model, not a classification model, so it is not appropriate for predicting churn probabilities.
Option B: While AutoML tabular can generate model explanations, random forests inherently provide more granular insights into feature importance.
Option C: Neural networks can be less interpretable than tree-based models, and generating explanations for them often requires additional techniques and libraries.

Yan_X · Answer

I don't know which one is correct...
D says 'after the model is trained', so the explanation is not for each prediction.
And B, 'AutoML tabular regression model', is regression, which is not meant for a classification problem...

fitri001 · Answer

Since interpretability is key for your churn prediction model to inform marketing campaigns, 
--> Choose an interpretable model:
Logistic Regression: This is a classic choice for interpretability. It provides a coefficient for each feature, indicating how a unit increase in that feature changes the log-odds (and therefore the probability) of churn. Easy to understand and implement, it's a good starting point.
Decision Trees with Rule Extraction: Decision trees are inherently interpretable, with each branch representing a decision rule. By extracting these rules, you can understand the specific factors leading to churn (e.g., "Customers with low tenure and high number of support tickets are more likely to churn").
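
To make the logistic regression point concrete, here is a minimal sketch of how coefficients can be read as odds ratios. The intercept, coefficient values, and feature names are hypothetical and chosen only for illustration; they do not come from any real model or from the exam question.

```python
import math

# Hypothetical fitted coefficients (log-odds change per unit of each feature).
intercept = -2.0
coefficients = {"support_tickets": 0.4, "tenure_years": -0.3}

def churn_probability(customer):
    """Apply the logistic function to the linear score of one customer."""
    log_odds = intercept + sum(
        coefficients[name] * customer[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# exp(coefficient) is the multiplicative change in the odds of churn
# for a one-unit increase in that feature, holding the others fixed.
odds_ratios = {name: math.exp(c) for name, c in coefficients.items()}

base = churn_probability({"support_tickets": 2, "tenure_years": 1})
more_tickets = churn_probability({"support_tickets": 3, "tenure_years": 1})
```

An odds ratio above 1 (here, for `support_tickets`) means the feature pushes a customer toward churn, and one below 1 (here, for `tenure_years`) means it pushes away from churn, which is the kind of statement a marketing team can act on.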

shadz10 · Answer

https://cloud.google.com/bigquery/docs/xai-overview

daidai75 · Answer

The answer is D.
1. Churn prediction is a classification problem: We want to categorize customers as either churning or not churning, not predict a continuous value like revenue. Therefore, a classification model is needed.
2. Random forest models are interpretable: Feature importances provide insights into which features contribute most to the model's predictions, making them a good choice for understanding why customers churn. This interpretability is crucial for developing targeted marketing campaigns.
3. Vertex AI Workbench is a suitable platform: It provides notebook instances for building and training models, making it a good choice for this task.
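
As a rough illustration of the points above, the sketch below trains a random forest classifier with scikit-learn on synthetic data, reads a churn probability per customer from `predict_proba`, and reads global feature importances from `feature_importances_`. The data, the feature names, and the rule used to generate labels are all assumptions made up for this example; the same code could run in any notebook environment and is not specific to Vertex AI Workbench.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic, made-up customer data; the feature names are illustrative only.
rng = np.random.default_rng(0)
n = 500
tenure_months = rng.integers(1, 60, size=n)
support_tickets = rng.integers(0, 10, size=n)
noise = rng.normal(size=n)  # a feature unrelated to churn
X = np.column_stack([tenure_months, support_tickets, noise])

# Assumed toy rule: short tenure combined with many tickets means churn.
y = ((tenure_months < 12) & (support_tickets > 4)).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Column 1 of predict_proba is the probability of the positive class (churn).
churn_probability = model.predict_proba(X)[:, 1]

# Global importances: one score per feature for the whole model,
# not one explanation per individual prediction.
feature_names = ["tenure_months", "support_tickets", "noise"]
importances = dict(zip(feature_names, model.feature_importances_))
```

Note that these importances describe the model as a whole. They do not by themselves explain why one particular customer received a given probability, which is the distinction several commenters in this thread raise.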

sonicclasps · Answer

The question asks for explainability of individual predictions, and answer D does not provide that.
Although not the ideal solution, B is the only answer that suits the requirements, because churn can also be expressed as a probability.

pinimichele01 · Answer

the probability of churn for each customer -> regression -> B

gscharly · Answer

agree with Yan_X. This is a classification problem, so regression should not be used (rule out A&B). Neural networks don't have explainable features by default, and Random Forest provides global explanations...

Roulle · Answer

Churn problems are cases of classification. We don't predict the label, but the probability of belonging to a given class (churn or not). We then set a threshold to indicate the probability at which we can affirm that the person will or will not unsubscribe.
We can eliminate all responses that mention regression (A & B).

Of the remaining options, a random forest is less complex to interpret than a neural network.

So I'm pretty sure it's D
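
The thresholding step described in this comment can be sketched as follows. The customer IDs, probabilities, and the 0.5 default threshold are hypothetical values for illustration; in practice the threshold would be chosen from the cost of a missed churner versus a wasted campaign contact.

```python
def flag_at_risk(probabilities, threshold=0.5):
    """Turn per-customer churn probabilities into at-risk flags."""
    return {
        customer_id: probability >= threshold
        for customer_id, probability in probabilities.items()
    }

# Hypothetical model outputs: probability of churn per customer.
predicted = {"cust_001": 0.82, "cust_002": 0.35, "cust_003": 0.50}

flags = flag_at_risk(predicted, threshold=0.5)
cautious_flags = flag_at_risk(predicted, threshold=0.3)
```

The model itself still outputs a probability, which is why predicting "the probability of churn" does not make the task a regression problem; the class decision only appears once a threshold is applied.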
