Professional Machine Learning Engineer Exam Questions

Professional Machine Learning Engineer Exam - Question 72


You are building a linear model with over 100 input features, all with values between –1 and 1. You suspect that many features are non-informative. You want to remove the non-informative features from your model while keeping the informative ones in their original form. Which technique should you use?

A. Use principal component analysis (PCA) to eliminate the least informative features.
B. Use L1 regularization to reduce the coefficients of uninformative features to 0.
C. After building your model, use Shapley values to determine which features are the most informative.
D. Use an iterative dropout technique to identify which features do not degrade the model when removed.

Correct Answer: B

L1 regularization is effective for feature selection in a linear model as it penalizes the sum of the absolute values of the coefficients. This technique can drive the coefficients of non-informative features to zero, effectively removing their impact from the model while retaining the informative features in their original form. Principal Component Analysis (PCA) transforms the features, making it unsuitable for retaining features in their original form. Shapley values are used for feature importance after model building and can be computationally expensive for high-dimensional data. Iterative dropout techniques are typically used in neural networks and do not directly identify non-informative features.
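For intuition, here is a minimal sketch of the idea behind answer B, using scikit-learn's Lasso on synthetic data (the sample size, alpha, and noise level are illustrative assumptions, not part of the question):

```python
# Minimal sketch: L1 (Lasso) regularization driving the coefficients of
# non-informative features to exactly 0. All data here is synthetic.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 500, 100
X = rng.uniform(-1, 1, size=(n_samples, n_features))  # features in [-1, 1]

# Only the first 10 features actually influence the target; the rest are noise.
true_coef = np.zeros(n_features)
true_coef[:10] = rng.uniform(1.0, 3.0, size=10)
y = X @ true_coef + rng.normal(scale=0.1, size=n_samples)

# The L1 penalty zeroes out the coefficients of the non-informative features,
# while the surviving features keep their original form.
model = Lasso(alpha=0.05).fit(X, y)
kept = np.flatnonzero(model.coef_)
print(f"{kept.size} of {n_features} features kept:", kept)
```

With this data, typically only the informative coefficients survive; in practice the regularization strength would be tuned, e.g. by cross-validation.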

Discussion

11 comments
hiromi · Option: B
Dec 18, 2022

L1 regularization is good for feature selection: https://www.quora.com/How-does-the-L1-regularization-method-help-in-feature-selection https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization

ailiba
Feb 22, 2023

But this is not a sparse input vector, just a high-dimensional vector where many of the features are not relevant.

ares81 · Option: B
Dec 11, 2022

A. PCA reconfigures the features, so no. C. Shapley values apply only after building your model, so no. D. Dropout lives inside the model and doesn't tell us which features are informative, so a big no. For me, it's B.

mlgh · Option: C
Jan 26, 2023

Answer C: in the official sample questions there's a similar question, and its explanation is that L1 is for reducing overfitting while explainability (Shapley) is for feature selection, hence C. https://docs.google.com/forms/d/e/1FAIpQLSeYmkCANE81qSBqLW0g2X7RoskBX9yGYQu-m1TtsjMvHabGqg/viewform

mlgh
Jan 26, 2023

It cannot be A either, because PCA modifies the features and the question says to keep them in their original form. And it cannot be D, because dropout is for generalizing and avoiding overfitting, and it's applied to the neural network, not to the data.

tavva_prudhvi
Mar 17, 2023

It's wrong. Using Shapley values to determine feature importance can be useful, but it requires building a complete model first and can be computationally expensive, especially with over 100 input features. It also may not be practical to repeat for every model iteration or update. L1 regularization, on the other hand, works during model building and shrinks the coefficients of non-informative features to 0, making it the more efficient and effective approach here.
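For contrast, here is a hedged sketch of what option C would actually involve, assuming the third-party shap package and a model that has already been fit (the data and names below are illustrative):

```python
# Sketch of option C: Shapley values are a post-hoc importance measure,
# so a complete model must be built first. Assumes the `shap` package.
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 100))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)        # step 1: build the model
explainer = shap.LinearExplainer(model, X)  # step 2: explain it post hoc
shap_values = explainer.shap_values(X)

# Ranking by mean |SHAP| gives importance, but removes nothing by itself;
# feature selection would still need a separate threshold-and-refit step.
importance = np.abs(shap_values).mean(axis=0)
print("Top 10 features by mean |SHAP|:", np.argsort(importance)[::-1][:10])
```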

JeanEl · Option: B
Dec 13, 2022

Agree with B

behzadsw · Option: A
Jan 6, 2023

The features must be removed from the model. They are not removed when doing L1 regularization. PCA is used prior to training.

jamesking1103
Jan 13, 2023

Should be A, as it keeps the informative ones in their original form.

libo1985
Sep 26, 2023

How can PCA keep the original form?

tavva_prudhvi
Mar 17, 2023

That is a good point. PCA is a technique used to reduce the dimensionality of the dataset by transforming the original features into a new set of uncorrelated features. This can help to eliminate the least informative features and reduce the computational burden of building a model with many input features. However, it is important to note that PCA does not necessarily remove the original features from the model, but rather transforms them into a new set of features. On the other hand, L1 regularization can effectively remove the impact of non-informative features by setting their coefficients to 0 during the model building process. Therefore, both techniques can be useful for addressing the issue of non-informative features in a linear model, depending on the specific needs of the problem.
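To make the PCA point concrete, here is a small sketch (synthetic data; the component count is arbitrary) showing that PCA outputs are linear mixtures of all inputs rather than a subset of the original columns:

```python
# Sketch of why option A fails the "original form" requirement: each PCA
# component is a linear mixture of (almost) all 100 inputs, not one of them.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 100))

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                       # (200, 10): new, mixed features
print(np.count_nonzero(pca.components_[0]))  # ~100 nonzero loadings per component
```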

enghabeth · Option: B
Feb 9, 2023

It's the best way, because you remove the irrelevant features, in this case the non-informative ones.

tavva_prudhvi · Option: B
Mar 17, 2023

It's B. See my explanations under the other comments for why it's not C.

Antmal · Option: B
Mar 30, 2023

L1 regularization penalises weights in proportion to the sum of the absolute value of the weights. L1 regularization helps drive the weights of irrelevant or barely relevant features to exactly 0. A feature with a weight of 0 is effectively removed from the model. https://developers.google.com/machine-learning/glossary#L1_regularization
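On the earlier objection that L1 does not literally remove features: in practice the zeroed columns can be dropped explicitly. A possible sketch, assuming scikit-learn's SelectFromModel (data and threshold are illustrative):

```python
# Sketch: physically dropping the features whose L1 coefficients hit 0,
# so the model input keeps only informative features in their original form.
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 100))
y = X[:, :10] @ rng.uniform(1.0, 3.0, size=10) + rng.normal(scale=0.1, size=500)

selector = SelectFromModel(Lasso(alpha=0.05), threshold=1e-8).fit(X, y)
X_selected = selector.transform(X)  # surviving columns, values unchanged
print(X_selected.shape)             # roughly (500, 10)
```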

M25 · Option: B
May 9, 2023

Went with B

Liting · Option: B
Jul 7, 2023

Went with B

PhilipKoku · Option: B
Jun 7, 2024

B) L1 Regularisation