Professional Machine Learning Engineer Exam Questions

Professional Machine Learning Engineer Exam - Question 72


You are building a linear model with over 100 input features, all with values between –1 and 1. You suspect that many features are non-informative. You want to remove the non-informative features from your model while keeping the informative ones in their original form. Which technique should you use?

A. Use principal component analysis (PCA) to eliminate the least informative features.
B. Use L1 regularization to reduce the coefficients of uninformative features to 0.
C. After building your model, use Shapley values to determine which features are the most informative.
D. Use an iterative dropout technique to identify which features do not degrade the model when removed.

Correct Answer: B

L1 regularization is effective for feature selection in a linear model as it penalizes the sum of the absolute values of the coefficients. This technique can drive the coefficients of non-informative features to zero, effectively removing their impact from the model while retaining the informative features in their original form. Principal Component Analysis (PCA) transforms the features, making it unsuitable for retaining features in their original form. Shapley values are used for feature importance after model building and can be computationally expensive for high-dimensional data. Iterative dropout techniques are typically used in neural networks and do not directly identify non-informative features.
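For intuition, here is a minimal sketch of the idea behind answer B, using scikit-learn's Lasso on synthetic data (the sample size, alpha, and noise level are illustrative assumptions, not part of the question):

```python
# Minimal sketch: L1 (Lasso) regularization driving the coefficients of
# non-informative features to exactly 0. All data here is synthetic.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 500, 100
X = rng.uniform(-1, 1, size=(n_samples, n_features))  # features in [-1, 1]

# Only the first 10 features actually influence the target; the rest are noise.
true_coef = np.zeros(n_features)
true_coef[:10] = rng.uniform(1.0, 3.0, size=10)
y = X @ true_coef + rng.normal(scale=0.1, size=n_samples)

# The L1 penalty zeroes out the coefficients of the non-informative features,
# while the surviving features keep their original form.
model = Lasso(alpha=0.05).fit(X, y)
kept = np.flatnonzero(model.coef_)
print(f"{kept.size} of {n_features} features kept:", kept)
```

With this data, typically only the informative coefficients survive; in practice the regularization strength would be tuned, e.g. by cross-validation.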

Discussion

11 comments
hiromi · Option: B
Dec 18, 2022

L1 regularization is good for feature selection: https://www.quora.com/How-does-the-L1-regularization-method-help-in-feature-selection https://developers.google.com/machine-learning/crash-course/regularization-for-sparsity/l1-regularization

ailiba
Feb 22, 2023

But this is not a sparse input vector, just a high-dimensional vector where many of the features are not relevant.

ares81 · Option: B
Dec 11, 2022

A. PCA reconfigures the features, so no. C. Shapley values apply only after building your model, so no. D. Dropout lives inside the model and doesn't tell us which features are informative, so a big no. For me, it's B.

mlgh · Option: C
Jan 26, 2023

Answer C: in the official sample questions there's a similar question, and its explanation is that L1 is for reducing overfitting while explainability (Shapley) is for feature selection, hence C. https://docs.google.com/forms/d/e/1FAIpQLSeYmkCANE81qSBqLW0g2X7RoskBX9yGYQu-m1TtsjMvHabGqg/viewform

mlgh
Jan 26, 2023

It cannot be A either, because PCA modifies the features and the question says to keep them in their original form. And it cannot be D, because dropout is for generalizing and avoiding overfitting, and it's applied to the neural network, not to the data.

tavva_prudhvi
Mar 17, 2023

It's wrong. Using Shapley values to determine feature importance can be useful, but it requires building a complete model first and can be computationally expensive, especially with over 100 input features. It also may not be practical to repeat for every model iteration or update. L1 regularization, on the other hand, works during model building and shrinks the coefficients of non-informative features to 0, making it the more efficient and effective approach here.
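For contrast, here is a hedged sketch of what option C would actually involve, assuming the third-party shap package and a model that has already been fit (the data and names below are illustrative):

```python
# Sketch of option C: Shapley values are a post-hoc importance measure,
# so a complete model must be built first. Assumes the `shap` package.
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 100))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)        # step 1: build the model
explainer = shap.LinearExplainer(model, X)  # step 2: explain it post hoc
shap_values = explainer.shap_values(X)

# Ranking by mean |SHAP| gives importance, but removes nothing by itself;
# feature selection would still need a separate threshold-and-refit step.
importance = np.abs(shap_values).mean(axis=0)
print("Top 10 features by mean |SHAP|:", np.argsort(importance)[::-1][:10])
```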

JeanEl · Option: B
Dec 13, 2022

Agree with B

behzadsw · Option: A
Jan 6, 2023

The features must be removed from the model. They are not removed when doing L1 regularization. PCA is used prior to training.

jamesking1103
Jan 13, 2023

Should be A, as it keeps the informative ones in their original form.

libo1985
Sep 26, 2023

How can PCA keep the original form?

tavva_prudhvi
Mar 17, 2023

That is a good point. PCA is a technique used to reduce the dimensionality of the dataset by transforming the original features into a new set of uncorrelated features. This can help to eliminate the least informative features and reduce the computational burden of building a model with many input features. However, it is important to note that PCA does not necessarily remove the original features from the model, but rather transforms them into a new set of features. On the other hand, L1 regularization can effectively remove the impact of non-informative features by setting their coefficients to 0 during the model building process. Therefore, both techniques can be useful for addressing the issue of non-informative features in a linear model, depending on the specific needs of the problem.
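To make the PCA point concrete, here is a small sketch (synthetic data; the component count is arbitrary) showing that PCA outputs are linear mixtures of all inputs rather than a subset of the original columns:

```python
# Sketch of why option A fails the "original form" requirement: each PCA
# component is a linear mixture of (almost) all 100 inputs, not one of them.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 100))

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                       # (200, 10): new, mixed features
print(np.count_nonzero(pca.components_[0]))  # ~100 nonzero loadings per component
```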

enghabeth · Option: B
Feb 9, 2023

It's the best way, because you remove the irrelevant features, in this case the non-informative ones.

tavva_prudhvi · Option: B
Mar 17, 2023

It's B. See my explanations under the other comments for why it's not C.

Antmal · Option: B
Mar 30, 2023

L1 regularization penalises weights in proportion to the sum of the absolute value of the weights. L1 regularization helps drive the weights of irrelevant or barely relevant features to exactly 0. A feature with a weight of 0 is effectively removed from the model. https://developers.google.com/machine-learning/glossary#L1_regularization
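On the earlier objection that L1 does not literally remove features: in practice the zeroed columns can be dropped explicitly. A possible sketch, assuming scikit-learn's SelectFromModel (data and threshold are illustrative):

```python
# Sketch: physically dropping the features whose L1 coefficients hit 0,
# so the model input keeps only informative features in their original form.
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 100))
y = X[:, :10] @ rng.uniform(1.0, 3.0, size=10) + rng.normal(scale=0.1, size=500)

selector = SelectFromModel(Lasso(alpha=0.05), threshold=1e-8).fit(X, y)
X_selected = selector.transform(X)  # surviving columns, values unchanged
print(X_selected.shape)             # roughly (500, 10)
```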

M25 · Option: B
May 9, 2023

Went with B

Liting · Option: B
Jul 7, 2023

Went with B

PhilipKoku · Option: B
Jun 7, 2024

B) L1 Regularisation