Professional Machine Learning Engineer Exam - Question 60


You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

Correct Answer: C

Since only 1% of the transactions are fraudulent, the dataset is highly imbalanced. Oversampling the minority class (here, the fraudulent transactions) increases its representation in the training data, which helps the classifier learn to identify fraud. Writing the data as TFRecords, Z-normalizing the features, or one-hot encoding the categorical features would not address the class imbalance that limits the model's ability to detect fraud.
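As a rough illustration of what random oversampling does, the sketch below duplicates minority-class rows until they reach a chosen share of the data. It assumes a synthetic dataset with the same 1% fraud rate as the question; the function name `oversample_minority`, the `amount` field, and the 50% target share are hypothetical choices for this example, not part of the exam question. In practice, resampling would normally be applied only to the training split, after the test data has been set aside.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, minority_label, target_ratio=0.5, seed=0):
    """Randomly duplicate minority rows until they form target_ratio of the data.

    Hypothetical helper for illustration; uses sampling with replacement.
    """
    if not 0 < target_ratio < 1:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    if not minority:
        raise ValueError("no rows with the minority label")
    n_majority = len(rows) - len(minority)

    # Minority count m that satisfies m / (m + n_majority) == target_ratio.
    needed = round(target_ratio * n_majority / (1 - target_ratio))
    extra = max(0, needed - len(minority))

    # Append randomly chosen copies of existing minority rows.
    new_rows = list(rows) + [rng.choice(minority) for _ in range(extra)]
    new_labels = list(labels) + [minority_label] * extra

    # Shuffle so the duplicated rows are not clustered at the end.
    order = list(range(len(new_rows)))
    rng.shuffle(order)
    return [new_rows[i] for i in order], [new_labels[i] for i in order]


# Synthetic stand-in for the bank data: 1,000 transactions, 1% fraudulent.
labels = [1] * 10 + [0] * 990
rows = [{"amount": float(i)} for i in range(1000)]

X_res, y_res = oversample_minority(rows, labels, minority_label=1)
counts = Counter(y_res)
print(counts[0], counts[1])
```

The resampled rows and labels would then be passed to the random forest's training step. Because the added rows are exact duplicates, this changes the class balance the model sees but adds no new information about fraud.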

Discussion

10 comments
ralf_cc (Option: C)
Jul 10, 2021

C - https://swarit.medium.com/detecting-fraudulent-consumer-transactions-through-machine-learning-25b1f2cabbb4

NamitSehgal (Option: C)
Jan 5, 2022

C is the answer

MultiCloudIronMan (Option: C)
Apr 1, 2024

Oversampling increases the number of fraudulent transactions in the training data so that the model can learn to predict them.

M25 (Option: C)
May 9, 2023

Went with C

Mohamed_Mossad (Option: C)
Jul 11, 2022

The best option is C.

hiromi (Option: C)
Dec 15, 2022

C https://medium.com/analytics-vidhya/credit-card-fraud-detection-how-to-handle-imbalanced-dataset-1f18b6f881

wish0035 (Option: C)
Dec 16, 2022

Answer: C. A, B, and D wouldn't help with the imbalance.

fragkris (Option: C)
Dec 5, 2023

C, even though most similar questions propose downsampling the majority class (non-fraudulent transactions) and adding weights to it.

PhilipKoku (Option: C)
Jun 6, 2024

C) Oversample

dija123 (Option: C)
Jun 18, 2024

Agree with C