
Certified Machine Learning Professional Exam - Question 43


A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on a customer-level Spark DataFrame spark_df, but it is missing a few of the static features that were used when training the model. The customer_id column is the primary key of both spark_df and the training set that was used to train and log the model.

Which of the following code blocks can be used to compute predictions for spark_df when the missing feature values can be found in the Feature Store by searching for features by customer_id?

Correct Answer: E

The correct code block is fs.score_batch(model_uri, spark_df). Because the model was logged through the Feature Store client, score_batch looks up the features that are missing from spark_df in the Feature Store, using customer_id as the lookup key, and then scores the combined data. Feature retrieval and batch scoring happen in a single call, so no separate lookup step is needed.
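As a rough illustration of the answer above, the call can be wrapped in a small helper. This is only a sketch: the helper name score_customers is hypothetical, and it assumes fs is a databricks.feature_store.FeatureStoreClient and that the model behind model_uri was logged with that client, so that its feature lookups are packaged with it.

```python
def score_customers(fs, model_uri, spark_df):
    """Batch-score spark_df with a model logged through the Feature Store client.

    Hypothetical helper for illustration. In a Databricks notebook, fs would
    typically be created with:
        from databricks.feature_store import FeatureStoreClient
        fs = FeatureStoreClient()
    """
    # score_batch retrieves the features that spark_df lacks from the Feature
    # Store, keyed on the lookup key stored with the model (customer_id in this
    # question), and returns a DataFrame that includes a prediction column.
    return fs.score_batch(model_uri, spark_df)
```

Passing the client in as an argument keeps the helper easy to exercise outside Databricks with a stand-in object. spark_df only needs to carry customer_id plus any features that are not stored in the Feature Store.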

Discussion

3 comments
victorcolome (Option: E)
Jan 22, 2024

The answer is E. See score_batch in https://api-docs.databricks.com/python/feature-store/latest/feature_store.client.html. "Additional features required for model evaluation will be automatically retrieved from Feature Store." Besides, methods "get_missing_features" and "score_model" do not appear in the documentation.

BokNinja (Option: C)
Dec 19, 2023

The correct answer is C: df = fs.get_missing_features(spark_df, model_uri) followed by fs.score_batch(model_uri, df). In this code snippet, fs.get_missing_features(spark_df, model_uri) is used to retrieve the missing features from the Feature Store using customer_id as the key. The resulting DataFrame df contains the original data along with the retrieved features. Then, fs.score_batch(model_uri, df) is used to perform batch inference on df using the model specified by model_uri.

inet777
May 27, 2024

Except that I could not find a method called get_missing_features in the FeatureStoreClient APIs. E is the right answer.

IT3008 (Option: E)
Jan 15, 2024

The right answer is E. There is no API called 'get_missing_features' in the documentation.