Professional Machine Learning Engineer Exam - Question 265

Question

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each piece of kitchen cookware (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

Examice · Accepted Answer

To extract specific named entities like ingredients and cookware from unstructured text recipes, creating a text dataset on Vertex AI for entity extraction is the most suitable approach. You would create two entities called 'ingredient' and 'cookware' and label examples of each entity in the dataset. Training an AutoML entity extraction model on this dataset will help in accurately identifying and extracting these entities, as it allows for customization based on your specific requirements. Other methods might not provide the required level of customization or accuracy for specialized domains such as recipes.
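As a rough illustration of what "labeling examples of each entity" could look like, here is a minimal sketch that builds one JSON Lines record for a recipe sentence. The field names (`textContent`, `textSegmentAnnotations`, `startOffset`, `endOffset`, `displayName`) and the exclusive end offset are assumptions based on my recollection of the Vertex AI text entity extraction import format; check them against the current documentation before use. The helper functions and the sample sentence are hypothetical.

```python
import json


def make_annotation(text, phrase, label):
    """Label the first occurrence of `phrase` in `text` with `label`."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return {
        "startOffset": start,
        # Assumption: the end offset is exclusive (first character past the span).
        "endOffset": start + len(phrase),
        "displayName": label,
    }


def make_jsonl_line(text, labeled_phrases):
    """Build one JSON Lines record from (phrase, label) pairs."""
    annotations = [make_annotation(text, p, lbl) for p, lbl in labeled_phrases]
    return json.dumps({"textContent": text, "textSegmentAnnotations": annotations})


# Hypothetical recipe sentence, labeled with only the two entities from option A.
recipe = "Boil the pasta in a large pot, then stir in the carrot with a spoon."
line = make_jsonl_line(
    recipe,
    [
        ("pasta", "ingredient"),
        ("pot", "cookware"),
        ("carrot", "ingredient"),
        ("spoon", "cookware"),
    ],
)
print(line)
```

Note that only two labels are used, however many distinct ingredients or utensils appear; this is the difference from defining one entity per ingredient or cookware type.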

b1a8fae · Answer

A.
"... you might create an entity extraction model to identify specialized terminology in legal documents or patents."

I prefer this over C, which might classify carrot as vegetable, chicken as meat... custom entity extraction allows you to specify what entities you wish to extract from the text.

guilhermebutzke · Answer

My Answer: A

A: This is the most suitable approach for this task because we need to identify and extract specific named entities ("ingredient" and "cookware") from the text, not classify the entire recipe into predefined categories.

B: This approach would require classifying each recipe based on all possible ingredients and cookware, leading to a vast number of classes and potential performance issues.

C: This pre-built solution might not be as customizable or scalable as training a specific model for this task.

D: This is impractical and unnecessary as the number of potential ingredients and cookware is vast.

pikachu007 · Answer

Option B: Multi-label text classification is less suitable for identifying specific entities within text and would require labeling entire recipes with multiple classes, increasing complexity and reducing model specificity.
Option C: Natural Language API's Entity Analysis might not be as accurate for this specialized domain as a model trained on custom recipe data.
Option D: Creating separate entities for each ingredient and cookware type would significantly increase labeling effort and potentially hinder model generalization.

shadz10 · Answer

A is the correct option here

shadz10 · Answer

Reconsidering my answer and going with C.
Option A involves using AutoML entity extraction, which could be a valid approach. However, for extracting entities like ingredients and cookware, Google Cloud's pre-trained Natural Language API might be a more straightforward solution.

daidai75 · Answer

I prefer A.
Option C is not the best, because the Natural Language API is designed to identify general entities within text. While it's effective for broad categories, it may not be as precise for specialized domains like cooking ingredients and cookware, which require a more tailored approach.

omermahgoub · Answer

Natural Language API offers a pre-built solution for entity analysis which eliminates the need for custom model training and labeling large datasets, saving time and resources.

Vertex AI AutoML can also be used for entity extraction, but it requires data labeling and training, which can be time-consuming for a vast number of potential ingredients and cookware.

fitri001 · Answer

For extracting ingredients and cookware from recipe text files, creating a text dataset on Vertex AI for entity extraction with a custom NER model is the better approach. While it requires more upfront effort for data labeling and training, it offers superior accuracy and control over the types of entities extracted.

However, if you need a quick and easy solution to get started, the Natural Language API's Entity Analysis can be a temporary option. Be aware that the accuracy might be lower, and you might need to post-process the results to filter out irrelevant entities.
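To make the post-processing step concrete, here is a minimal sketch. It assumes each entity returned by Entity Analysis has been converted to a dict with `name` and `type` keys; the sample entities below are made up for illustration and are not real API output. The `INGREDIENTS` and `COOKWARE` vocabularies and the `bucket_entities` helper are hypothetical.

```python
# Hypothetical, hand-maintained vocabularies. The generic entity types do not
# distinguish ingredients from cookware, so the filter has to supply that itself.
INGREDIENTS = {"carrot", "rice", "pasta"}
COOKWARE = {"bowl", "pot", "spoon"}


def bucket_entities(entities):
    """Keep only entities found in the vocabularies, grouped by category."""
    result = {"ingredient": [], "cookware": []}
    for entity in entities:
        name = entity["name"].lower()
        if name in INGREDIENTS:
            result["ingredient"].append(name)
        elif name in COOKWARE:
            result["cookware"].append(name)
        # Anything else (people, places, dates, ...) is dropped as irrelevant.
    return result


# Illustrative input only; the shape is assumed, the values are invented.
sample_entities = [
    {"name": "Pasta", "type": "OTHER"},
    {"name": "pot", "type": "CONSUMER_GOOD"},
    {"name": "Grandma", "type": "PERSON"},
    {"name": "carrot", "type": "OTHER"},
]
print(bucket_entities(sample_entities))
```

The need for a fixed vocabulary is the weakness of this route: any ingredient or utensil missing from the lists is silently dropped, whereas a custom entity extraction model can generalize from labeled examples.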

AzureDP900 · Answer

By choosing option A, you can leverage the power of machine learning to efficiently extract ingredients and cookware from recipes in a scalable manner.
Option C uses the Entity Analysis method of the Natural Language API, which might be a viable option if you had access to the API's pre-trained models. However, since you're working with Vertex AI, creating a dataset for entity extraction is a better choice.
