AI-900 Exam - Question 122


You have an AI solution that provides users with the ability to control smart devices by using verbal commands.

Which two types of natural language processing (NLP) workloads does the solution use? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Correct Answer: CD

To control smart devices using verbal commands, the solution primarily relies on speech-to-text conversion and language modeling. Speech-to-text is necessary to convert the spoken commands into written text so the system can process them. Language modeling is then used to understand the intent behind the textual commands and generate the appropriate actions based on this understanding. Key phrase extraction, while useful for identifying main concepts in text, doesn't provide the comprehensive understanding required for executing commands based on user intent.
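The two-stage flow described above (speech-to-text, then working out the intent) can be sketched roughly as follows. This is only an illustration under stated assumptions: `transcribe` is a placeholder that pretends the audio has already been transcribed, and the regular expressions in `INTENT_PATTERNS` are hypothetical stand-ins for a trained language model. None of these names come from the exam or from any Azure SDK.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    intent: str  # e.g. "turn_on"
    device: str  # e.g. "living room lights"


# Hypothetical patterns standing in for a trained language model.
INTENT_PATTERNS = {
    "turn_on": re.compile(r"^(?:turn|switch) on (?:the )?(?P<device>.+)$"),
    "turn_off": re.compile(r"^(?:turn|switch) off (?:the )?(?P<device>.+)$"),
}


def transcribe(audio_transcript: str) -> str:
    """Placeholder for speech-to-text: assumes the audio is already text."""
    return audio_transcript.strip().lower().rstrip(".!?")


def understand(text: str) -> Optional[Command]:
    """Map the transcribed text to an intent and a target device, if any."""
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.match(text)
        if match:
            return Command(intent=intent, device=match.group("device"))
    return None  # no known intent recognized


def handle(audio_transcript: str) -> Optional[Command]:
    """Run both stages: speech-to-text, then intent recognition."""
    return understand(transcribe(audio_transcript))
```

Under these assumptions, `handle("Turn on the living room lights.")` yields a `Command` with intent `turn_on` and device `living room lights`, while an utterance that matches no pattern returns `None`.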

Discussion

21 comments
ThariCD (Options: CD)
Jan 6, 2023

This should be speech-to-text and language modeling. You need to use language modeling to determine the intent of the utterance and to perform an action based on that intent.

fguglia (Options: CD)
Mar 5, 2023

For me the answer is Speech to Text and Language Modeling

TDAC
Jan 25, 2024

The answer is correct. We all agree speech-to-text is correct. Key phrase extraction is also correct. Here is why: key phrase extraction is a technique that identifies the most important phrases in a given text, e.g. "Turn the light on" or "What is today's date?" Language modeling is a technique used to predict the probability of a sequence of words in a given language. It is used to generate text that is similar to the input text. For example, given the text "The cat sat on the", a language model would predict that the next word is "mat" with a higher probability than "car". Since we are talking about a smart device, the answer will be key phrase extraction. Upvote me if it makes sense to you.
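The next-word prediction described in this comment can be illustrated with a toy bigram model. This is a minimal sketch, not how production language models work: the corpus is invented for the example, and a bigram model conditions only on the single preceding word ("the") rather than on the whole phrase.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "the cat slept on the mat",
    "the car stopped on the road",
]

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        bigram_counts[prev][nxt] += 1


def next_word_probability(prev: str, candidate: str) -> float:
    """Estimate P(candidate | prev) from the bigram counts."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][candidate] / total if total else 0.0
```

With this corpus, `next_word_probability("the", "mat")` comes out higher than `next_word_probability("the", "car")`, which mirrors the "mat" versus "car" example in the comment.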

Mehe323
Jan 26, 2024

I disagree. According to Microsoft, key phrase extraction lists the main concepts from unstructured text. In this case there is no unstructured text. https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-keyphrases

Alex_W
Jun 9, 2024

Of course there is: extracted from speech-to-text.

VintageLady
Aug 8, 2024

Language modeling is based on prediction; it uses the preceding words in a sentence to determine context and "predict" the next word (e.g., autofill), so I don't think it's as useful here (your Alexa does not need to predict what you're asking, only to process the command you give and respond appropriately, so extracting key phrases in what you say to her is more important).

XtraWest (Options: CD)
Apr 10, 2023

C. Speech to text D. Natural language understanding (NLU)

AplUSAndmINUS
Jul 12, 2024

Language modeling is too broad of a term to apply here. Though this is part of the process, the system is actually looking for phrases to help it understand what the user wants it to do here. "Turn on", "lights", "close garage door" all require the AI to extract those phrases from what the user is saying, which is key phrase detection. You also need speech-to-text to translate the user's spoken words into text the language model can understand. It's more specific and better answers the question.

Ous01
Dec 6, 2022

I believe this should be speech-to-text and Language Modeling.

Rosviul
Mar 22, 2023

It should be C and D, as per the description of language modeling: identify key terms and phrases, understand sentiments, and build conversational interfaces into applications.

master_yoda (Options: BC)
Apr 25, 2023

These two types of NLP workloads are key phrase extraction and speech-to-text. Key phrase extraction is used to quickly identify the main concepts in text while speech-to-text is used to convert spoken words into written text. The key here is control smart devices, not a human conversation.

rdemontis (Options: CD)
Jun 1, 2023

Considering the scenario described where the goal is to control smart devices using voice commands, the most appropriate choice would be to use speech-to-text conversion as the first step in the process and then apply language modeling to generate consistent and meaningful responses or actions based on the commands recognized in the produced text. This could allow the AI to understand the users' intent and respond appropriately. Key phrase extraction could also work but is more complex because an additional layer would have to be added that understands user intent based on the combination of keywords extracted. But it would become complex and probably less efficient as well. Language modeling solves this problem natively.

tsummey (Options: CD)
Jul 3, 2024

While it’s true that C. Speech-to-Text is used to interpret what is spoken and convert it into text, and B. Key Phrase Extraction can process the text to identify the main points, these two alone might not be sufficient for a complete AI solution that controls smart devices using verbal commands. Key phrase extraction doesn’t necessarily understand the context or the specific actions that need to be taken based on those key phrases. For example, in a command like “Turn on the living room lights”, key phrase extraction might identify “turn on”, “living room”, and “lights” as key phrases, but it doesn’t inherently understand that “turn on” is an action that needs to be applied to the “living room lights”. That's why I think language modeling is a better answer than key phrase extraction.
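The gap described here can be sketched in code. The example below is a rough illustration under assumptions: `KNOWN_PHRASES` is a hypothetical vocabulary used to fake key phrase extraction, and the regular expression in `extract_actions` is a simple stand-in for the step that relates an action to a device. Neither reflects how a real service implements these workloads.

```python
import re
from typing import List, Tuple

# Hypothetical vocabulary used to imitate key phrase extraction.
KNOWN_PHRASES = ["turn on", "turn off", "living room", "lights", "fan"]


def extract_key_phrases(text: str) -> List[str]:
    """Return the known phrases found in the text, with no relations between them."""
    lowered = text.lower()
    return [phrase for phrase in KNOWN_PHRASES if phrase in lowered]


def extract_actions(text: str) -> List[Tuple[str, str]]:
    """Pair each on/off action with the device it applies to."""
    pattern = re.compile(r"turn (on|off) the ([a-z ]+?)(?= and |$)")
    return [
        (f"turn_{state}", device)
        for state, device in pattern.findall(text.lower())
    ]
```

For "Turn on the living room lights and turn off the fan", `extract_key_phrases` returns a flat list of phrases, which does not say which action goes with which device, whereas `extract_actions` returns the pairs `("turn_on", "living room lights")` and `("turn_off", "fan")`.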

Scott123
Mar 10, 2024

It should be AD. The AI solution for controlling smart devices via verbal commands utilizes two key types of Natural Language Processing (NLP) workloads. Text-to-Speech (TTS): TTS is a critical component that converts written text into spoken language. It enables the system to communicate with users by generating human-like speech from textual input. Speech-to-Text (STT): STT, also known as Automatic Speech Recognition (ASR), performs the opposite function. It transcribes spoken language into written text, allowing the system to understand and process verbal commands.

sujitwarrier11 (Options: BC)
May 27, 2024

I believe the given answer is correct. We don't need ChatGPT-like functionality here. We just need to know what action needs to be performed on what device. Key phrase extraction is perfect for the job.

Alex_W (Options: BC)
Jun 9, 2024

STT plus key phrase extraction perfectly fit the job.

QueenShi (Options: BC)
Nov 26, 2024

Not a huge fan of this question. It is missing an option that allows for intent recognition, like conversational language understanding or LUIS.

frych
Jan 6, 2023

Speech-to-text is OK. Why is language modeling not OK? Language models analyze bodies of text data to provide a basis for their word predictions. They are used in natural language processing (NLP) applications, particularly ones that generate text as an output. Some of these applications include machine translation and question answering.

M2000F007fubar (Options: CD)
Nov 1, 2024

answer is C D

argb30 (Options: BC)
Nov 4, 2024

Completely agree with VintageLady explanation

mmmmmnm (Options: CD)
Jan 19, 2025

The question only said "by using verbal commands". That means the Key phrase extraction doesn't need to be used.

Moon (Options: BC)
Jan 27, 2025

DeepSeek's answer is C and D. ChatGPT's and Gemini's answer is A and C. My answer is B and C.