AI-102 Exam - Question 150

Question

HOTSPOT - You have a collection of press releases stored as PDF files. You need to extract text from the files and perform sentiment analysis. Which service should you use for each task? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Examice · Accepted Answer

.

zellck · Answer

1. Computer Vision
2. Language

https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/overview-ocr
OCR or Optical Character Recognition is also referred to as text recognition or text extraction. Machine-learning based OCR techniques allow you to extract printed or handwritten text from images, such as posters, street signs and product labels, as well as from documents like articles, reports, forms, and invoices. The text is typically extracted as words, text lines, and paragraphs or text blocks, enabling access to digital version of the scanned text. This eliminates or significantly reduces the need for manual data entry.

Pffffff · Answer

The service you should use to extract text from the PDF files is B. Computer Vision.

Computer Vision has the ability to extract text from images and PDF files, making it a suitable choice for this scenario. Once the text has been extracted, you can then use a text analytics service, such as the Azure Cognitive Services Text Analytics API, to perform sentiment analysis on the extracted text.

Azure Cognitive Search is a search-as-a-service solution that allows you to index and search structured and unstructured data. It can also extract text from PDF files, but it may not provide the level of accuracy required for sentiment analysis.

Form Recognizer is a service that is designed to extract structured data from forms, such as receipts, invoices, and business cards. It may not be the best choice for extracting text from press releases.
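For the "extract first, then run sentiment analysis" flow described in this answer, one practical step in between is splitting the extracted text, since the sentiment endpoint limits the size of each input document. Below is a minimal sketch of that step. The 5,120-character default is an assumption based on the documented per-document limit as I recall it, and `chunk_text` is a hypothetical helper, not part of any Azure SDK.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 5120) -> List[str]:
    """Split extracted text into chunks of at most max_chars characters.

    Lines are kept whole where possible; a single line longer than
    max_chars is hard-split. max_chars=5120 is an assumed service limit.
    """
    chunks: List[str] = []
    current = ""
    for para in text.split("\n"):
        para = para.strip()
        if not para:
            continue
        # Hard-split any line that cannot fit in one chunk by itself.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own document in a sentiment request, and the per-chunk results combined however suits the use case.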

Mehe323 · Answer

Answers are correct. Form Recognizer is the old name of Document Intelligence. About Document Intelligence on Microsoft Learn:

"Document Intelligence Read Optical Character Recognition (OCR) model runs at a higher resolution than Azure AI Vision Read and extracts print and handwritten text from PDF documents and scanned images."

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-read?view=doc-intel-4.0.0
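To make the Read model a bit more concrete, here is a rough sketch of how a request to it could be assembled over REST. The URL path and the `2023-07-31` API version are what I recall for the v3.1 REST API and should be checked against the current reference (v4.0 uses a different path). `build_read_request` and the example endpoint are hypothetical, and nothing is sent over the network here.

```python
from typing import Dict, Tuple
from urllib.parse import quote


def build_read_request(endpoint: str, key: str, pdf_bytes: bytes,
                       api_version: str = "2023-07-31"
                       ) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for a prebuilt-read analyze call.

    The path and api_version are assumptions based on the v3.1 REST API.
    """
    base = endpoint.rstrip("/")
    url = (f"{base}/formrecognizer/documentModels/prebuilt-read:analyze"
           f"?api-version={quote(api_version)}")
    headers = {
        "Ocp-Apim-Subscription-Key": key,   # resource key
        "Content-Type": "application/pdf",  # raw PDF bytes in the body
    }
    return url, headers, pdf_bytes


# Hypothetical endpoint and key, for illustration only.
url, headers, body = build_read_request(
    "https://example.cognitiveservices.azure.com/", "<key>", b"%PDF-1.7")
```

The service answers such a POST with an `Operation-Location` header that you poll for the result, rather than returning the text directly.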

ziggy1117 · Answer

answer is correct:
1. Form Recognizer -> Read.
Form Recognizer v3.0's Read Optical Character Recognition (OCR) model runs at a higher resolution than Computer Vision Read and extracts print and handwritten text from PDF documents and scanned images. It also includes preview support for extracting text from Microsoft Word, Excel, PowerPoint, and HTML documents. It detects paragraphs, text lines, words, locations, and languages. The Read model is the underlying OCR engine for other Form Recognizer prebuilt models like Layout, General Document, Invoice, Receipt, Identity (ID) document, in addition to custom models.
https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/overview?view=form-recog-3.0.0
https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-read?view=form-recog-3.0.0

2. Language obviously
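As a small illustration of the paragraphs and text lines mentioned above, this sketch reads them out of a dictionary shaped like the Read model's `analyzeResult`. The field names (`pages`, `pageNumber`, `lines`, `paragraphs`, `content`) follow the result schema as I understand it; the sample data and helper names are made up.

```python
from typing import Dict, List


def lines_by_page(analyze_result: dict) -> Dict[int, List[str]]:
    """Map each page number to the text lines recognized on that page."""
    return {
        page["pageNumber"]: [ln["content"] for ln in page.get("lines", [])]
        for page in analyze_result.get("pages", [])
    }


def paragraph_texts(analyze_result: dict) -> List[str]:
    """Return the text of every detected paragraph, in document order."""
    return [p["content"] for p in analyze_result.get("paragraphs", [])]


# Made-up result fragment, trimmed to the fields used above.
sample = {
    "content": "Contoso launches new product.\nShares rise.",
    "pages": [{
        "pageNumber": 1,
        "lines": [{"content": "Contoso launches new product."},
                  {"content": "Shares rise."}],
    }],
    "paragraphs": [{"content": "Contoso launches new product. Shares rise."}],
}

print(lines_by_page(sample))
print(paragraph_texts(sample))
```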

davidorti · Answer

Answer seems correct.
1. Form Recognizer (now Document Intelligence)
2. Language

The Computer Vision Read API documentation for OCR states this clearly:
OCR for Images: "Optimized for general, *non-document images* with a performance-enhanced synchronous API"
Document Intelligence: "Optimized for text-heavy scanned and digital documents with an asynchronous API to help automate intelligent document processing at scale"

Here we're dealing with a collection of PDFs.
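The "asynchronous API" part means you submit the document and then poll until the analysis finishes. A minimal sketch of that polling loop is below. `get_status` is a hypothetical callable standing in for the HTTP GET against the operation URL, and the status values (`notStarted`, `running`, `succeeded`, `failed`) are the ones I believe the service reports.

```python
import time
from typing import Callable


def wait_for_result(get_status: Callable[[], dict],
                    interval: float = 1.0,
                    max_polls: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the analysis succeeds, fails, or max_polls is reached.

    get_status is a caller-supplied function returning the parsed JSON
    body of one status request (hypothetical; no HTTP is done here).
    """
    for _ in range(max_polls):
        body = get_status()
        status = body.get("status")
        if status == "succeeded":
            return body["analyzeResult"]
        if status == "failed":
            raise RuntimeError(f"analysis failed: {body.get('error')}")
        sleep(interval)  # still notStarted or running: wait and retry
    raise TimeoutError("analysis did not finish in time")
```

Passing `sleep` as a parameter is only there so the loop can be exercised with canned responses and no real waiting.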

mon2002 · Answer

Azure AI Document Intelligence.
Azure AI Language.

HaraTadahisa · Answer

The service names were changed. The current answer must be
1. Azure AI Document Intelligence
2. Azure AI Language

MaliSanFuu · Answer

I think the answer is correct, as Form Recognizer supports document analysis. There you can easily use the Read API to extract printed or handwritten text from images and documents.

omankoman · Answer

1. Computer Vision
2. Language

Computer Vision provides advanced algorithms that process images and return information based on visual features of interest. It offers four services: OCR, Face Service, Image Analysis, and Spatial Analysis. Form Recognizer is an advanced version of OCR.

kail85 · Answer

Azure Cognitive Search can be used to extract text from PDF files. It can ingest and index the content of various file formats, including PDFs, by using built-in document cracking capabilities or custom skills. The indexing process extracts text and metadata from the files, making the content searchable.

shahnawazkhot · Answer

Answer is:
Azure Cognitive Search
Language

Yes, Azure Cognitive Search can be used to extract text from PDF files. The Azure Cognitive Search blob indexer can extract text from PDF and other document formats. However, extracting text from embedded images or tables is not yet integrated in Azure Search, but it is on the roadmap.

To extract text from PDF files using Azure Cognitive Search, you can use the Document Extraction cognitive skill. This skill extracts content from a file within the enrichment pipeline and can extract text and images with high accuracy. You can use this skill to extract text from PDF files and perform sentiment analysis on the extracted text using the Sentiment Analysis feature provided by Azure Cognitive Services.
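For reference, a skillset along these lines might look roughly like the following, written here as a Python dictionary that serializes to the JSON definition. The `@odata.type` values and input/output names are as I recall them from the skill reference. The skillset name and `targetName` values are hypothetical, and a real setup would also need an attached AI services resource and an indexer configured to pass file data to the skillset.

```python
import json

# Sketch of a skillset definition; names marked hypothetical are made up.
skillset = {
    "name": "press-release-skillset",  # hypothetical name
    "skills": [
        {
            # Cracks the PDF and emits its text content.
            "@odata.type": "#Microsoft.Skills.Util.DocumentExtractionSkill",
            "context": "/document",
            "parsingMode": "default",
            "dataToExtract": "contentAndMetadata",
            "inputs": [{"name": "file_data", "source": "/document/file_data"}],
            "outputs": [{"name": "content", "targetName": "extracted_content"}],
        },
        {
            # Scores the sentiment of the extracted text.
            "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
            "context": "/document",
            "defaultLanguageCode": "en",
            "inputs": [{"name": "text", "source": "/document/extracted_content"}],
            "outputs": [
                {"name": "sentiment", "targetName": "sentiment"},
                {"name": "confidenceScores", "targetName": "confidenceScores"},
            ],
        },
    ],
}

print(json.dumps(skillset, indent=2))
```

Note that the sentiment step here is still the Language service's sentiment model, just called from inside the search enrichment pipeline.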

sismer · Answer

The answer is correct:

For extracting text from PDF files, you can use Azure Cognitive Services, specifically the Azure Form Recognizer service. Azure Form Recognizer is designed to extract key-value pairs, tables, and text from documents, including PDFs. It supports various document types, making it suitable for extracting text from press releases in PDF format.

For sentiment analysis, you can use the Azure Text Analytics service. Azure Text Analytics includes a sentiment analysis feature that can analyze the sentiment of text documents and provide a sentiment score. This service can help you determine whether the sentiment expressed in the press releases is positive, negative, or neutral.
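To show what the sentiment step looks like in practice, here is a small sketch that builds a request body for the Language service's `analyze-text` REST operation and reads the labels out of a response-shaped dictionary. The body and response shapes follow the REST API as I recall it (posted to `{endpoint}/language/:analyze-text` with an `api-version` such as `2022-05-01`). The helper names and sample response are made up, and no request is actually sent.

```python
from typing import Dict, List, Tuple


def build_sentiment_payload(texts: List[str], language: str = "en") -> dict:
    """Build the JSON body for a SentimentAnalysis request."""
    return {
        "kind": "SentimentAnalysis",
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {
            "documents": [
                {"id": str(i), "language": language, "text": text}
                for i, text in enumerate(texts, start=1)
            ]
        },
    }


def summarize_sentiment(response: dict) -> Dict[str, Tuple[str, dict]]:
    """Map document id -> (sentiment label, confidence scores)."""
    return {
        doc["id"]: (doc["sentiment"], doc["confidenceScores"])
        for doc in response["results"]["documents"]
    }


# Made-up response fragment, trimmed to the fields used above.
sample_response = {
    "kind": "SentimentAnalysisResults",
    "results": {
        "documents": [{
            "id": "1",
            "sentiment": "positive",
            "confidenceScores": {"positive": 0.97, "neutral": 0.02, "negative": 0.01},
        }],
        "errors": [],
    },
}

payload = build_sentiment_payload(["Contoso reports record quarterly revenue."])
print(summarize_sentiment(sample_response))
```

Besides positive, negative, and neutral, the document-level label can also come back as mixed, so it is safer to keep the full confidence scores than to index them by label.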

takaimomoGcup · Answer

Computer Vision
Language

NagaoShingo · Answer

1. Computer Vision
2. Language

eskimolight · Answer

I feel the given answer is correct. Form Recognizer can extract text, key-value pairs, and tables from forms and documents. It's particularly useful for processing structured documents like invoices, receipts, and business forms.

krzkrzkra · Answer

1. Computer Vision
2. Language
