Exam DP-203 All QuestionsBrowse all questions from this exam
Question 127

You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs.

You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.

What should you recommend?

    Correct Answer: B

    Azure Databricks is the best choice for performing statistical analysis on near real-time data from Azure Event Hubs with custom proprietary Python functions. Azure Databricks integrates smoothly with Azure Event Hubs and supports real-time data processing using Python, among other languages. It is designed to handle complex analytics and big data processing with low latency, making it suitable for the given scenario. Azure Stream Analytics, on the other hand, primarily uses a SQL-like language and does not natively support Python for custom functions, making it less appropriate for this specific requirement.

Discussion
kolakoneOption: B

My answer will be B Stream Analytics supports "extending SQL language with JavaScript and C# user-defined functions (UDFs)". There is no mention of Python support; hence Stream Analytics is not correct. https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-introduction Azure Databricks supports near real-time data from Azure Event Hubs. And includes support for R, SQL, Python, Scala, and Java. So I will go for option B.

anto69

But Python runs on Event Hubs why the other service does should support Python too?

Aditya0891

It's mentioned that "python runs on real time data from event hubs not on event hubs". Also event hub is to gather that data and after that it is analyzed by either databricks stream analytics. And since stream analytics doesn't support python so the answer is databricks

RoyP654

therefore i agree wih ASA

RoyP654

python can run Event Hubs libraries real time, it doesn't have to be supported by ASA, it just needs to send data to analytics service

ExamDestroyer69

@RoyP654, the question asks which service to perform the statistical analysis (e.g. execute the python) suggesting that the python has not/will not be ran in events hubs

anto69Option: C

I'm sure it's Stream Analytics cause Event Hubs already supports Python (https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-python-get-started-send). We don't need the other service to support it. We just need to lower costs. Hence ASA is the correct solution

RoyP654

the question does not ask which service can run Python, it's asking where to send the data for analytics since Python can run with Event Hubs libraries

prshntdxt7Option: C

chatGPT explains - Azure Stream Analytics is designed for real-time data stream processing and analytics. It can ingest data from various sources, including Azure Event Hubs, and allows you to run near real-time analytics using a SQL-like language. With Stream Analytics, you can easily apply custom Python functions using user-defined functions (UDFs) and achieve low-latency processing. Azure Synapse Analytics and Azure Databricks are powerful analytics services, but they are more suitable for complex analytics and big data processing rather than near real-time, low-latency scenarios. Azure SQL Database is a relational database service and is not specifically designed for real-time stream processing. Therefore, in this case, Azure Stream Analytics is the recommended choice for minimizing latency in statistical analysis on near real-time data from Azure Event Hubs.

HSZOption: C

From ChatGPT, To minimize latency for statistical analysis on near real-time data from Azure Event Hubs, I recommend using Azure Stream Analytics (Option C). Azure Stream Analytics is designed for real-time data processing and can ingest and analyze data from Event Hubs with low latency, making it a suitable choice for this scenario.

mav2000

Also from ChatGPT (GPT4) lol: For processing near real-time data with custom proprietary Python functions and minimizing latency, the best service would be: B. Azure Databricks Here’s why: Azure Databricks is an Apache Spark-based analytics service that integrates smoothly with Azure services such as Azure Event Hubs. It supports real-time streaming data processing and can execute custom Python code, which is necessary for your custom statistical analysis functions. Databricks is designed to handle large-scale data processing and analytics with low latency, making it suitable for near real-time scenarios. The other services have their uses but may not be the optimal choice for this particular scenario

sdg2844Option: C

Simply, they always want the Stream Analytics answer. It's the most straightforward.

evangelistOption: A

C is wrong, stream analytics is a SQL based analytics

e56bb91Option: B

ChatGPT 4o Azure Databricks is well-suited for real-time data processing and analytics. It provides a collaborative environment for working with Apache Spark, which is ideal for performing complex statistical analyses and machine learning tasks in real-time.

KarlGardnerDataEngineeringOption: B

It says we are using python after it is sent to event hubs: "custom proprietary Python functions on near real-time data FROM Azure Event Hubs". Yes, we can send events to Event Hub with python but it says that we are running statistical analysis AFTER we send it to Event Hub. Therefore, my answer is Databricks

eb36a01Option: B

Azure Stream Analytics: Azure Stream Analytics is designed for real-time data processing and can directly ingest data from Azure Event Hubs. However, it has limited support for custom Python functions. It is more suitable for simple real-time analytics and transformations rather than complex statistical analysis with custom code. correct answer: Azure Databricks, we have custom python function

DusicaOption: B

B is correct

poesklapOption: B

Azure Databricks provides a fast and scalable Apache Spark-based analytics platform that supports Python, among other programming languages. It allows you to perform near real-time data processing and analysis efficiently, making it ideal for scenarios where low latency is a priority. Additionally, it offers seamless integration with Azure Event Hubs, enabling you to ingest data in real-time and apply custom Python functions for statistical analysis.

ElancheOption: B

B. Azure Databricks Azure Databricks provides a fully managed Apache Spark-based analytics platform that is well-suited for processing and analyzing real-time streaming data. It offers native integration with Azure Event Hubs, allowing you to ingest data in real-time and apply custom Python functions for statistical analysis with minimal latency. Additionally, Databricks provides scalable compute resources, optimized processing capabilities, and support for various programming languages, making it an ideal choice for near real-time data analysis scenarios.

moneytimeOption: C

C is correct. At near realtime ,the window functions in azure stream analytics can be employed in compute some statical values (e.g count,maximum,min,avg. etc) of the data streaming from the even hub.

Azure_2023Option: B

Corrected!!! FROM Azure Event Hubs, not ON Azure Databricks

Azure_2023Option: D

FROM Azure Event Hubs, not ON

maxCarterOption: B

Azure Databricks

kkk5566Option: B

Databriks supports Python.