
Certified Data Engineer Associate Exam - Question 75


A data engineer has joined an existing project and they see the following query in the project repository:

CREATE STREAMING LIVE TABLE loyal_customers AS
SELECT customer_id
FROM STREAM(LIVE.customers)
WHERE loyalty_level = 'high';

Which of the following describes why the STREAM function is included in the query?

Correct Answer: C

The STREAM function is included because the customers table is itself a streaming live table. Wrapping the source in STREAM(LIVE.customers) tells Delta Live Tables to perform an incremental, streaming read of that table, processing new rows as they arrive, which is what a streaming live table such as loyal_customers requires.
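To illustrate the distinction, here is a minimal sketch in Delta Live Tables SQL. The loyal_customers_snapshot table is hypothetical and added only for contrast; the streaming version is the one from the question.

-- Streaming read: STREAM(LIVE.customers) consumes the customers streaming live table
-- incrementally, processing new rows as they arrive.
CREATE OR REFRESH STREAMING LIVE TABLE loyal_customers AS
SELECT customer_id
FROM STREAM(LIVE.customers)
WHERE loyalty_level = 'high';

-- For comparison, a non-streaming live table (materialized view) references the source
-- directly, which performs a complete read on each update instead of an incremental one.
CREATE OR REFRESH LIVE TABLE loyal_customers_snapshot AS
SELECT customer_id
FROM LIVE.customers
WHERE loyalty_level = 'high';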

Discussion

6 comments
meow_akk (Option: C)
Oct 22, 2023

Ans C is correct: https://docs.databricks.com/en/sql/load-data-streaming-table.html

From the docs, "Load data into a streaming table": to create a streaming table from data in cloud object storage, paste the following into the query editor, and then click Run:

/* Load data from a volume */
CREATE OR REFRESH STREAMING TABLE <table-name> AS
SELECT * FROM STREAM read_files('/Volumes/<catalog>/<schema>/<volume>/<path>/<folder>')

/* Load data from an external location */
CREATE OR REFRESH STREAMING TABLE <table-name> AS
SELECT * FROM STREAM read_files('s3://<bucket>/<path>/<folder>')

cxw23 (Option: A)
Jan 3, 2024

Ans is A. The CREATE STREAMING LIVE TABLE syntax does not exist. It should be CREATE LIVE TABLE AS SELECT * FROM STREAM.

bartfto
Jan 10, 2024

LIVE references the schema name; customer_table references the table name.

azure_bimonster (Option: C)
Jan 20, 2024

C is correct

OfficeSaracus (Option: D)
Apr 27, 2024

Option E, specifying "at least one notebook library to be executed," is not a requirement for setting up a Delta Live Tables pipeline. Delta Live Tables runs on Databricks and uses notebooks to define the pipeline's logic, but the actual requirement when setting up the pipeline is typically the location where the data will be written, such as a target database or a path to cloud storage. While notebooks may contain the business logic for the transformations and actions within the pipeline, the fundamental requirement for setting up a pipeline is knowing where the data will reside after processing, which is why the location of the target database for the written data is crucial.

THC1138
May 25, 2024

Wrong question, that's for #73

THC1138
May 25, 2024

I mean question #74

benni_ale (Option: C)
Apr 29, 2024

c is ok

benni_ale (Option: C)
Apr 29, 2024

C is correct. About D: it could be correct, but it is not a given that the query comes from PySpark; SQL (at least in Databricks) also supports creating streaming live tables, so it is not necessarily from PySpark.
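For what it's worth, here is a sketch of how both tables could be declared entirely in SQL inside a Delta Live Tables pipeline, which supports the point that STREAM does not imply a PySpark source. The ingestion path and file format below are assumptions, not from the question.

-- Hypothetical ingestion of the source table, defined in SQL with Auto Loader.
CREATE OR REFRESH STREAMING LIVE TABLE customers AS
SELECT * FROM cloud_files('/mnt/raw/customers', 'json');  -- path and format are assumed

-- The downstream table from the question, also defined in SQL.
CREATE STREAMING LIVE TABLE loyal_customers AS
SELECT customer_id
FROM STREAM(LIVE.customers)
WHERE loyalty_level = 'high';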