Certified Data Engineer Associate Exam - Question 72

Question

A Delta Live Table pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE.

The pipeline is configured to run in Development mode using the Continuous Pipeline Mode.

Assuming previously unprocessed data exists and all definitions are valid, what is the expected outcome after clicking Start to update the pipeline?

Examice · Accepted Answer

When a Delta Live Tables pipeline runs in Development mode using the Continuous Pipeline Mode, all datasets are updated at set intervals until the pipeline is shut down, and the compute resources persist to allow for additional testing. Continuous mode is what keeps the updates running; Development mode is what keeps the cluster alive, because Delta Live Tables reuses the cluster instead of restarting it. The pipeline therefore keeps processing data until it is manually stopped, with the compute resources remaining active throughout.
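The reasoning above treats the two settings as independent: the pipeline mode decides how often datasets are updated, and Development versus Production decides what happens to the compute. A minimal Python sketch of that decision table follows. The names `Outcome` and `expected_outcome` are hypothetical and only illustrate the logic; they are not part of any Databricks API, and the wording of each outcome paraphrases the answer options discussed in this thread.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """Illustrative summary of what happens after clicking Start."""
    updates: str
    compute: str


def expected_outcome(development: bool, continuous: bool) -> Outcome:
    # Pipeline mode (continuous vs. triggered) controls update frequency.
    if continuous:
        updates = "updated at set intervals until the pipeline is shut down"
    else:
        updates = "updated once, then the update stops"

    # Development vs. production controls what happens to the cluster.
    if development:
        compute = "persists to allow for additional testing"
    else:
        compute = "terminated when the pipeline is stopped"

    return Outcome(updates, compute)


# The scenario in this question: Development mode + Continuous Pipeline Mode.
result = expected_outcome(development=True, continuous=True)
print(result.updates)
print(result.compute)
```

Under these assumptions, the combination in the question maps to the outcome described in option E.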

meow_akk · Answer

Answer: E. From the documentation, "Development and production modes":
You can optimize pipeline execution by switching between development and production modes. Use the environment toggle buttons in the Pipelines UI to switch between these two modes. By default, pipelines run in development mode.

When you run your pipeline in development mode, the Delta Live Tables system does the following:

Reuses a cluster to avoid the overhead of restarts. By default, clusters run for two hours when development mode is enabled. You can change this with the pipelines.clusterShutdown.delay setting (see "Configure your compute settings" in the documentation).

Disables pipeline retries so you can immediately detect and fix errors.
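As a rough illustration of where these options live, the sketch below builds a pipeline settings document as a Python dictionary and prints it as JSON. The `development`, `continuous`, and `configuration` fields and the `pipelines.clusterShutdown.delay` key follow the Delta Live Tables pipeline settings as I understand them; the pipeline name and the "60s" delay value are assumptions chosen for the example, so check them against the linked documentation before relying on them.

```python
import json

# Hypothetical pipeline settings for the scenario in this question.
pipeline_settings = {
    "name": "example-pipeline",  # assumed name, for illustration only
    "development": True,         # Development mode: reuse the cluster, no retries
    "continuous": True,          # Continuous Pipeline Mode: keep updating
    "configuration": {
        # Assumed value; overrides the default two-hour cluster lifetime.
        "pipelines.clusterShutdown.delay": "60s",
    },
}

settings_json = json.dumps(pipeline_settings, indent=2)
print(settings_json)
```

Setting `development` to false in a document like this would correspond to production mode, where the cluster is not kept around for further testing.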

In production mode, the Delta Live Tables system does the following:

Restarts the cluster for specific recoverable errors, including memory leaks and stale credentials.

Retries execution in the event of specific errors, for example, a failure to start a cluster.

https://docs.databricks.com/en/delta-live-tables/updates.html#optimize-execution

SD5713 · Answer

E. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

anandpsg101 · Answer

E is correct

55f31c8 · Answer

https://docs.databricks.com/en/delta-live-tables/updates.html#continuous-vs-triggered-pipeline-execution

https://docs.databricks.com/en/delta-live-tables/testing.html#use-development-mode-to-run-pipeline-updates

nedlo · Answer

Why E? Doesn't the compute simply persist with the same functionality as before, rather than specifically for "additional testing"?

kz_data · Answer

E seems the correct answer

Garyn · Answer

E. All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

Explanation:

With the Continuous Pipeline Mode, Delta Live Tables updates datasets at set intervals. The pipeline keeps processing incoming data until it is manually stopped or shut down.

In Development mode, the compute resources, including the cluster used for processing, persist, and pipeline retries are disabled. This persistence allows for additional testing or continued data processing until the pipeline is manually shut down.

Therefore, option E accurately captures the behavior expected in Development mode, emphasizing the continuous update of datasets and the persistence of compute resources until the pipeline is manually terminated.

benni_ale · Answer

E, as the cluster actually persists, unlike in B.

3fbc31b · Answer

The answer is E. The compute resources will persist even after the pipeline is shut down.
