Certified Data Engineer Professional Exam - Question 141

Question

A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure.

The silver_device_recordings table will be used downstream to power several production monitoring dashboards and a production model. At present, 45 of the 100 fields are being used in at least one of these applications.

The data engineer is trying to determine the best approach for dealing with schema declaration given the highly-nested structure of the data and the numerous fields.

Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?

Examice · Accepted Answer

Because Databricks will infer schema using types that allow all observed data to be processed, setting types manually provides greater assurance of data quality enforcement. Inference picks types permissive enough to accept every value it sees, so a malformed value can silently widen a field's type instead of being caught. Declaring the types explicitly makes such values surface as errors or nulls, which matters when the table feeds production monitoring dashboards and a production model.
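To make the difference concrete, here is a rough pure-Python sketch of the idea. It is a toy model, not Spark's or Databricks' actual inference code: the widening order, the helper functions, and the field names device_id and temperature are all made up for illustration. It only shows how "pick a type that fits everything observed" can hide a bad value that a declared schema would flag.

```python
# Toy illustration only: this is NOT Spark's or Databricks' inference code.
# It mimics the general idea that inference widens a field's type until
# every observed value fits, while a declared schema flags bad values.

WIDENING_ORDER = ["int", "float", "str"]  # hypothetical, narrowest to widest


def observed_type(value):
    """Return the toy type name for a single observed value."""
    if isinstance(value, bool):
        return "str"  # toy simplification: anything not numeric is text
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "str"


def infer_schema(records):
    """Pick, per field, the widest type seen across all records."""
    schema = {}
    for record in records:
        for field, value in record.items():
            seen = observed_type(value)
            current = schema.get(field, seen)
            schema[field] = max(current, seen, key=WIDENING_ORDER.index)
    return schema


def violations(records, declared):
    """List (row index, field, value) entries that break a declared schema."""
    bad = []
    for i, record in enumerate(records):
        for field, expected in declared.items():
            value = record.get(field)
            if value is not None and observed_type(value) != expected:
                bad.append((i, field, value))
    return bad


# Hypothetical sample data; the second reading is malformed.
records = [
    {"device_id": 1, "temperature": 21.5},
    {"device_id": 2, "temperature": "N/A"},
]

inferred = infer_schema(records)
declared = {"device_id": "int", "temperature": "float"}
bad_rows = violations(records, declared)
```

In this sketch the inferred schema quietly turns temperature into a string so that "N/A" fits, while the declared schema reports that row as a violation. That is the trade-off the accepted answer describes.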

hpkr · Answer

D is correct

Isio05 · Answer

Agree with proposed answer, D
