Exam: Certified Data Engineer Professional
Question 10

A Delta table of weather records is partitioned by date and has the below schema: date DATE, device_id INT, temp FLOAT, latitude FLOAT, longitude FLOAT

To find all the records from within the Arctic Circle, you execute a query with the below filter: latitude > 66.3

Which statement describes how the Delta engine identifies which files to load?

    Correct Answer: D

    The Delta engine identifies which files to load by scanning the Delta log for min and max statistics for the latitude column. Delta Lake captures statistics for each data file, including minimum and maximum values in each column, and uses these statistics to optimize query execution by determining which files may contain the data that matches the query filter.
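As a rough illustration of the pruning described above, here is a minimal Python sketch. It is not Delta's actual implementation: the `FileStats` class, the `files_to_scan` helper, the file paths, and the latitude ranges are all hypothetical, chosen only to show how per-file min and max values let an engine rule files out for the filter latitude > 66.3.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical, simplified stand-in for one file entry in the Delta log."""
    path: str
    min_latitude: float
    max_latitude: float


def files_to_scan(files, threshold):
    """Keep only files that might hold a row with latitude > threshold."""
    # A file can be skipped when even its largest latitude fails the filter.
    return [f.path for f in files if f.max_latitude > threshold]


# Made-up entries; a real table records these ranges in its transaction log.
log_entries = [
    FileStats("date=2024-01-01/part-0.parquet", -12.5, 40.1),
    FileStats("date=2024-01-01/part-1.parquet", 55.0, 71.9),
    FileStats("date=2024-01-02/part-0.parquet", 67.2, 89.0),
]

print(files_to_scan(log_entries, 66.3))
```

In this sketch the first file is skipped because its maximum latitude (40.1) cannot satisfy the filter. The date partitioning plays no part, since the filter does not mention the date column.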

Discussion
taif12340 Option: D

Answer D: In the transaction log, Delta Lake captures statistics for each data file of the table. These statistics indicate, per file:
- Total number of records
- Minimum value in each of the first 32 columns of the table
- Maximum value in each of the first 32 columns of the table
- Null value count in each of the first 32 columns of the table

When a query with a selective filter is executed against the table, the query optimizer uses these statistics to identify data files that may contain records matching the filter, and skips the rest. For the query in the question, the transaction log is scanned for min and max statistics for the latitude column.
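The per-file statistics described above can be sketched in plain Python. This is an assumption-laden illustration, not Delta's code: `collect_file_stats` and its `num_indexed_cols` parameter are hypothetical names, the sample rows are made up, and the date column is left out of the rows for brevity.

```python
def collect_file_stats(rows, columns, num_indexed_cols=32):
    """Compute Delta-style statistics for the rows of one data file.

    rows: list of dicts, one per record.
    columns: column names in schema order.
    """
    indexed = columns[:num_indexed_cols]  # only the leading columns get stats
    stats = {"numRecords": len(rows), "minValues": {}, "maxValues": {}, "nullCount": {}}
    for col in indexed:
        values = [r[col] for r in rows if r.get(col) is not None]
        stats["nullCount"][col] = len(rows) - len(values)
        if values:  # an all-null column has no min or max
            stats["minValues"][col] = min(values)
            stats["maxValues"][col] = max(values)
    return stats


# Made-up sample rows for a single file.
rows = [
    {"device_id": 1, "temp": -20.5, "latitude": 70.1, "longitude": 25.0},
    {"device_id": 2, "temp": None, "latitude": 68.4, "longitude": 18.9},
]
columns = ["device_id", "temp", "latitude", "longitude"]

stats = collect_file_stats(rows, columns)
print(stats["minValues"]["latitude"], stats["maxValues"]["latitude"])
```

Passing a smaller `num_indexed_cols` (for example 2) would leave latitude without statistics in this sketch, which is why the column's position in the schema matters for skipping.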

RiktRikt007 Option: D

I checked the Delta log, and it does store stats: "stats":"{\"numRecords\":1,\"minValues\":{\"id\":1,\"name\":\"one\",\"age\":11},\"maxValues\":{\"id\":1,\"name\":\"one\",\"age\":11},\"nullCount\":{\"id\":0,\"name\":0,\"age\":0}}"
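To show how a stats entry like the one quoted above could be read, here is a small Python sketch. It assumes a simplified "add" action: the path is a hypothetical placeholder and other fields of a real log entry are omitted. The stats values are the ones from the comment.

```python
import json

# One simplified "add" action line, as it might appear in a _delta_log commit file.
# The path is a hypothetical placeholder; the stats values are those quoted above.
log_line = json.dumps({
    "add": {
        "path": "part-00000.snappy.parquet",
        "stats": json.dumps({
            "numRecords": 1,
            "minValues": {"id": 1, "name": "one", "age": 11},
            "maxValues": {"id": 1, "name": "one", "age": 11},
            "nullCount": {"id": 0, "name": 0, "age": 0},
        }),
    }
})

action = json.loads(log_line)
# "stats" is itself a JSON-encoded string, so it needs a second decode.
stats = json.loads(action["add"]["stats"])
print(stats["minValues"]["age"], stats["maxValues"]["age"])
```

The double decode mirrors the escaped quotes in the quoted snippet: the statistics are stored as a JSON string nested inside the JSON log entry.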

03355a2 Option: D

No explanation needed, this is where the information is stored.

imatheushenrique Option: D

D. The Delta log is scanned for min and max statistics for the latitude column

coercion Option: D

The Delta log collects statistics such as min value, max value, and number of records for each file written by a transaction, for the first 32 columns (the default).

Tayari Option: D

D is the answer

arik90 Option: D

Based on the documentation, it is D. I don't know why B is shown here.

alexvno Option: D

Delta log first

DavidRou Option: D

Statistics on first 32 columns of a table are computed and written in the Delta Log by default.

vikram12apr Option: D

D is the right answer

Curious76 Option: D

D is the answer

kkravets Option: D

D is the correct one

AziLa Option: D

Correct answer is D

Jay_98_11 Option: D

D for sure

kz_data Option: D

I think the correct answer is D

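ranith Option: D

_delta_log contains the min and max of each column for the first 32 columns of a table (by default), recorded per data file rather than per partition. Parquet file footers do exist and carry their own row-group statistics, but Delta consults the log first to decide which files to open. Correct answer is D.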

lexaneon Option: D

D https://www.databricks.com/discover/pages/optimize-data-workloads-guide#:~:text=Delta%20data%20skipping%20automatically%20collects,to%20speed%20up%20the%20queries.