Certified Data Engineer Associate Exam - Question 19


A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".

Today, the data engineer runs the following command to complete this task:

After running the command today, the data engineer notices that the number of records in table transactions has not changed.

Which of the following describes why the statement might not have copied any new records into the table?

Correct Answer: C

The statement might not have copied any new records into the table because the previous day's file has already been copied into the table. The COPY INTO command is idempotent, meaning it only loads files that haven’t been loaded before, skipping those which have already been processed. As a result, if the data from the previous day has already been copied, running the command again will not change the number of records in the table.
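The skip-already-loaded behavior described above can be sketched with a toy model in Python. This is only an illustration of idempotent, file-level tracking, not how Databricks implements COPY INTO: the `ToyTable` class, the `copy_into` function, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a table that remembers which source files it has loaded."""
    rows: list = field(default_factory=list)
    loaded_files: set = field(default_factory=set)


def copy_into(table, source_files):
    """Load each file at most once; return the number of rows added by this call.

    source_files maps a file name to its list of rows (hypothetical layout).
    """
    added = 0
    for name in sorted(source_files):
        if name in table.loaded_files:
            continue  # already loaded by an earlier run, so skip it
        table.rows.extend(source_files[name])
        table.loaded_files.add(name)
        added += len(source_files[name])
    return added


transactions = ToyTable()
raw = {"2023-04-03.csv": [("order-1", 10.0), ("order-2", 25.5)]}

first_run = copy_into(transactions, raw)    # loads the file
second_run = copy_into(transactions, raw)   # same file again: nothing to do

raw["2023-04-04.csv"] = [("order-3", 7.25)]
third_run = copy_into(transactions, raw)    # only the new file is loaded

print(first_run, second_run, third_run, len(transactions.rows))
```

In this sketch the second run adds no rows because the only file in the source location was already recorded as loaded, which mirrors the situation in the question. Note that the tracking is per file, not per row.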

Discussion

17 comments
ezeik Option: E
Sep 24, 2023

E is the correct answer because, immediately after running COPY INTO, you might be querying the cached version of the table.

XiltroX Option: C
Apr 4, 2023

Option C is the correct answer.

knivesz Option: C
Apr 4, 2023

Answer C, by elimination: A) It is not necessary. B) FILES does not have to be specified. D) PARQUET is supported. E) There is no need to refresh the view, since a file is being copied.

Nika12 Option: C
Jan 27, 2024

Just got 100% on the test. C was correct.

sdas1 Option: C
Apr 4, 2023

option C

mimzzz Option: C
Apr 4, 2023

I am not sure whether C is the correct answer, but A is definitely not right.

Varma_Saraswathula Option: C
Apr 21, 2023

C- https://docs.databricks.com/ingestion/copy-into/tutorial-notebook.html Because this action is idempotent, you can run it multiple times but data will only be loaded once.

testdb Option: B
May 18, 2023

Answer: B FILES = ('f1.json', 'f2.json', 'f3.json', 'f4.json', 'f5.json') https://docs.databricks.com/ingestion/copy-into/examples.html

[Removed]
May 24, 2023

The correct answer is C. Naming specific files with the keyword "FILES" is optional, as the COPY INTO syntax shows: [ FILES = ( file_name [, ...] ) | PATTERN = glob_pattern ]. When FILES is not used in the statement, all files in the directory are loaded, each only once (because the operation is idempotent).

junction Option: C
May 29, 2023

COPY INTO Loads data from a file location into a Delta table. This is a retriable and idempotent operation—files in the source location that have already been loaded are skipped.

Atnafu Option: C
Jul 7, 2023

C The COPY INTO statement copies the data from the specified files into the target table. If the previous day's file has already been copied into the table, then the COPY INTO statement will not copy any new records into the table.

AndreFR Option: C
Aug 18, 2023

https://docs.databricks.com/en/ingestion/copy-into/index.html The COPY INTO SQL command lets you load data from a file location into a Delta table. This is a re-triable and idempotent operation; files in the source location that have already been loaded are skipped. If there are no new records, the only consistent choice is C: no new files were loaded because already loaded files were skipped.

KalavathiP Option: C
Sep 26, 2023

C is the correct answer.

DavidRou Option: C
Oct 9, 2023

The COPY INTO statement skips files that have already been loaded.

kishanu Option: E
Oct 15, 2023

If the table "transactions" is an external table, then option E applies; if it's an internal (managed) table, C should suffice.

awofalus Option: C
Nov 7, 2023

C is correct

Garyn Option: C
Dec 29, 2023

C. The previous day’s file has already been copied into the table. The COPY INTO statement loads data from files in a location into a table and skips files it has already loaded. If each day's sales arrive in a distinct file in "/transactions/raw" and the number of records in the "transactions" table has not changed after today's run, the most likely explanation is that the previous day's file had already been loaded by an earlier run. Options A, B, D, and E don't explain why the statement copied no new records in this scenario.

SerGrey Option: C
Jan 4, 2024

Correct answer is C