Exam: Certified Data Engineer Associate
Question 65

A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:

DROP TABLE IF EXISTS my_table;

After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.

Which of the following describes why all of these files were deleted?

    Correct Answer: A

    The table was managed, which means that Spark SQL manages both the table's metadata and its data files. When a managed table is dropped, Spark SQL removes the metadata and deletes the associated data files from the file system. For an external table, by contrast, only the metadata is removed and the data files remain in the file system.
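The difference in drop behavior can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's implementation: `ToyMetastore`, its methods, the `ext_table` name, and the file name `part-00000.parquet` are all hypothetical, and the "metastore" is just a dictionary plus a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


class ToyMetastore:
    """Toy stand-in for a metastore; not Spark's actual implementation."""

    def __init__(self, warehouse_dir: Path):
        self.warehouse_dir = warehouse_dir
        self.tables = {}  # table name -> (data location, is_managed)

    def create_table(self, name, location=None):
        # No explicit location: treat the table as managed and keep its
        # data under the warehouse directory.
        is_managed = location is None
        path = self.warehouse_dir / name if is_managed else Path(location)
        path.mkdir(parents=True, exist_ok=True)
        (path / "part-00000.parquet").write_text("fake data")
        self.tables[name] = (path, is_managed)

    def drop_table_if_exists(self, name):
        # The metadata entry is removed for both table types.
        entry = self.tables.pop(name, None)
        if entry is None:
            return  # IF EXISTS: dropping a missing table is a no-op
        path, is_managed = entry
        if is_managed:
            shutil.rmtree(path)  # data files are deleted only when managed


root = Path(tempfile.mkdtemp())
store = ToyMetastore(root / "warehouse")
store.create_table("my_table")  # managed
store.create_table("ext_table", location=root / "external" / "ext_table")

managed_path = store.tables["my_table"][0]
external_path = store.tables["ext_table"][0]

store.drop_table_if_exists("my_table")
store.drop_table_if_exists("ext_table")

print(managed_path.exists())   # False: managed data was deleted
print(external_path.exists())  # True: external data was retained
```

In both cases the metadata entry disappears; only the managed table loses its data directory, which matches the behavior described in the question.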

Discussion
meow_akk (Option: A)

A is correct. For a managed table, both the files and the metadata are managed by the metastore and are deleted when the table is dropped. For an external table, the data is stored in an external location, so when an external table is dropped only the metadata is cleared and the data files remain.

kz_data (Option: A)

A is correct

benni_ale (Option: A)

A is correct

UGOTCOOKIES (Option: A)

There are two types of tables, managed and external. Both table types are treated the same, except when the table is dropped. For a managed table, the data is stored in the managed storage location that is configured for the metastore. By default this is dbfs:/user/hive/warehouse. When the table is dropped, the metadata and the underlying data are deleted. For external tables, the data is stored in a cloud storage location outside of the managed storage location. When an external table is dropped, the underlying data is retained and only the metadata is dropped.
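One way to check which case applies before dropping a table is to look at the Type row that DESCRIBE TABLE EXTENDED reports (MANAGED or EXTERNAL). The helper below is a rough sketch that assumes the output has already been collected as (col_name, data_type, comment) tuples; `drop_effect` is a hypothetical name, and the sample rows, including the `my_table` subdirectory and the `s3://example-bucket` path, are made up for illustration.

```python
def drop_effect(describe_rows):
    """Return what DROP TABLE would remove, based on the table's Type row."""
    # Build a lookup from the first two fields, skipping blank separator rows.
    info = {row[0].strip(): row[1].strip() for row in describe_rows if row[0]}
    table_type = info.get("Type", "").upper()
    if table_type == "MANAGED":
        return "metadata and data files"
    if table_type == "EXTERNAL":
        return "metadata only"
    return "unknown"


# Hypothetical rows in the shape of DESCRIBE TABLE EXTENDED output.
managed_rows = [
    ("Type", "MANAGED", ""),
    ("Location", "dbfs:/user/hive/warehouse/my_table", ""),
]
external_rows = [
    ("Type", "EXTERNAL", ""),
    ("Location", "s3://example-bucket/ext_table", ""),
]

print(drop_effect(managed_rows))   # metadata and data files
print(drop_effect(external_rows))  # metadata only
```

The helper relies on the Type row rather than on the path, since the location alone does not say whether a table is managed.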

Garyn (Option: A)

A. The table was managed. Explanation: In Spark SQL, when a table is managed (or internal), both the table's metadata and its data files are managed by the SQL engine. Running DROP TABLE on a managed table therefore deletes not only the metadata but also the underlying data files from the file system. Options B, C, D, and E do not explain why both the data files and the metadata files were deleted; the fact that the table was managed is the reason.