DP-201 Exam - Question 166


HOTSPOT -

Which Azure service and feature should you recommend using to manage the transient data for Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Hot Area:

Correct Answer:

Scenario: Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.

Service: Azure Data Factory -

Clean up files by built-in delete activity in Azure Data Factory (ADF).

The ADF built-in Delete activity can be part of your ETL workflow and deletes undesired files without writing code. You can use ADF to delete folders or files from Azure Blob Storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, File System, FTP Server, SFTP Server, and Amazon S3.

You can delete expired files only rather than deleting all the files in one folder. For example, you may want to only delete the files which were last modified more than 13 days ago.
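As a rough illustration of the "expired files only" filter, the sketch below builds a Delete activity definition as a Python dictionary. The activity name, the dataset name, and the helper `build_delete_activity` are hypothetical; the property names (`storeSettings`, `modifiedDatetimeEnd`, `recursive`, `enableLogging`) follow the documented Delete activity JSON as best understood here and should be checked against the current ADF reference.

```python
from datetime import datetime, timedelta, timezone


def build_delete_activity(dataset_name: str, retention_days: int, now: datetime) -> dict:
    """Return a Delete activity definition that targets files last modified
    before (now - retention_days). All names are illustrative assumptions."""
    cutoff = now - timedelta(days=retention_days)
    return {
        "name": "DeleteExpiredStagingFiles",  # hypothetical activity name
        "type": "Delete",
        "typeProperties": {
            "dataset": {"referenceName": dataset_name, "type": "DatasetReference"},
            "storeSettings": {
                # Read settings type used for Data Lake Storage Gen2 datasets
                "type": "AzureBlobFSReadSettings",
                "recursive": True,
                # Only files modified before this UTC timestamp are selected
                "modifiedDatetimeEnd": cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"),
            },
            "enableLogging": False,
        },
    }


activity = build_delete_activity(
    "StagingInventoryDataset",  # hypothetical dataset name
    retention_days=14,
    now=datetime(2021, 6, 15, tzinfo=timezone.utc),
)
```

In a scheduled pipeline the fixed timestamp would more likely be a dynamic expression (for example, something along the lines of `@addDays(utcNow(), -14)`), so that the cutoff moves with each run; the fixed value here only keeps the sketch self-contained.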

Feature: Delete Activity -

Reference:

https://azure.microsoft.com/sv-se/blog/clean-up-files-by-built-in-delete-activity-in-azure-data-factory/

Design data processing solutions

Discussion

22 comments
AhmedReda
Jun 28, 2020

The question asks to remove files older than 14 days, which I think ADF and the Delete activity cannot do, so the answer might be: (1) Azure Storage (2) Lifecycle management rule

Sai02
Oct 30, 2020

In ADF, the Get Metadata activity exposes the lastModified property, which I believe we can use to delete the files.

bansal_vikrant
May 2, 2020

The files are stored in ADLS Gen2 which supports Life cycle management rules

Psycho
May 19, 2021

https://azure.microsoft.com/en-au/updates/lifecycle-management-for-azure-data-lake-storage-is-now-generally-available/

vrmei
Jun 23, 2021

Yes, This is correct.

syu31svc
Dec 10, 2020

https://azure.microsoft.com/en-us/updates/lifecycle-management-for-azure-data-lake-storage-is-now-generally-available/ Azure storage and lifecycle management rule are the answers

dcpavelescu
Jul 31, 2020

Answer shall be: Azure Storage & Lifecycle management https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal

kempstonjoystick
Apr 1, 2020

Would lifecycle management not fulfill this as well?

HCL1991
Apr 9, 2020

This function is only available for Azure blob storage and not for ADLS. So the only possible answer is ADF.

envy
Jul 16, 2020

the function seems available for ADLS: https://github.com/MicrosoftDocs/azure-docs/issues/42140

M0e
Oct 26, 2020

In Gen 2 it supports all the features of Blob Storage including lifecycle management.

BungyTex
Dec 10, 2020

I just tested this in my ADLS Gen2 account; I can set a rule to delete files last modified more than 14 days ago.
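For reference, a lifecycle management rule of the kind discussed in this thread might look roughly like the policy built below. This is a minimal sketch: the rule name and the `staging/inventory` prefix are hypothetical, and the JSON shape follows the documented lifecycle policy schema as best understood here.

```python
import json


def build_lifecycle_policy(prefix: str, days: int) -> dict:
    """Return a lifecycle policy that deletes block blobs under `prefix`
    once they have gone `days` days without modification."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-transient-staging-data",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # Prefix starts with the container name (assumed path)
                        "prefixMatch": [prefix],
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy("staging/inventory", days=14)
policy_json = json.dumps(policy, indent=2)  # text to paste into the portal's code view
```

The resulting JSON could be applied through the portal's lifecycle management code view or a management tool of choice; how it is deployed is left open here.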

davita8
Apr 30, 2021

Azure storage lifecycle management

mohowzeh
Jan 14, 2021

Seems to me that there are two valid combinations: (Azure Data Factory, delete activity) and (Azure storage, Lifecycle management)

memo43
May 23, 2021

and second one the easiest!!

felmasri
Mar 13, 2021

Azure Data Lake Storage lifecycle management is now generally available https://azure.microsoft.com/en-us/updates/lifecycle-management-for-azure-data-lake-storage-is-now-generally-available/

runningman
May 1, 2020

"Litware wants to remove transient data from Data Lake Storage once the data is no longer in use." wouldn't that make B Azure Storage a better answer?

runningman
May 1, 2020

Never mind. if Delete Activity is correct (which i believe it is), the first answer has to be ADF.

Ash666
Jul 23, 2020

ADLS GEN 2 doesn’t yet support delete blob action https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-known-issues

NikP
Aug 6, 2020

Now, Lifecycle management is supported for accounts that have a hierarchical namespace for General-purpose V2. With this, you can reduce the delete activity (less cost even it is negligible for a pipeline). However, I would prefer to use delete activity in ADF to make sure that they got deleted after I load them to database. Better than auto delete through lifecycle. For me, given answer is correct based on requirement.

ThijsN
Jan 16, 2021

Both ADF with Delete and storage with lifecycle management will work. I literally built the latter this week. I think that is the best solution, as it is the cheapest and easiest. It doesn't cost anything to run, build, or maintain.

NasRim
Feb 22, 2021

lifecycle management is available in ADLS from July 31, 2020 https://azure.microsoft.com/en-us/updates/lifecycle-management-for-azure-data-lake-storage-is-now-generally-available/

Needium
Mar 8, 2021

The preferred option should be Azure Storage and lifecycle management rule

kn_shn
Jun 25, 2021

From older comments, ADF + Delete and Azure Storage + Lifecycle management rule seem to have similar functionality to remove files. However, there is a difference: Lifecycle is defined based on the creation of the file, and in this question and context, it says: "Files that have a modified date that is older than 14 days must be removed", i.e. the file removal is based on the modified date. As BungyTex confirmed below, ADF + Delete can achieve this objective and the answer is correct.

PhuVu
May 11, 2020

ADLS still does not support lifecycle management, so the given answer is correct. https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-known-issues

Manue
May 19, 2020

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction "Blob storage features such as diagnostic logging, access tiers, and Blob Storage lifecycle management policies now work with accounts that have a hierarchical namespace. Therefore, you can enable hierarchical namespaces on your Blob storage accounts without losing access to these features."

M0e
Oct 26, 2020

From the shared document: "Lifecycle management policies are supported only on general-purpose v2 accounts. They aren't yet supported in premium BlockBlobStorage storage accounts."

KasiaK
Dec 18, 2020

Lifecycle management policies (delete blob): Generally available in Premium, Generally available in Standard https://docs.microsoft.com/pl-pl/azure/storage/blobs/data-lake-storage-supported-blob-storage-features

lky17
Mar 7, 2021

The correct answer should be Azure Storage and Lifecycle management, because ADLS Gen2 lets you delete any file. The only exception is: "If you use the Delete Blob API to delete a directory, that directory will be deleted only if it's empty. This means that you can't use the Blob API delete directories recursively." It also supports all lifecycle management operations except: "Lifecycle management policies with premium tier for Azure Data Lake Storage. You can't move data that's stored in the premium tier between hot, cool, and archive tiers. However, you can copy data from the premium tier to the hot access tier in a different account." Ref https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-known-issues

savin
Jun 6, 2021

Azure storage lifecycle management should be easier option

Dymize
May 28, 2021

The way I see this, if the inventory data is coming from a Microsoft SQL Server, it is being ingested by ADF and not by Azure Storage, and if ADF is in use, then the Delete activity should be used. As per other comments, this is proven to work.

hoangton
Jun 29, 2021

Given answer is correct (1)ADF (2)Delete activity