Exam: Certified Data Engineer Professional
Question 114

A data team’s Structured Streaming job is configured to calculate running aggregates for item sales to update a downstream marketing dashboard. The marketing team has introduced a new promotion, and they would like to add a new field to track the number of times this promotion code is used for each item. A junior data engineer suggests updating the existing query as follows. Note that proposed changes are in bold.

Original query:

Proposed query:

Which step must also be completed to put the proposed query into production?

    Correct Answer: A

    Adding a new aggregate changes the schema of the state that this stateful query stores in its checkpoint. Structured Streaming assumes the state schema stays the same across restarts and does not allow the number or type of aggregates to change between restarts from the same checkpoint. The proposed query must therefore be started with a new checkpoint location, so that it begins with fresh state matching the updated schema instead of trying to restore state written by the original query.
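    A minimal sketch of what this can look like in practice, assuming PySpark. The question's queries are not shown here, so every name below is hypothetical: the columns `item`, `sales`, and `promo_code`, the table `item_agg`, the `/tmp/checkpoints` base path, and both helper functions. The idea is only that the changed query gets its own, versioned checkpoint directory.

```python
def checkpoint_path(base: str, query_name: str, version: int) -> str:
    """Build a versioned checkpoint directory, e.g. /tmp/checkpoints/item_agg/v2.

    Bumping `version` whenever the stateful logic changes gives the new
    query a fresh checkpoint (hypothetical naming convention).
    """
    return f"{base.rstrip('/')}/{query_name}/v{version}"


def start_item_agg(stream_df, version: int, base: str = "/tmp/checkpoints"):
    """Start the aggregate query against a versioned checkpoint location.

    `stream_df` is assumed to be a streaming DataFrame with the hypothetical
    columns `item`, `sales`, and `promo_code`.
    """
    # Imported here so the path helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    agg = stream_df.groupBy("item").agg(
        F.sum("sales").alias("total_sales"),
        # New aggregate: count() over a column counts its non-null values.
        F.count("promo_code").alias("promo_uses"),
    )
    return (
        agg.writeStream
        .outputMode("complete")
        .option("checkpointLocation", checkpoint_path(base, "item_agg", version))
        .toTable("item_agg")  # hypothetical sink table
    )


# The original query would have used version=1; the proposed one uses version=2.
print(checkpoint_path("/tmp/checkpoints", "item_agg", 2))
```

    Because the new checkpoint holds no prior state, the running aggregates are rebuilt from whatever data the source replays to the restarted query.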

Discussion
Deb9753 - Option: A

Answer: A. When updating the schema of a streaming job, specifying a new checkpoint location ensures that the streaming query starts fresh with the new schema. This avoids issues that might arise from schema mismatches between the previous state and the new schema. This is especially relevant when adding new fields because the existing state might not be compatible with the new schema.

MDWPartners - Option: A

This checkpoint location preserves all of the essential information that identifies a query. Each query must have a different checkpoint location. Multiple queries should never have the same location. For more information, see the Structured Streaming Programming Guide. https://docs.databricks.com/en/structured-streaming/query-recovery.html
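
The "one checkpoint location per query" rule can be checked before any query is started. A rough sketch in plain Python, where the helper name and the example paths are made up for illustration and are not part of any Spark or Databricks API:

```python
from collections import Counter
from typing import Dict, List


def duplicate_checkpoints(locations: Dict[str, str]) -> List[str]:
    """Return checkpoint paths that are assigned to more than one query.

    `locations` maps a query name to its checkpoint path. Trailing slashes
    are ignored so that '/chk/a' and '/chk/a/' count as the same location.
    """
    normalized = [path.rstrip("/") for path in locations.values()]
    counts = Counter(normalized)
    return sorted(path for path, n in counts.items() if n > 1)


# Two queries accidentally sharing a location are reported.
print(duplicate_checkpoints({
    "item_agg_v1": "/tmp/checkpoints/item_agg",
    "item_agg_v2": "/tmp/checkpoints/item_agg/",
}))
```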