
Professional Machine Learning Engineer Exam - Question 93


You have been asked to build a model using a dataset that is stored in a medium-sized (~10 GB) BigQuery table. You need to quickly determine whether this data is suitable for model development. You want to create a one-time report that includes both informative visualizations of data distributions and more sophisticated statistical analyses to share with other ML engineers on your team. You require maximum flexibility to create your report. What should you do?

Correct Answer: A

For building a model using a dataset stored in a medium-sized BigQuery table and quickly determining its suitability for model development, Vertex AI Workbench user-managed notebooks will provide maximum flexibility. They allow the creation of one-time reports that include informative visualizations of data distributions and sophisticated statistical analyses. Using Python libraries such as pandas, matplotlib, and seaborn in the notebook environment gives comprehensive and customizable statistical analysis capabilities, making it ideal for sharing detailed insights with other ML engineers.
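As a rough illustration of the kind of notebook analysis described above, here is a minimal sketch, not a definitive implementation. The helper names (`load_sample`, `profile`, `plot_distributions`) and the demo columns are hypothetical, and the synthetic DataFrame only stands in for a sample of the real table. `load_sample` assumes the `google-cloud-bigquery` client with its pandas support is installed and credentials are configured; `plot_distributions` assumes matplotlib is available.

```python
import numpy as np
import pandas as pd


def load_sample(sql: str) -> pd.DataFrame:
    """Run a query in BigQuery and return the result as a DataFrame.

    Assumption: google-cloud-bigquery (with pandas support) is installed
    and credentials are configured in the notebook environment.
    """
    from google.cloud import bigquery  # lazy import; unused in the demo below

    return bigquery.Client().query(sql).to_dataframe()


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column summary: dtype, missing rate, cardinality, skewness."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_rate": df.isna().mean(),
        "n_unique": df.nunique(),
    })
    # Skewness is computed for numeric columns only; other rows stay NaN.
    summary["skew"] = df.select_dtypes("number").skew()
    return summary


def plot_distributions(df: pd.DataFrame):
    """Draw one histogram per numeric column (requires matplotlib)."""
    return df.select_dtypes("number").hist(bins=30, figsize=(10, 6))


# Synthetic stand-in for a sample of the table (hypothetical columns).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "amount": rng.lognormal(mean=3.0, sigma=1.0, size=1_000),
    "age": rng.integers(18, 90, size=1_000).astype(float),
    "segment": rng.choice(["a", "b", "c"], size=1_000),
})
demo.loc[:99, "age"] = np.nan  # inject 10% missing values into one column

report = profile(demo)
print(report)
```

In a notebook, `profile(load_sample(...))` would replace the synthetic data, and `plot_distributions` could be swapped for seaborn plots or any other statistical test the report needs, which is the flexibility the explanation refers to.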

Discussion

17 comments
JamesDoe (Option: A)
Mar 28, 2023

I think it's A. This is a one-time report containing statistical measurements of the real dataset to tell whether the data is suitable for model development, and the target audience is other ML engineers. Getting a whole report of exactly this with TFDV/Facets takes about two lines of code (https://www.tensorflow.org/tfx/data_validation/get_started). A similar Data Studio report would take a lot of time and work, with no benefit from reusability since the task is a one-time job.

JamesDoe
Mar 28, 2023

Depending on your definition of "You require maximum flexibility to create your report.", it could very well be B too.

frangm23 (Option: B)
Apr 19, 2023

I think it has to be B. One key is that the question says "quickly", and BigQuery makes it very easy to export a query to Looker Studio. The other is that there is maximum flexibility within the needs of this case (informative visualizations plus statistical analysis), as we can develop and write custom formulas. A feels like overkill: using a Deep Learning VM image only to describe data and perform some analysis. C also feels like overkill, starting to develop a neural net for that. As for D, although you could use Dataprep for this, it is less suited than A.

CloudKida (Option: C)
May 9, 2023

TensorFlow Data Validation (TFDV) can compute descriptive statistics that provide a quick overview of the data in terms of the features that are present and the shapes of their value distributions. Tools such as Facets Overview can provide a succinct visualization of these statistics for easy browsing.

PST21 (Option: A)
Jun 28, 2023

The correct answer is A. While Google Data Studio (Option B) is a powerful data visualization and reporting tool, it might not provide the same level of flexibility and sophistication for statistical analyses as a notebook environment.

lalala_meow (Option: A)
Sep 24, 2023

A for more sophisticated statistical analyses and maximum flexibility

Krish6488 (Option: A)
Nov 11, 2023

Looker Studio is good too, but it does not give the same depth of statistical analysis as matplotlib, seaborn, etc. in a notebook. So a JupyterLab notebook, a.k.a. Vertex AI Workbench, for me.

gscharly (Option: A)
Apr 14, 2024

More flexibility

kucuk_kagan (Option: A)
Mar 31, 2023

I recommend option A because Vertex AI Workbench user-managed notebooks provide more flexibility and customization for analyzing and visualizing the data in the BigQuery table. Using Python libraries (pandas, matplotlib, seaborn, etc.), you can create visualizations of data distributions and perform more complex statistical analyses.

lucaluca1982 (Option: A)
Apr 30, 2023

A. Flexibility is the key.

SamuelTsch (Option: A)
Jul 8, 2023

Went with A because of maximum flexibility.

NickHapton (Option: A)
Jul 8, 2023

1. One-time 2. Flexibility. Go for A.

[Removed] (Option: A)
Jul 23, 2023

The answer is A. B is wrong because you need more sophisticated statistical analyses and maximum flexibility to create your report.

andresvelasco (Option: A)
Sep 10, 2023

A (AI workbench): "sophisticated"

MCorsetti (Option: A)
Oct 22, 2023

A, as it is a one-off report with maximum flexibility. You don't need a dashboard unless it will be reused.

Mickey321 (Option: A)
Nov 15, 2023

Max flexibility

SubbuJV (Option: A)
Feb 15, 2024

More flexibility

dija123 (Option: A)
Jun 20, 2024

It is a data science request that could be handled in a Jupyter notebook.