You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?
To predict house prices using an AutoML model and a small public dataset stored in BigQuery, the simplest and most efficient approach is to preprocess the data within BigQuery itself by writing a query and creating a new table. Then, create a Vertex AI managed dataset with this new table as the data source. This method leverages BigQuery’s data processing capabilities, keeping the data within the same environment, reducing data movement, and simplifying the workflow.
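As a rough illustration of that workflow, here is a minimal Python sketch. The project ID, table names, column names, and region are hypothetical placeholders, and it assumes the google-cloud-bigquery and google-cloud-aiplatform libraries are installed and authenticated.

```python
from google.cloud import bigquery
from google.cloud import aiplatform

PROJECT_ID = "my-project"                              # hypothetical project ID
DEST_TABLE = "my-project.housing.prepared_houses"      # hypothetical destination table

# Step 1: preprocess the public dataset with a SQL query and
# write the result to a new BigQuery table (no data leaves BigQuery).
bq_client = bigquery.Client(project=PROJECT_ID)
job_config = bigquery.QueryJobConfig(
    destination=DEST_TABLE,
    write_disposition="WRITE_TRUNCATE",
)
query = """
    SELECT
      price,
      bedrooms,
      bathrooms,
      sqft_living,
      zipcode
    FROM `bigquery-public-data.some_dataset.houses`  -- hypothetical source table
    WHERE price IS NOT NULL
"""
bq_client.query(query, job_config=job_config).result()  # wait for the query job to finish

# Step 2: create a Vertex AI managed (tabular) dataset that points at the new table.
aiplatform.init(project=PROJECT_ID, location="us-central1")
dataset = aiplatform.TabularDataset.create(
    display_name="house-prices",
    bq_source=f"bq://{DEST_TABLE}",
)

# Step 3 (optional): train an AutoML regression model on the managed dataset.
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="house-price-automl",
    optimization_prediction_type="regression",
)
model = job.run(dataset=dataset, target_column="price")
```

Because the managed dataset references the BigQuery table directly, there is no export step or intermediate storage format; the preprocessing logic stays in SQL and the training data stays in BigQuery.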
A seems the easiest to me: preprocess the data in BigQuery (where the input table is stored) and point a Vertex AI managed dataset directly at the resulting table.
I go for A:
A: By writing a query that preprocesses the data in BigQuery and creating a new table, you can create a Vertex AI managed dataset directly with the new table as the data source. This approach is efficient because it leverages BigQuery's powerful data processing capabilities and avoids the need to export data to another format or service. It also simplifies the process by keeping everything within the Google Cloud ecosystem, making it easier to manage and monitor your data and the model training process.
A) Keep the data in BigQuery and create a new table to avoid the latency of moving data out of BigQuery.
Dataflow seems like the easiest and most scalable way to deal with this issue. Option B.
small dataset -> no dataflow
Forgot to vote
You can reference a BigQuery table directly as a Vertex AI managed dataset and use it to train an AutoML model.
I go for A.
A seems to be the correct one.