
Professional Machine Learning Engineer Exam - Question 42


You work for an advertising company and want to understand the effectiveness of your company's latest advertising campaign. You have streamed 500 MB of campaign data into BigQuery. You want to query the table, and then manipulate the results of that query with a pandas dataframe in an AI Platform notebook.

What should you do?

A. Use AI Platform Notebooks' BigQuery cell magic to query the data, and ingest the results as a pandas dataframe.
B. Export your table as a CSV file from BigQuery to Google Drive, and use the Google Drive API to ingest the file into your notebook instance.
C. Download your table from BigQuery as a local CSV file, and upload it to your AI Platform notebook instance. Use pandas.read_csv to ingest the file as a pandas dataframe.
D. From a bash cell in your AI Platform notebook, use the bq extract command to export the table as a CSV file to Cloud Storage, and then use gsutil cp to copy the data into the notebook. Use pandas.read_csv to ingest the file as a pandas dataframe.

Correct Answer: A

The most efficient way to query and manipulate 500 MB of campaign data from BigQuery within an AI Platform notebook is to use AI Platform Notebooks' BigQuery cell magic. This allows you to directly query the data in BigQuery and ingest the results into a pandas dataframe with minimal steps and without the need for intermediate file storage or data transfer. This method leverages the integration capabilities of Google Cloud services to streamline the data handling process, making it more straightforward and efficient compared to exporting and importing CSV files through Google Drive or Cloud Storage.
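As a sketch of the workflow the explanation describes: in the notebook, a `%%bigquery` cell populates a dataframe directly, and ordinary pandas code then manipulates it. The campaign column names below are invented for illustration, and a stand-in dataframe replaces the live BigQuery result so the snippet runs anywhere:

```python
import pandas as pd

# In an AI Platform notebook, a cell like the following would populate `df`
# straight from BigQuery via the built-in cell magic:
#
#   %%bigquery df
#   SELECT campaign_id, clicks, impressions
#   FROM `my_project.ads.campaign_events`
#
# Stand-in for the query result (columns are hypothetical, not from the
# original question) so this sketch is runnable without GCP credentials.
df = pd.DataFrame({
    "campaign_id": ["c1", "c1", "c2"],
    "clicks": [10, 20, 5],
    "impressions": [100, 150, 50],
})

# Typical follow-on manipulation: per-campaign click-through rate.
summary = df.groupby("campaign_id")[["clicks", "impressions"]].sum()
summary["ctr"] = summary["clicks"] / summary["impressions"]
print(summary)
```

The point of option A is that the query result lands in `df` with no intermediate CSV file or copy step; everything after the magic cell is plain pandas.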

Discussion

16 comments
zosoabi (Option: A)
Jun 10, 2021

A: no "CSV" found in provided link https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas

John_Pongthorn (Option: A)
Jan 15, 2023

%%bigquery df
SELECT name, SUM(number) as count
FROM `bigquery-public-data.usa_names.usa_1910_current`
GROUP BY name
ORDER BY count DESC
LIMIT 3

print(df.head())

Sum_Sum (Option: A)
Nov 15, 2023

A is the Google-recommended answer. C is what the intern does ...

sharth
Jan 4, 2024

Dude, I laughed so hard

NickNtaken (Option: A)
Apr 28, 2022

This is the simplest and most straightforward way to read BigQuery data into a pandas dataframe.

Y2Data (Option: A)
Sep 17, 2021

Just load it https://googleapis.dev/python/bigquery/latest/magics.html

Mohamed_Mossad (Option: A)
Jun 11, 2022

https://googleapis.dev/python/bigquery/latest/magics.html#ipython-magics-for-bigquery

Sachin2360 (Option: A)
Jun 22, 2022

Answer: A. Refer to this link for details: https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas The first two points talk about querying the data:

- Download query results to a pandas DataFrame by using the BigQuery Storage API from the IPython magics for BigQuery in a Jupyter notebook.
- Download query results to a pandas DataFrame by using the BigQuery client library for Python.
- Download BigQuery table data to a pandas DataFrame by using the BigQuery client library for Python.
- Download BigQuery table data to a pandas DataFrame by using the BigQuery Storage API client library for Python.

hiromi (Option: A)
Dec 10, 2022

A https://cloud.google.com/bigquery/docs/visualize-jupyter

Dunnoth (Option: D)
Feb 15, 2023

Why not D? Using BQ notebook magic would be fine for one-off use, but usually a DS reloads the data multiple times, and each time you need to stream 500 MB from BQ to the notebook instance. Isn't it cheaper to store the data as a CSV in a bucket?
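The caching idea in this comment can be sketched in plain pandas: fetch once, persist locally, and reload on later runs. The file name is hypothetical, and a stand-in dataframe replaces the live query result so the sketch runs without GCP credentials:

```python
import os
import pandas as pd

CACHE = "campaign_results.csv"  # hypothetical local cache file

if os.path.exists(CACHE):
    # Later runs: reload from disk instead of re-streaming from BigQuery.
    df = pd.read_csv(CACHE)
else:
    # First run: in the real notebook this dataframe would come from the
    # %%bigquery cell magic; a stand-in is used here so the sketch runs.
    df = pd.DataFrame({"campaign_id": ["c1", "c2"], "clicks": [30, 5]})
    df.to_csv(CACHE, index=False)
```

Note that the `%%bigquery` magic supports the same pattern without any export step: the dataframe it creates simply stays in the notebook kernel's memory across cells, so re-streaming only happens if the query cell itself is re-run.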

M25 (Option: A)
May 9, 2023

Went with A

NamitSehgal (Option: A)
Jan 4, 2022

I agree with A

ggorzki (Option: A)
Jan 19, 2022

IPython magics for BigQuery https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas

SlipperySlope (Option: C)
Feb 20, 2022

C is the correct answer due to the size of the data. It wouldn't be possible to download it all into an in-memory dataframe.

u_phoria
Jun 26, 2022

500 MB of data in a pandas dataframe generally isn't a problem, far from it.
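This reply's claim is easy to check: pandas reports a dataframe's in-memory footprint directly. The synthetic dataframe below is an assumption for illustration; two float64 columns of a million rows each occupy only around 16 MB, far below typical notebook RAM:

```python
import numpy as np
import pandas as pd

# Synthetic dataframe: two float64 columns, one million rows each
# (8 bytes per value, so ~8 MB per column).
df = pd.DataFrame({
    "clicks": np.zeros(1_000_000),
    "cost": np.zeros(1_000_000),
})

# memory_usage(deep=True) sums per-column byte counts plus the index.
mb = df.memory_usage(deep=True).sum() / 1e6
print(f"{mb:.1f} MB")  # roughly 16 MB for this shape
```

Scaling the same arithmetic up, a 500 MB query result fits comfortably in the multiple gigabytes of RAM a standard AI Platform notebook instance provides.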

mmona19 (Option: C)
Apr 14, 2022

Both A and C are technically correct. C has more manual steps and A has fewer. The question does not ask which requires the least effort, so C is the clear answer.

wish0035
Dec 15, 2022

"A and C are valid, but C is more difficult than A; they don't ask for the easier one, so I'll go with the more difficult one." WHAAAT? Google best practices are always: easier > harder. They even encourage you to skip ML if you don't need ML.

SergioRubiano (Option: A)
Mar 28, 2023

A, using the command %%bigquery df

PhilipKoku (Option: A)
Jun 6, 2024

A) Magic command