Your BigQuery project has several users. For audit purposes, you need to see how many queries each user ran in the last month. What should you do?
To see how many queries each user ran in the last month, you should use Cloud Audit Logging to view Cloud Audit Logs and create a filter on the query operation to get the required information. Cloud Audit Logs automatically log activity related to BigQuery, including queries, making it the ideal tool for audit purposes. By filtering on the necessary fields, you can efficiently extract the number of queries each user has run over the specified period.
D. Reasons: 1. Cloud Audit Logs maintains audit logs for admin activity, data access, and system events, and BigQuery activity is automatically sent to Cloud Audit Logs. 2. In the filter you can select the relevant BigQuery audit messages, and you can express filters as part of the export. https://cloud.google.com/logging/docs/audit https://cloud.google.com/bigquery/docs/reference/auditlogs#ids https://cloud.google.com/bigquery/docs/reference/auditlogs#auditdata_examples
D is right, as you can get the monthly view of query usage across all users and projects for auditing purposes. C needs appropriate permissions to see the detail-level data, and a monthly view is hard to get directly from the bq ls or bq show commands.
Answer is D
Answer is D: https://cloud.google.com/bigquery/docs/reference/auditlogs#example_query_cost_breakdown_by_identity
Nailed it
There is no mention of exporting to BQ.
Look at C carefully: it would need to be "use bq ls to list all jobs, then bq show to show the details"; the option has the commands reversed. Hence C is not right.
Yes..D is correct
Final Decision to go with Option D as per https://cloud.google.com/bigquery/docs/reference/auditlogs#example_query_cost_breakdown_by_identity
But Data Access audit logs are disabled by default, and you need to enable this functionality before you can get these audit logs. See here: https://cloud.google.com/logging/docs/audit
Data access audit logs are enabled for BQ by default
I think C is correct.
D is ok
Try audit logs, D is OK
yep, D is right, https://cloud.google.com/bigquery/docs/reference/auditlogs#data_access_data_access
D is correct
It is B. For example: SELECT project_id, user_email, COUNT(*) AS cnt FROM `region-us`.INFORMATION_SCHEMA.JOBS GROUP BY project_id, user_email; (note the region qualifier needs backticks).
So do you assume B has the required role of BigQuery Resource Viewer? Because without it, the user can't run the query.
The correct answer is D. Use Cloud Audit Logging to view Cloud Audit Logs, and create a filter on the query operation to get the required information. Google Cloud's Cloud Audit Logging service allows you to view, search, and export audit logs for your Google Cloud projects. These audit logs contain information about the actions that are performed in your project, including queries that are run in BigQuery. To see how many queries each user ran in the last month, you can use Cloud Audit Logging to view the Cloud Audit Logs for your BigQuery project. Then, you can create a filter on the query operation to see only the queries that were run. You can also create a filter on the user field to see the queries that were run by each user. This will allow you to see the number of queries that were run by each user in the last month, which can be useful for audit purposes.
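As a sketch of what "create a filter on the query operation" could look like in the Logs Explorer, using the legacy AuditData field names (the timestamp bound is an illustrative assumption; adjust it to the month you are auditing):

```
resource.type="bigquery_resource"
protoPayload.methodName="jobservice.jobcompleted"
timestamp>="2024-01-01T00:00:00Z"
```

You can then group the matching entries by protoPayload.authenticationInfo.principalEmail to count queries per user.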
Option A, connecting Google Data Studio to BigQuery and creating a dimension for the users and a metric for the amount of queries per user, is a valid method of visualizing data, but it would not provide the specific information about the number of queries that were run by each user in the last month. Option B, executing a query on the JOBS table to get the required information, is not a viable option because the JOBS table does not contain information about the user who ran the query. Option C, using the 'bq show' and 'bq ls' commands to list job information, is not a viable option because these commands do not provide information about the user who ran the query.
Option A: Why would this not provide correct information? You could show the content of the JOBS view. : SELECT user_email, CAST(creation_time as DATE) as date, count(*) as queries_per_day FROM `region-eu`.INFORMATION_SCHEMA.JOBS WHERE creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 45 DAY) AND CURRENT_TIMESTAMP() AND job_type = 'QUERY' group by user_email, CAST(creation_time as DATE)
B is correct. here's the link - https://cloud.google.com/bigquery/docs/information-schema-jobs
Querying the INFORMATION_SCHEMA.JOBS_BY_USER view in BigQuery is the most efficient and straightforward way to obtain the number of queries each user ran in the last month. This method leverages built-in BigQuery capabilities designed specifically for auditing and monitoring query jobs. Cloud Audit Logs provide detailed logging information but are more complex to query for specific metrics like the number of queries run by each user. BigQuery’s INFORMATION_SCHEMA.JOBS_BY_USER is designed for this purpose and is easier to use for querying job data.
Why B is the Best Answer: Direct Access to Job Metadata: BigQuery maintains metadata about jobs (including query jobs) in the INFORMATION_SCHEMA views, specifically in the INFORMATION_SCHEMA.JOBS table. Detailed Information: This table contains information about all jobs, including who ran them, when they were run, and the type of job. This makes it easy to filter and count queries by user. Querying JOBS Table: You can write a SQL query to count the number of queries executed by each user over the specified period.
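A minimal sketch of the query several commenters describe against the INFORMATION_SCHEMA.JOBS view (the `region-us` qualifier is an assumption; use your project's region):

```sql
-- Count query jobs per user over roughly the last month.
SELECT
  user_email,
  COUNT(*) AS query_count
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE job_type = 'QUERY'
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY user_email
ORDER BY query_count DESC;
```

Note that querying this project-level view requires the bigquery.jobs.listAll permission (e.g. BigQuery Resource Viewer), which is part of the B-versus-D debate in this thread.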
C is right. https://stackoverflow.com/questions/30993782/how-retrieve-the-user-and-sql-code-associated-with-a-bigquery-job/30995665#30995665
Finally, this detail is exactly why C is not right. LOL. Changing my mind from "C or D" to D only.
BigQuery logs to Cloud Logging by default. We can add a filter to get the data grouped by user for a specific month. This seems a more efficient way of getting the data. I will go for option D.
There is no JOBS table in the INFORMATION_SCHEMA - so B is not correct. (ref: https://cloud.google.com/bigquery/docs/information-schema-jobs#schema) C needs some extra effort -- so it isn't so good. IMO D is correct. "BigQuery automatically sends audit logs to Cloud Logging" (ref: https://cloud.google.com/bigquery/docs/reference/auditlogs#stackdriver_logging_exports) Ref: https://cloud.google.com/logging/docs/audit Ref: https://cloud.google.com/logging/docs/audit#viewing_audit_logs
D. Use Cloud Audit Logging to view Cloud Audit Logs, and create a filter on the query operation to get the required information.
True, C is right: https://stackoverflow.com/questions/43688885/get-queries-executed-on-bigquery-by-user
Agree C
B. I say that since you can query the bigquery.jobs table with the client libraries for the time range of last month. The correct CLI syntax would be bq ls -a -j --max_creation_time <max-age-in-unix-time> project-name. The -a flag lists all jobs, -j restricts the listing to jobs, and --max_creation_time bounds the results by the Unix timestamp of the first day of the month in question. The bq show command just examines the schema, so answer C is not correct.
It's only for a specific user. Refer to https://cloud.google.com/bigquery/docs/managing-jobs
Actually you are wrong, read those docs again, B is correct, bigquery.admin allows a user to see jobs from all users.
D is the correct answer for me https://cloud.google.com/bigquery/docs/reference/auditlogs#example_most_popular_datasets
I think it's an outdated question. The easiest way is to use the console to query the metadata table. I am only guessing that the INFORMATION_SCHEMA table used to be called JOBS and therefore the correct answer was B. Let's hope exam questions are up to date. https://cloud.google.com/bigquery/docs/information-schema-jobs
B is correct. Please refer to "INFORMATION_SCHEMA.JOBS_BY_USER" https://cloud.google.com/bigquery/docs/information-schema-jobs#schema Examples: https://cloud.google.com/bigquery/docs/information-schema-jobs#examples
But the opt. B mentions the "JOBS" table, not JOBS_BY_USER
I found this in the Google practice exam, and the answer is B.
D. Use Cloud Audit Logging to view Cloud Audit Logs, and create a filter on the query operation to get the required information. Cloud Audit Logging records activities and API calls in Google Cloud services, including BigQuery. You can use Cloud Audit Logging to view logs and filter them based on specific operations, such as queries in BigQuery. By filtering on the query operation, you can gather the required information about how many queries each user ran in the last month, which is essential for audit purposes.
BigQuery's INFORMATION_SCHEMA: BigQuery provides metadata about datasets, tables, and jobs through the INFORMATION_SCHEMA. The JOBS_BY_USER view within this schema is specifically designed to give you information about jobs run by each user. You can easily query this view to get the number of queries run by each user in the last month.
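A hedged sketch of that JOBS_BY_USER approach (the `region-us` qualifier is an assumption); note this view only returns jobs submitted by the current user, which matters for the B-versus-D debate:

```sql
-- Count the current user's query jobs over roughly the last month.
SELECT COUNT(*) AS my_query_count
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
WHERE job_type = 'QUERY'
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY);
```

To see every user's counts in one result set, you would query the project-level JOBS view instead, which needs broader permissions.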
If everyone agrees on C, why does the answer say B here?
Answer is A : Visualize GCP Billing using BigQuery and Data Studio https://medium.com/google-cloud/visualize-gcp-billing-using-bigquery-and-data-studio-d3e695f90c08
If you read the blog post carefully, you can notice that before analyzing data in Data Studio they import data from Stackdriver logs (basically, Cloud Audit Logs). Since the task was not to create beautiful charts, I'd go directly to the audit logs.
D is correct.
C is the correct answer..
D seems the right https://cloud.google.com/bigquery/docs/reference/auditlogs#auditdata_examples
answer is D
The bq show command displays information about an object. It can't be used to list all jobs. https://cloud.google.com/bigquery/docs/reference/bq-cli-reference#bq_show As described in the syntax here: https://cloud.google.com/bigquery/docs/managing-jobs#bq, bq show needs a job id (which we don't have) to show the details of the job.
I vote D. The activity log is recorded automatically. We can confirm this in the Cloud Console.
https://cloud.google.com/bigquery/docs/reference/auditlogs#ids
Yes, D is correct. Use Cloud Audit Logging to view Cloud Audit Logs, and create a filter on the query operation to get the required information.
The JOBS system view does exist, and it contains exactly the info we need: one record for each job executed by users (a query is one of the job types).
Re: C — bq show is for viewing job details (https://cloud.google.com/bigquery/docs/managing-jobs#view_job_details_2) and bq ls is for listing jobs (https://cloud.google.com/bigquery/docs/managing-jobs#list_jobs), so the option has them reversed. So D is the correct one.
I finally decide to go with Option D over B because we or the auditor might not have access to the metadata. In fact, in our project, not all of us had access to query this view. "To get the permission that you need to query the INFORMATION_SCHEMA.JOBS view, ask your administrator to grant you the BigQuery Resource Viewer" https://cloud.google.com/bigquery/docs/information-schema-jobs#required_role. (And not because of the wordings "Table" instead of "view" - don't think an architect exam will try to assess your memory of whether it is a table or a view or your understanding of the difference between a table and a view).
Answer is B: in the BigQuery interface, execute a query on the JOBS table to get the required information.
Explanation: BigQuery automatically logs job information, including queries, in the JOBS table. By querying this table, you can retrieve details about each job, including the user who ran it, the query text, and the timestamp.
Why the other options are less suitable:
- Connect Google Data Studio to BigQuery: while this can visualize data, you still need to execute a query to pull the data first; this option does not directly query for the information you need.
- Use 'bq show' and 'bq ls': these commands provide metadata about jobs but do not efficiently retrieve the count of queries per user, especially for a large number of jobs over a month.
- Use Cloud Audit Logging: this approach could work but would be more complex and less efficient for simply counting queries. The JOBS table is specifically designed for this purpose, making it easier to extract the necessary data.
I will go for B because it is more efficient and easy.
Answer is B, the fastest and most efficient method. The question says they just need to know "how many queries each user ran in the last month", which B can do in a short time.
My choice is Cloud Audit Logs, but I am not sure. Confused
B seems correct. Viewing job data: you can view job data and metadata by using the Cloud Console, the classic web UI, the command-line interface, and the API. This data includes details such as the job type, the job state, and the user who ran the job.
changing to C.
OK, assuming both B and D can meet the requirement of getting the user activity: the question explicitly states "for audit purposes". Usually that means getting the data from central logging (if that option is available), as the auditor may audit other areas as well. So I guess D may be the better answer?
Must be D imho https://cloud.google.com/bigquery/docs/reference/auditlogs
B, as retrieving job metadata by using INFORMATION_SCHEMA views requires appropriately scoped permissions: JOBS_BY_USER requires bigquery.jobs.list for the project and is available to the Project Viewer and BigQuery User roles. Use an INFORMATION_SCHEMA.JOBS_BY_* view to retrieve real-time metadata about BigQuery jobs. These views contain currently running jobs as well as the last 180 days of history of completed jobs.
But the answer mentions the "JOBS" table ?
D is the answer.
Can anyone explain why B is not right? https://cloud.google.com/bigquery/docs/information-schema-jobs Is it just because the wording "JOBS table" is not precise? the JOBS_BY_* view can provide all the information. And it is easier that C (if worded properly) and much easier than D.
Please don't just go with the flow of the thread. If you use BigQuery, you will know you can directly query all the details about your own query execution; jobs and related metadata are available directly. But if you need additional information, such as slots used, then you need to sink the BigQuery logs into a BigQuery table to query them.
Will go with B
D – Use Cloud Audit Logging to view Cloud Audit Logs and create a filter on the query operation to get the required info. See Audit Logs for BQ; there is an example of a similar log query at the end of this page, "Most popular datasets" (statistics about table reads/modifications): https://cloud.google.com/bigquery/docs/reference/auditlogs/ The types of audit logs are described here: https://cloud.google.com/logging/docs/audit/ (this question refers to Data Access audit logs). Option A may also be considered, since Data Studio integrates with BQ, but Data Studio is used just for data visualization; it helps analyze data interactively.
Is this the job table ? google.cloud.bigquery.job.QueryJob ? https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.job.QueryJob.html Looks like answer is B
I would be surprised if we need to go through the audit logs to count number of queries by users. So D seems to be not correct. I agree that this information may be available in logs, but We may have better way to query this.
vote D
D is right here.
Go for D
Audit logs is the answer
D is correct
D is the correct answer. It's between B and D. 1. Option B could have been correct if JOBS were referred to as a "view" rather than a "table"; the JOBS views are part of INFORMATION_SCHEMA. Ref: https://cloud.google.com/bigquery/docs/information-schema-jobs#schema 2. D is the most suitable choice, since BigQuery leverages Cloud Audit Logs for admin, data access, and system events. Ref: https://cloud.google.com/bigquery/docs/reference/auditlogs#overview
D is likely the answer. If B said "view" I would have selected B, but it says "table" while it's actually a view.
D is not the answer; there was no mention of exporting to BigQuery. A is correct 😎
D is ok
But are Cloud Audit Logs enabled by default for BigQuery?
D is correct.
Cloud Logging
I think B is the correct answer
https://cloud.google.com/bigquery/docs/managing-jobs#list_jobs_in_a_project
Audit logs, Option D
"Audit logs versus INFORMATION_SCHEMA views Audit logs help you answer the question "Who did what, where, and when?" within your Google Cloud resources. Audit logs are the definitive source of information for system activity by user and access patterns and should be your primary source for audit or security questions." https://cloud.google.com/bigquery/docs/introduction-audit-workloads
D is the right answer
B is the better answer. You get more logs relating to BigQuery jobs from BigQuery than you get from Cloud Audit Logs
I agree B could be the solution, but the best option is D. This is the correct and scalable approach: Cloud Audit Logs capture who ran what, including queries. You can filter on methodName = "jobservice.jobcompleted" and analyze logs in Logs Explorer or export to BigQuery for querying. Supports organization-wide, project-wide, and per-user visibility.
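A hedged sketch of that export-then-query approach, assuming the Data Access audit logs have already been exported via a log sink to a BigQuery table (the dataset and table names below are illustrative, not real names from this thread):

```sql
-- Count completed BigQuery query jobs per user over roughly the last month,
-- reading from an audit-log export table created by a Cloud Logging sink.
SELECT
  protopayload_auditlog.authenticationInfo.principalEmail AS user_email,
  COUNT(*) AS completed_queries
FROM `my_project.auditlog_dataset.cloudaudit_googleapis_com_data_access`
WHERE protopayload_auditlog.methodName = 'jobservice.jobcompleted'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY user_email
ORDER BY completed_queries DESC;
```

The export step is what makes this answer scale organization-wide: a single sink can aggregate audit logs from many projects into one queryable table.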
This is mentioned in the Engineer course from Google.