
Professional-Data-Engineer (Google Professional Data Engineer Exam) practice questions, stable and up to date. Test your knowledge for free.


Professional-Data-Engineer Practice Questions

Google Professional Data Engineer Exam

Last Update 8 hours ago
Total Questions : 387

Dive into our fully updated and stable Professional-Data-Engineer practice test platform, featuring all the latest Google Cloud Certified exam questions added this week. Our preparation tool is more than just a Google study aid; it's a strategic advantage.

Our Google Cloud Certified practice questions are crafted to reflect the domains and difficulty of the actual exam. The detailed rationales explain the 'why' behind each answer, reinforcing key Professional-Data-Engineer concepts. Use this test to pinpoint the areas where you should focus your study.

Professional-Data-Engineer PDF (Printable): $43.75 (regular $124.99)

Professional-Data-Engineer Testing Engine: $50.75 (regular $144.99)

Professional-Data-Engineer PDF + Testing Engine: $63.70 (regular $181.99)
Question # 1

What are the minimum permissions needed for a service account used with Google Dataproc?

Options:

A. Execute to Google Cloud Storage; write to Google Cloud Logging

B. Write to Google Cloud Storage; read to Google Cloud Logging

C. Execute to Google Cloud Storage; execute to Google Cloud Logging

D. Read and write to Google Cloud Storage; write to Google Cloud Logging

Question # 2

Scaling a Cloud Dataproc cluster typically involves ____.

Options:

A. increasing or decreasing the number of worker nodes

B. increasing or decreasing the number of master nodes

C. moving memory to run more applications on a single node

D. deleting applications from unused nodes periodically

Question # 3

You are designing the architecture of your application to store data in Cloud Storage. Your application consists of pipelines that read data from a Cloud Storage bucket that contains raw data, and write the data to a second bucket after processing. You want to design an architecture with Cloud Storage resources that are capable of being resilient if a Google Cloud regional failure occurs. You want to minimize the recovery point objective (RPO) if a failure occurs, with no impact on applications that use the stored data. What should you do?

Options:

A. Adopt two regional Cloud Storage buckets, and update your application to write the output on both buckets.

B. Adopt multi-regional Cloud Storage buckets in your architecture.

C. Adopt two regional Cloud Storage buckets, and create a daily task to copy from one bucket to the other.

D. Adopt a dual-region Cloud Storage bucket, and enable turbo replication in your architecture.
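To make the RPO trade-off concrete, the sketch below compares worst-case data-loss windows for a daily copy job versus dual-region turbo replication, which Google documents with a 15-minute RPO target. The numbers are illustrative assumptions for this toy comparison, not measurements.

```python
# Illustrative comparison of worst-case RPO (recovery point objective)
# for two Cloud Storage replication strategies. Assumed figures: a daily
# copy job can lose up to 24 hours of writes, while dual-region turbo
# replication targets a 15-minute RPO.

def worst_case_rpo_seconds(strategy: str) -> int:
    """Return the assumed worst-case RPO in seconds for a strategy."""
    targets = {
        "daily_copy": 24 * 60 * 60,    # copy runs once per day
        "turbo_replication": 15 * 60,  # documented 15-minute target
    }
    return targets[strategy]

daily = worst_case_rpo_seconds("daily_copy")
turbo = worst_case_rpo_seconds("turbo_replication")
print(f"daily copy worst case: {daily} s; turbo replication: {turbo} s")
print(f"turbo replication shrinks the loss window by {daily // turbo}x")
```

Under these assumptions, turbo replication cuts the worst-case loss window from a day to minutes, which is why it minimizes RPO without application changes.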

Question # 4

Your organization uses a multi-cloud data storage strategy, storing data in Cloud Storage, and data in Amazon Web Services' (AWS) S3 storage buckets. All data resides in US regions. You want to query up-to-date data by using BigQuery, regardless of which cloud the data is stored in. You need to allow users to query the tables from BigQuery without giving direct access to the data in the storage buckets. What should you do?

Options:

A. Set up a BigQuery Omni connection to the AWS S3 bucket data. Create BigLake tables over the Cloud Storage and S3 data, and query the data using BigQuery directly.

B. Set up a BigQuery Omni connection to the AWS S3 bucket data. Create external tables over the Cloud Storage and S3 data, and query the data using BigQuery directly.

C. Use the Storage Transfer Service to copy data from the AWS S3 buckets to Cloud Storage buckets. Create BigLake tables over the Cloud Storage data, and query the data using BigQuery directly.

D. Use the Storage Transfer Service to copy data from the AWS S3 buckets to Cloud Storage buckets. Create external tables over the Cloud Storage data, and query the data using BigQuery directly.

Question # 5

Your team is working on a binary classification problem. You have trained a support vector machine (SVM) classifier with default parameters, and received an area under the Curve (AUC) of 0.87 on the validation set. You want to increase the AUC of the model. What should you do?

Options:

A. Perform hyperparameter tuning

B. Train a classifier with deep neural networks, because neural networks would always beat SVMs

C. Deploy the model and measure the real-world AUC; it’s always higher because of generalization

D. Scale predictions you get out of the model (tune a scaling factor as a hyperparameter) in order to get the highest AUC
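To make the metric in this question concrete, here is a minimal AUC computation from scratch. AUC is the probability that a randomly chosen positive example is scored above a randomly chosen negative one, which is why rescaling scores (option D) cannot change it, while hyperparameter tuning can improve the ranking itself. The labels and scores below are made-up examples.

```python
# Minimal AUC (area under the ROC curve) via pairwise comparison:
# the fraction of (positive, negative) pairs ranked correctly,
# counting ties as half-correct.

def auc(labels, scores):
    """Labels are 0/1; scores are model outputs (any monotone scale)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0]
print(auc(labels, [0.9, 0.8, 0.3, 0.1]))  # perfect ranking -> 1.0
print(auc(labels, [0.9, 0.2, 0.3, 0.1]))  # one swapped pair -> 0.75
```

Note that multiplying every score by a constant leaves every pairwise comparison, and therefore the AUC, unchanged.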

Question # 6

You are designing storage for two relational tables that are part of a 10-TB database on Google Cloud. You want to support transactions that scale horizontally. You also want to optimize data for range queries on nonkey columns. What should you do?

Options:

A. Use Cloud SQL for storage. Add secondary indexes to support query patterns.

B. Use Cloud SQL for storage. Use Cloud Dataflow to transform data to support query patterns.

C. Use Cloud Spanner for storage. Add secondary indexes to support query patterns.

D. Use Cloud Spanner for storage. Use Cloud Dataflow to transform data to support query patterns.

Question # 7

You have uploaded 5 years of log data to Cloud Storage. A user reported that some data points in the log data are outside of their expected ranges, which indicates errors. You need to address this issue and be able to run the process again in the future, while keeping the original data for compliance reasons. What should you do?

Options:

A. Import the data from Cloud Storage into BigQuery. Create a new BigQuery table, and skip the rows with errors.

B. Create a Compute Engine instance and create a new copy of the data in Cloud Storage. Skip the rows with errors.

C. Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to a new dataset in Cloud Storage.

D. Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to the same dataset in Cloud Storage.
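The core idea behind the repeatable-cleaning approach, replace out-of-range values with a default and write results to a new location so the raw data survives for compliance, can be sketched without Dataflow. The range bounds and default value below are hypothetical placeholders.

```python
# Sketch of a repeatable cleaning step: values outside an expected range
# are replaced with a default, and results go to a NEW collection (the
# "new dataset"), leaving the original records intact for compliance.
# Bounds and default are hypothetical.

EXPECTED_MIN, EXPECTED_MAX, DEFAULT = 0.0, 100.0, 0.0

def clean(records):
    """Return a new list; never mutates the input (the raw data)."""
    return [r if EXPECTED_MIN <= r <= EXPECTED_MAX else DEFAULT
            for r in records]

raw = [12.5, -3.0, 250.0, 99.9]  # raw log values, two out of range
cleaned = clean(raw)
print(cleaned)  # corrected copy
print(raw)      # original unchanged, so the job can be rerun any time
```

Because the step is a pure function of the raw input, it can be rerun in the future (e.g. with tighter bounds) without any loss of the original data.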

Question # 8

You have several different unstructured data sources, within your on-premises data center as well as in the cloud. The data is in various formats, such as Apache Parquet and CSV. You want to centralize this data in Cloud Storage. You need to set up an object sink for your data that allows you to use your own encryption keys. You want to use a GUI-based solution. What should you do?

Options:

A. Use Cloud Data Fusion to move files into Cloud Storage.

B. Use Storage Transfer Service to move files into Cloud Storage.

C. Use Dataflow to move files into Cloud Storage.

D. Use BigQuery Data Transfer Service to move files into BigQuery.

Question # 9

Your company uses Looker Studio connected to BigQuery for reporting. Users are experiencing slow dashboard load times due to complex queries on a large table. The queries involve aggregations and filtering on several columns. You need to optimize query performance to decrease the dashboard load times. What should you do?

Options:

A. Configure Looker Studio to use a shorter data refresh interval to ensure fresh data is always displayed.

B. Create a materialized view in BigQuery that pre-calculates the aggregations and filters used in the Looker Studio dashboards.

C. Implement row-level security in BigQuery to restrict data access and reduce the amount of data processed by the queries.

D. Use BigQuery BI Engine to accelerate query performance by caching frequently accessed data.
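The intuition behind a materialized view, compute the aggregation once so each dashboard query becomes a cheap lookup instead of a rescan of the large table, can be modeled in plain Python. The table contents and column names below are hypothetical.

```python
# Toy model of a materialized view: aggregate a "large table" once,
# then serve dashboard queries from the precomputed result instead of
# rescanning every row on each request. Data is hypothetical.

from collections import defaultdict

rows = [("us", 10), ("eu", 5), ("us", 7), ("eu", 2)]  # (region, sales)

# "Materialize" the aggregation once, like CREATE MATERIALIZED VIEW
# would precompute SUM(sales) GROUP BY region.
sales_by_region = defaultdict(int)
for region, amount in rows:
    sales_by_region[region] += amount

# Each dashboard query is now an O(1) lookup, not a full scan.
print(sales_by_region["us"])  # 17
print(sales_by_region["eu"])  # 7
```

BigQuery maintains the real thing incrementally as the base table changes, so dashboards stay fast without manual refresh logic.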

Question # 10

You have a data pipeline with a Dataflow job that aggregates and writes time series metrics to Bigtable. You notice that data is slow to update in Bigtable. This data feeds a dashboard used by thousands of users across the organization. You need to support additional concurrent users and reduce the amount of time required to write the data. What should you do?

Choose 2 answers

Options:

A. Configure your Dataflow pipeline to use local execution.

B. Modify your Dataflow pipeline to use the Flatten transform before writing to Bigtable.

C. Modify your Dataflow pipeline to use the CoGroupByKey transform before writing to Bigtable.

D. Increase the maximum number of Dataflow workers by setting maxNumWorkers in PipelineOptions.

E. Increase the number of nodes in the Bigtable cluster.
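Both scaling options here follow the same arithmetic: write throughput grows roughly linearly with the number of Bigtable nodes (or Dataflow workers), so the time to land a fixed batch of rows shrinks proportionally. The sketch below is a toy model; the per-node rate is an assumed placeholder, not a Bigtable SLA.

```python
# Toy model of horizontal scaling: with throughput proportional to node
# count, tripling the nodes cuts the time to write a fixed batch of
# rows to a third. The per-node rate is a hypothetical assumption.

ROWS_TO_WRITE = 9_000_000
ROWS_PER_NODE_PER_SEC = 10_000  # assumed per-node write rate

def write_time_seconds(nodes: int) -> float:
    """Idealized time to write the batch with `nodes` Bigtable nodes."""
    return ROWS_TO_WRITE / (nodes * ROWS_PER_NODE_PER_SEC)

print(write_time_seconds(3))  # 300.0 s with 3 nodes
print(write_time_seconds(9))  # 100.0 s after scaling to 9 nodes
```

In practice the pipeline and the sink must both scale: more Dataflow workers (maxNumWorkers) produce writes faster, and more Bigtable nodes absorb them.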
