

Exams4sure Dumps

Professional-Data-Engineer Practice Questions

Google Professional Data Engineer Exam

Last Update 1 day ago
Total Questions: 400

Dive into our fully updated and stable Professional-Data-Engineer practice test platform, featuring all the latest Google Cloud Certified exam questions added this week. Our preparation tool is more than just a Google study aid; it's a strategic advantage.

Our free Google Cloud Certified practice questions are crafted to reflect the domains and difficulty of the actual exam. The detailed rationales explain the 'why' behind each answer, reinforcing key concepts tested on the Professional-Data-Engineer exam. Use this test to pinpoint the areas where you need to focus your study.

Professional-Data-Engineer PDF (Printable): $43.75 (regular price $124.99)

Professional-Data-Engineer Testing Engine: $50.75 (regular price $144.99)

Professional-Data-Engineer PDF + Testing Engine: $63.70 (regular price $181.99)
Question # 21

You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you use?

Options:

A. Linear regression
B. Logistic classification
C. Recurrent neural network
D. Feedforward neural network

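The trade-off in this question is model complexity versus compute: of the listed algorithms, linear regression is by far the cheapest to train, and a continuous target like price is a regression problem. A minimal sketch of a closed-form least-squares fit, using hypothetical toy data (square footage and rooms versus price):

```python
import numpy as np

# Hypothetical toy data: square footage and number of rooms vs. price.
X = np.array([[1400.0, 3], [1600.0, 3], [1700.0, 4], [1875.0, 4]])
y = np.array([245000.0, 312000.0, 279000.0, 308000.0])

# Add a bias column, then solve the least-squares problem directly.
# This closed-form fit is cheap enough for a resource-constrained VM.
Xb = np.hstack([np.ones((X.shape[0], 1)), X])
theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)

predicted = Xb @ theta  # fitted prices for the training rows
```

No iterative training loop, no accelerator, and memory proportional to the (small) feature count, which is the point of choosing a linear model here.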
Question # 22

You want to use a database of information about tissue samples to classify future tissue samples as either normal or mutated. You are evaluating an unsupervised anomaly detection method for classifying the tissue samples. Which two characteristics support this method? (Choose two.)

Options:

A. There are very few occurrences of mutations relative to normal samples.
B. There are roughly equal occurrences of both normal and mutated samples in the database.
C. You expect future mutations to have different features from the mutated samples in the database.
D. You expect future mutations to have similar features to the mutated samples in the database.
E. You already have labels for which samples are mutated and which are normal in the database.

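Unsupervised anomaly detection relies on anomalies being rare relative to normal data and uses no labels, so it can flag anomalies that look nothing like previously seen ones. A toy sketch of that idea, using a simple standard-deviation rule on hypothetical one-dimensional data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature values: mostly "normal" samples, a few outliers.
normal = rng.normal(loc=10.0, scale=1.0, size=200)
outliers = np.array([25.0, -5.0])
samples = np.concatenate([normal, outliers])

# Unsupervised rule: no labels are consulted. Flag anything more than
# 4 standard deviations from the mean of the whole dataset.
mean, std = samples.mean(), samples.std()
is_anomaly = np.abs(samples - mean) > 4 * std

n_flagged = int(is_anomaly.sum())
```

Because the rule only models what "normal" looks like, it works when mutations are scarce and does not require them to resemble earlier mutations.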
Question # 23

You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want to see if you can improve training speed by removing some features while having a minimum effect on model accuracy. What can you do?

Options:

A. Eliminate features that are highly correlated to the output labels.
B. Combine highly co-dependent features into one representative feature.
C. Instead of feeding in each feature individually, average their values in batches of 3.
D. Remove the features that have null values for more than 50% of the training records.

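Highly co-dependent features carry overlapping information, so merging such a pair into one representative feature shrinks the input with little loss of signal. A rough numpy sketch of detecting and merging one correlated pair (data, threshold, and the averaging scheme are all illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical training matrix: column 1 is nearly a rescaled copy of column 0.
a = rng.normal(size=500)
X = np.column_stack([a,
                     a * 2.0 + rng.normal(scale=0.01, size=500),
                     rng.normal(size=500)])

corr = np.corrcoef(X, rowvar=False)   # feature-feature correlation matrix
i, j = 0, 1
if abs(corr[i, j]) > 0.95:            # highly co-dependent pair
    # Replace the pair with one representative feature: their mean
    # after standardizing, so the differing scales do not dominate.
    zi = (X[:, i] - X[:, i].mean()) / X[:, i].std()
    zj = (X[:, j] - X[:, j].mean()) / X[:, j].std()
    X = np.column_stack([(zi + zj) / 2.0, X[:, 2]])

n_features = X.shape[1]
```

Fewer input columns means less work per training step, while the merged column preserves most of what the original pair contributed.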
Question # 24

You work for a car manufacturer and have set up a data pipeline using Google Cloud Pub/Sub to capture anomalous sensor events. You are using a push subscription in Cloud Pub/Sub that calls a custom HTTPS endpoint that you have created to take action on these anomalous events as they occur. Your custom HTTPS endpoint keeps receiving an inordinate number of duplicate messages. What is the most likely cause of these duplicate messages?

Options:

A. The message body for the sensor event is too large.
B. Your custom endpoint has an out-of-date SSL certificate.
C. The Cloud Pub/Sub topic has too many messages published to it.
D. Your custom endpoint is not acknowledging messages within the acknowledgement deadline.

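With a push subscription, a success response from the endpoint is the acknowledgement; if the response arrives after the acknowledgement deadline, Pub/Sub treats the delivery as failed and redelivers, which a slow endpoint experiences as duplicates. A toy simulation of that rule (the helper function and the 10-second figure are illustrative, not the Pub/Sub client API):

```python
# Hypothetical simulation of Pub/Sub push-delivery semantics: if the
# endpoint does not return a success response within the ack deadline,
# the message counts as unacknowledged and is redelivered, which shows
# up as duplicate messages at the endpoint.

def is_redelivered(response_seconds: float,
                   ack_deadline_seconds: float = 10.0) -> bool:
    """True if the push endpoint responded too slowly to count as an ack."""
    return response_seconds > ack_deadline_seconds

# A slow endpoint (e.g. doing heavy work inline) triggers duplicates;
# responding quickly and processing asynchronously avoids them.
slow = is_redelivered(30.0)   # endpoint takes 30 s
fast = is_redelivered(0.2)    # endpoint acks almost immediately
```

The practical fix implied by the question is to return the success response fast and defer the heavy per-event work to a background worker.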
Question # 25

You are building a report-only data warehouse where the data is streamed into BigQuery via the streaming API. Following Google's best practices, you have both a staging and a production table for the data. How should you design your data loading to ensure that there is only one master dataset without affecting performance on either the ingestion or reporting pieces?

Options:

A. Have a staging table that is an append-only model, and then update the production table every three hours with the changes written to staging.
B. Have a staging table that is an append-only model, and then update the production table every ninety minutes with the changes written to staging.
C. Have a staging table that moves the staged data over to the production table and deletes the contents of the staging table every three hours.
D. Have a staging table that moves the staged data over to the production table and deletes the contents of the staging table every thirty minutes.

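A common way to apply staged changes to a production table on a schedule, while keeping a single master dataset, is BigQuery's MERGE statement: one atomic operation that updates matched rows and inserts new ones. A sketch of such a statement assembled in Python (table and column names are hypothetical):

```python
# Hypothetical table and column names. The MERGE pattern applies staged
# changes to the production table in one atomic statement, so reporting
# always sees a single consistent master dataset while the staging
# table remains append-only for fast streaming ingestion.
staging = "mydataset.events_staging"
production = "mydataset.events"

merge_sql = f"""
MERGE `{production}` AS prod
USING `{staging}` AS stage
ON prod.event_id = stage.event_id
WHEN MATCHED THEN
  UPDATE SET prod.payload = stage.payload
WHEN NOT MATCHED THEN
  INSERT (event_id, payload) VALUES (stage.event_id, stage.payload)
"""
```

The statement would be submitted on the chosen schedule (for example via a scheduled query); how often to run it is exactly the trade-off the answer options explore.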
Question # 26

You are designing the architecture of your application to store data in Cloud Storage. Your application consists of pipelines that read data from a Cloud Storage bucket that contains raw data, and write the data to a second bucket after processing. You want to design an architecture with Cloud Storage resources that are capable of being resilient if a Google Cloud regional failure occurs. You want to minimize the recovery point objective (RPO) if a failure occurs, with no impact on applications that use the stored data. What should you do?

Options:

A. Adopt two regional Cloud Storage buckets, and update your application to write the output on both buckets.
B. Adopt multi-regional Cloud Storage buckets in your architecture.
C. Adopt two regional Cloud Storage buckets, and create a daily task to copy from one bucket to the other.
D. Adopt a dual-region Cloud Storage bucket, and enable turbo replication in your architecture.

Question # 27

Your company has a hybrid cloud initiative. You have a complex data pipeline that moves data between cloud provider services and leverages services from each of the cloud providers. Which cloud-native service should you use to orchestrate the entire pipeline?

Options:

A. Cloud Dataflow
B. Cloud Composer
C. Cloud Dataprep
D. Cloud Dataproc

Question # 28

You have a BigQuery table that ingests data directly from a Pub/Sub subscription. The ingested data is encrypted with a Google-managed encryption key. You need to meet a new organization policy that requires you to use keys from a centralized Cloud Key Management Service (Cloud KMS) project to encrypt data at rest. What should you do?

Options:

A. Create a new BigQuery table by using customer-managed encryption keys (CMEK), and migrate the data from the old BigQuery table.
B. Create a new BigQuery table and Pub/Sub topic by using customer-managed encryption keys (CMEK), and migrate the data from the old BigQuery table.
C. Create a new Pub/Sub topic with CMEK and use the existing BigQuery table with its Google-managed encryption key.
D. Use a Cloud KMS encryption key with Dataflow to ingest the existing Pub/Sub subscription to the existing BigQuery table.

Question # 29

You are working on a niche product in the image recognition domain. Your team has developed a model that is dominated by custom C++ TensorFlow ops your team has implemented. These ops are used inside your main training loop and are performing bulky matrix multiplications. It currently takes up to several days to train a model. You want to decrease this time significantly and keep the cost low by using an accelerator on Google Cloud. What should you do?

Options:

A. Use Cloud TPUs without any additional adjustment to your code.
B. Use Cloud TPUs after implementing GPU kernel support for your custom ops.
C. Use Cloud GPUs after implementing GPU kernel support for your custom ops.
D. Stay on CPUs, and increase the size of the cluster you're training your model on.

Question # 30

You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?

Options:

A. Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day.
B. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.
C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
D. Store your data in BigQuery. Use the metric as a primary key.

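The row-key idea in the Bigtable options is that concatenating sensor ID and timestamp turns the first access pattern (one sensor at one timestamp) into a direct single-row lookup. A sketch of that key scheme, with a plain dict standing in for the Bigtable table (the separator and zero-padding are illustrative choices):

```python
# Sketch of the row-key scheme: sensor ID plus timestamp in one key
# makes a point read of "this sensor at this second" a single-row
# lookup. A plain dict stands in for Bigtable here.

def row_key(sensor_id: str, epoch_seconds: int) -> str:
    # Zero-pad the timestamp so keys for one sensor sort
    # lexicographically in time order.
    return f"{sensor_id}#{epoch_seconds:012d}"

table = {}  # hypothetical stand-in for a Bigtable table
table[row_key("sensor-0042", 1_700_000_000)] = {"metric": 21.5}

value = table[row_key("sensor-0042", 1_700_000_000)]["metric"]
```

Keying by sensor ID and metric instead would give no way to address a specific timestamp directly, which is why the key components matter for the stated latency requirement.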
