Professional-Data-Engineer Practice Questions

Google Professional Data Engineer Exam

Last Update 1 day ago
Total Questions: 400

Dive into our fully updated and stable Professional-Data-Engineer practice test platform, featuring all the latest Google Cloud Certified exam questions added this week. Our preparation tool is more than just a Google study aid; it's a strategic advantage.

Our free Google Cloud Certified practice questions are crafted to reflect the domains and difficulty of the actual exam. The detailed rationales explain the 'why' behind each answer, reinforcing key concepts covered by the Professional-Data-Engineer exam. Use this test to pinpoint the areas where you need to focus your study.

Question # 51

MJTelco is building a custom interface to share data. They have these requirements:

They need to do aggregations over their petabyte-scale datasets.

They need to scan specific time range rows with a very fast response time (milliseconds).

Which combination of Google Cloud Platform products should you recommend?

Options:

A. Cloud Datastore and Cloud Bigtable

B. Cloud Bigtable and Cloud SQL

C. BigQuery and Cloud Bigtable

D. BigQuery and Cloud Storage

Question # 52

Your company produces 20,000 files every hour. Each data file is formatted as a comma-separated values (CSV) file that is less than 4 KB. All files must be ingested on Google Cloud Platform before they can be processed. Your company site has a 200 ms latency to Google Cloud, and your Internet connection bandwidth is limited to 50 Mbps. You currently deploy a secure FTP (SFTP) server on a virtual machine in Google Compute Engine as the data ingestion point. A local SFTP client runs on a dedicated machine to transmit the CSV files as is. The goal is to make reports with data from the previous day available to the executives by 10:00 a.m. each day. This design is barely able to keep up with the current volume, even though the bandwidth utilization is rather low.

You are told that due to seasonality, your company expects the number of files to double for the next three months. Which two actions should you take? (Choose two.)

Options:

A. Introduce data compression for each file to increase the rate of file transfer.

B. Contact your internet service provider (ISP) to increase your maximum bandwidth to at least 100 Mbps.

C. Redesign the data ingestion process to use the gsutil tool to send the CSV files to a storage bucket in parallel.

D. Assemble 1,000 files into a tape archive (TAR) file. Transmit the TAR files instead, and disassemble the CSV files in the cloud upon receiving them.

E. Create an S3-compatible storage endpoint in your network, and use Google Cloud Storage Transfer Service to transfer on-premises data to the designated storage bucket.
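The figures in this scenario can be checked with some quick arithmetic. The sketch below is only a rough model: it assumes each file costs one network round trip when sent one at a time (real SFTP transfers typically need more), and the 16 parallel streams are a hypothetical value chosen for illustration.

```python
# Back-of-the-envelope figures for the scenario above.
FILES_PER_HOUR = 20_000
FILE_SIZE_BYTES = 4_000      # upper bound, since each file is "less than 4 KB"
LINK_MBPS = 50
RTT_SECONDS = 0.2            # 200 ms latency to Google Cloud
ROUND_TRIPS_PER_FILE = 1     # assumption; not stated in the question


def bandwidth_utilization(files_per_hour, file_size_bytes, link_mbps):
    """Fraction of the link used if the payload is spread evenly over one hour."""
    bits_per_second = files_per_hour * file_size_bytes * 8 / 3600
    return bits_per_second / (link_mbps * 1_000_000)


def round_trip_wait(files_per_hour, rtt, round_trips_per_file, parallel_streams=1):
    """Seconds spent waiting on round trips for one hour's worth of files."""
    return files_per_hour * rtt * round_trips_per_file / parallel_streams


utilization = bandwidth_utilization(FILES_PER_HOUR, FILE_SIZE_BYTES, LINK_MBPS)
serial_wait = round_trip_wait(FILES_PER_HOUR, RTT_SECONDS, ROUND_TRIPS_PER_FILE)
parallel_wait = round_trip_wait(
    FILES_PER_HOUR, RTT_SECONDS, ROUND_TRIPS_PER_FILE, parallel_streams=16
)

print(f"link utilization: {utilization:.2%}")
print(f"round-trip wait, one file at a time: {serial_wait:.0f} s per hour of files")
print(f"round-trip wait, 16 parallel streams: {parallel_wait:.0f} s per hour of files")
```

Under these assumptions the payload uses well under 1% of the 50 Mbps link, while the per-file round trips alone add up to about 4,000 seconds for each hour of files. That matches the scenario's description of low bandwidth utilization combined with a pipeline that struggles to keep up.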

Question # 53

You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with multiple properties, some of which can take on multiple values. For example, in the entity ‘Movie’ the property ‘actors’ and the property ‘tags’ have multiple values but the property ‘date released’ does not. A typical query would ask for all movies with actor= ordered by date_released or all movies with tag=Comedy ordered by date_released. How should you avoid a combinatorial explosion in the number of indexes?

Options:

A. Option A

B. Option B

C. Option C

D. Option D
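To see where the combinatorial explosion comes from, the sketch below counts index entries for a single hypothetical Movie entity. It assumes that a composite index covering several multi-valued properties needs one entry per combination of their values, and it compares that with two narrower indexes, one per query. The entity values and function names are made up for illustration.

```python
from itertools import product

# Hypothetical entity: 5 actors, 4 tags, one release date.
movie = {
    "actors": ["Actor 1", "Actor 2", "Actor 3", "Actor 4", "Actor 5"],
    "tags": ["Comedy", "Drama", "Classic", "Family"],
    "date_released": "2015-01-01",
}


def combined_index_entries(entity):
    """Entries for one composite index over (actors, tags, date_released)."""
    return [
        (actor, tag, entity["date_released"])
        for actor, tag in product(entity["actors"], entity["tags"])
    ]


def separate_index_entries(entity):
    """Entries for two indexes: (actors, date_released) and (tags, date_released)."""
    by_actor = [("actors", a, entity["date_released"]) for a in entity["actors"]]
    by_tag = [("tags", t, entity["date_released"]) for t in entity["tags"]]
    return by_actor + by_tag


combined_count = len(combined_index_entries(movie))
separate_count = len(separate_index_entries(movie))
print(f"one combined index: {combined_count} entries")
print(f"two separate indexes: {separate_count} entries")
```

The combined index grows with the product of the value counts, while the two separate indexes grow with their sum. Each of the two queries in the question filters on only one multi-valued property plus date_released.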

Question # 54

Your company is loading comma-separated values (CSV) files into Google BigQuery. The data is imported successfully; however, the imported data does not match the source file byte for byte. What is the most likely cause of this problem?

Options:

A. The CSV data loaded in BigQuery is not flagged as CSV.

B. The CSV data has invalid rows that were skipped on import.

C. The CSV data loaded in BigQuery is not using BigQuery’s default encoding.

D. The CSV data has not gone through an ETL phase before loading into BigQuery.
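As background for the encoding option, the sketch below shows how the same characters can produce different bytes under different encodings. BigQuery's default encoding for CSV is UTF-8. The sample row is hypothetical, and the code only demonstrates the byte-level difference; it does not model BigQuery's load behavior.

```python
# Hypothetical CSV row containing non-ASCII characters.
text = "café,naïve\n"

# Bytes as they might sit in a source file saved as ISO-8859-1.
source_bytes = text.encode("iso-8859-1")

# The same characters after being decoded and stored as UTF-8.
utf8_bytes = source_bytes.decode("iso-8859-1").encode("utf-8")

same_text = utf8_bytes.decode("utf-8") == text
same_bytes = utf8_bytes == source_bytes

print(f"source file: {len(source_bytes)} bytes")
print(f"UTF-8 copy:  {len(utf8_bytes)} bytes")
print(f"same characters: {same_text}, same bytes: {same_bytes}")
```

The characters survive the conversion intact, yet the byte sequences differ because each accented character takes one byte in ISO-8859-1 and two bytes in UTF-8.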

Question # 55

You work for a manufacturing plant that batches application log files together into a single log file once a day at 2:00 AM. You have written a Google Cloud Dataflow job to process that log file. You need to make sure the log file is processed once per day as inexpensively as possible. What should you do?

Options:

A. Change the processing job to use Google Cloud Dataproc instead.

B. Manually start the Cloud Dataflow job each morning when you get into the office.

C. Create a cron job with Google App Engine Cron Service to run the Cloud Dataflow job.

D. Configure the Cloud Dataflow job as a streaming job so that it processes the log data immediately.

Question # 56

You are designing the database schema for a machine learning-based food ordering service that will predict what users want to eat. Here is some of the information you need to store:

The user profile: What the user likes and doesn’t like to eat

The user account information: Name, address, preferred meal times

The order information: When orders are made, from where, to whom

The database will be used to store all the transactional data of the product. You want to optimize the data schema. Which Google Cloud Platform product should you use?

Options:

A. BigQuery

B. Cloud SQL

C. Cloud Bigtable

D. Cloud Datastore

Question # 57

You work for a large fast food restaurant chain with over 400,000 employees. You store employee information in Google BigQuery in a Users table consisting of a FirstName field and a LastName field. A member of IT is building an application and asks you to modify the schema and data in BigQuery so the application can query a FullName field consisting of the value of the FirstName field concatenated with a space, followed by the value of the LastName field for each employee. How can you make that data available while minimizing cost?

Options:

A. Create a view in BigQuery that concatenates the FirstName and LastName field values to produce the FullName.

B. Add a new column called FullName to the Users table. Run an UPDATE statement that updates the FullName column for each user with the concatenation of the FirstName and LastName values.

C. Create a Google Cloud Dataflow job that queries BigQuery for the entire Users table, concatenates the FirstName value and LastName value for each user, and loads the proper values for FirstName, LastName, and FullName into a new table in BigQuery.

D. Use BigQuery to export the data for the table to a CSV file. Create a Google Cloud Dataproc job to process the CSV file and output a new CSV file containing the proper values for FirstName, LastName and FullName. Run a BigQuery load job to load the new CSV file into BigQuery.
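For readers unfamiliar with views, the sketch below shows the general idea behind the view described in option A: the concatenated column is computed at query time and no extra copy of the data is stored. It uses Python's built-in sqlite3 module as a local stand-in, so the SQL dialect is SQLite's rather than BigQuery's. The view name UsersWithFullName and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a local stand-in; this is not BigQuery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Alan", "Turing")],  # hypothetical sample rows
)

# The view stores only the query definition, not a second copy of the rows.
conn.execute(
    """
    CREATE VIEW UsersWithFullName AS
    SELECT FirstName, LastName, FirstName || ' ' || LastName AS FullName
    FROM Users
    """
)

full_names = [
    row[0]
    for row in conn.execute(
        "SELECT FullName FROM UsersWithFullName ORDER BY LastName"
    )
]
print(full_names)
conn.close()
```

The base table is left unchanged, and any application that queries the view sees FullName as if it were a regular column.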

Question # 58

Your company is running their first dynamic campaign, serving different offers by analyzing real-time data during the holiday season. The data scientists are collecting terabytes of data that rapidly grows every hour during their 30-day campaign. They are using Google Cloud Dataflow to preprocess the data and collect the feature (signals) data that is needed for the machine learning model in Google Cloud Bigtable. The team is observing suboptimal performance with reads and writes of their initial load of 10 TB of data. They want to improve this performance while minimizing cost. What should they do?

Options:

A. Redefine the schema by evenly distributing reads and writes across the row space of the table.

B. The performance issue should be resolved over time as the size of the Bigtable cluster is increased.

C. Redesign the schema to use a single row key to identify values that need to be updated frequently in the cluster.

D. Redesign the schema to use row keys based on numeric IDs that increase sequentially per user viewing the offers.
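The options above turn on how row keys are spread across the table. Cloud Bigtable keeps rows sorted by row key and serves contiguous key ranges from different nodes, so the sketch below simulates that with a toy model. The split points, the key formats, and the hash-prefix scheme are all assumptions made for illustration, not a recommended schema.

```python
import bisect
import hashlib
from collections import Counter


def tablet_for(row_key, split_points):
    """Index of the contiguous key range (toy 'tablet') that holds row_key."""
    return bisect.bisect_right(split_points, row_key)


def sequential_key(n):
    """Row key based on a zero-padded, sequentially increasing numeric ID."""
    return f"{n:010d}"


def hashed_key(n):
    """Row key with a short hash prefix so consecutive IDs sort far apart."""
    prefix = hashlib.md5(str(n).encode()).hexdigest()[:4]
    return f"{prefix}#{n:010d}"


# A burst of 1,000 new writes with consecutive IDs.
new_ids = range(1_000_000, 1_001_000)

# Hypothetical split points dividing each key space into four ranges.
sequential_splits = ["0000250000", "0000500000", "0000750000"]
hashed_splits = ["4", "8", "c"]

sequential_load = Counter(
    tablet_for(sequential_key(n), sequential_splits) for n in new_ids
)
hashed_load = Counter(tablet_for(hashed_key(n), hashed_splits) for n in new_ids)

print("sequential keys, writes per range:", dict(sequential_load))
print("hash-prefixed keys, writes per range:", dict(hashed_load))
```

In this toy model every write with a sequential key lands in the last range, while the hash-prefixed keys are spread over all four. Prefixing with a hash is only one way to spread keys, and it gives up the ability to scan rows in ID order.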

Question # 59

Your company is in a highly regulated industry. One of your requirements is to ensure individual users have access only to the minimum amount of information required to do their jobs. You want to enforce this requirement with Google BigQuery. Which three approaches can you take? (Choose three.)

Options:

A. Disable writes to certain tables.

B. Restrict access to tables by role.

C. Ensure that the data is encrypted at all times.

D. Restrict BigQuery API access to approved users.

E. Segregate data across multiple tables or databases.

F. Use Google Stackdriver Audit Logging to determine policy violations.

Question # 60

Your company built a TensorFlow neural-network model with a large number of neurons and layers. The model fits well for the training data. However, when tested against new data, it performs poorly. What method can you employ to address this?

Options:

A. Threading

B. Serialization

C. Dropout Methods

D. Dimensionality Reduction
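Of the terms listed, dropout is a regularization technique for neural networks: during training, each activation is randomly zeroed with some probability so the network cannot rely too heavily on any single unit. The sketch below is a minimal pure-Python version of "inverted" dropout, written for illustration only. It is not TensorFlow's implementation, and the function name and sample values are hypothetical.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors.

    Dividing kept values by (1 - rate) keeps the expected sum unchanged,
    which is the "inverted dropout" convention.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_probability = 1.0 - rate
    return [
        a / keep_probability if rng.random() >= rate else 0.0
        for a in activations
    ]


rng = random.Random(0)               # seeded so repeated runs give the same mask
layer_output = [0.5, 1.0, 1.5, 2.0]  # hypothetical activations from one layer

training_output = dropout(layer_output, rate=0.5, rng=rng)
inference_output = dropout(layer_output, rate=0.0, rng=rng)

print("training: ", training_output)
print("inference:", inference_output)
```

Dropout is applied only during training. Here a rate of 0.0 stands in for inference, where the activations pass through unchanged.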
