

Professional-Machine-Learning-Engineer Practice Exam Questions and Answers

Google Professional Machine Learning Engineer

Last Update: 1 day ago
Total Questions: 266

Professional-Machine-Learning-Engineer is stable now, with all the latest exam questions added 1 day ago. Just download our full package and start your journey toward the Google Professional Machine Learning Engineer certification. All of these Google Professional-Machine-Learning-Engineer practice exam questions are real and verified by our experts in the related industry fields.

Question # 1

You are training an ML model using data stored in BigQuery that contains several values that are considered Personally Identifiable Information (PII). You need to reduce the sensitivity of the dataset before training your model. Every column is critical to your model. How should you proceed?

Options:

A.  

Using Dataflow, ingest the columns with sensitive data from BigQuery, and then randomize the values in each sensitive column.

B.  

Use the Cloud Data Loss Prevention (DLP) API to scan for sensitive data, and use Dataflow with the DLP API to encrypt sensitive values with Format Preserving Encryption

C.  

Use the Cloud Data Loss Prevention (DLP) API to scan for sensitive data, and use Dataflow to replace all sensitive data by using the encryption algorithm AES-256 with a salt.

D.  

Before training, use BigQuery to select only the columns that do not contain sensitive data. Create an authorized view of the data so that sensitive values cannot be accessed by unauthorized individuals.

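For context on the DLP-based options above, the sketch below shows a minimal call to the Cloud Data Loss Prevention API from Python to de-identify sensitive values before training. It is only an illustration under assumptions: the project ID, info types, and the simple replace-with-info-type transformation are placeholders; option B would instead configure a CryptoReplaceFfxFpeConfig (format-preserving encryption) transformation.

from google.cloud import dlp_v2

# Hedged sketch: de-identify free text with the DLP API (placeholder project and info types).
dlp = dlp_v2.DlpServiceClient()
project_id = "my-project"  # hypothetical project ID

response = dlp.deidentify_content(
    request={
        "parent": f"projects/{project_id}",
        "inspect_config": {
            "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}]
        },
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    # Simplest primitive transformation; an FPE config would go here for option B.
                    {"primitive_transformation": {"replace_with_info_type_config": {}}}
                ]
            }
        },
        "item": {"value": "Contact jane@example.com or 555-0100"},
    }
)
print(response.item.value)  # sensitive values replaced with their info type names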
Question # 2

You work for a semiconductor manufacturing company. You need to create a real-time application that automates the quality control process. High-definition images of each semiconductor are taken at the end of the assembly line in real time. The photos are uploaded to a Cloud Storage bucket along with tabular data that includes each semiconductor's batch number, serial number, dimensions, and weight. You need to configure model training and serving while maximizing model accuracy. What should you do?

Options:

A.  

Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model.

Deploy the model and configure Pub/Sub to publish a message when an image is categorized into the failing class.

B.  

Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Schedule a daily batch prediction job that publishes a Pub/Sub message when the job completes.

C.  

Convert the images into an embedding representation. Import this data into BigQuery, and train a BigQuery ML k-means clustering model with two clusters. Deploy the model and configure Pub/Sub to publish a message when a semiconductor's data is categorized into the failing cluster.

D.  

Import the tabular data into BigQuery, use Vertex AI Data Labeling Service to label the data, and train an AutoML tabular classification model. Deploy the model and configure Pub/Sub to publish a message when a semiconductor's data is categorized into the failing class.

Question # 3

You are developing a model to identify traffic signs in images extracted from videos taken from the dashboard of a vehicle. You have a dataset of 100,000 images that were cropped to show one out of ten different traffic signs. The images have been labeled accordingly for model training and are stored in a Cloud Storage bucket. You need to be able to tune the model during each training run. How should you train the model?

Options:

A.  

Train a model for object detection by using Vertex AI AutoML.

B.  

Train a model for image classification by using Vertex AI AutoML.

C.  

Develop the model training code for object detection, and train a model by using Vertex AI custom training.

D.  

Develop the model training code for image classification, and train a model by using Vertex AI custom training.

Question # 4

You are an ML engineer at a travel company. You have been researching customers’ travel behavior for many years, and you have deployed models that predict customers’ vacation patterns. You have observed that customers’ vacation destinations vary based on seasonality and holidays; however, these seasonal variations are similar across years. You want to quickly and easily store and compare the model versions and performance statistics across years. What should you do?

Options:

A.  

Store the performance statistics in Cloud SQL. Query that database to compare the performance statistics across the model versions.

B.  

Create versions of your models for each season per year in Vertex AI. Compare the performance statistics across the models in the Evaluate tab of the Vertex AI UI.

C.  

Store the performance statistics of each pipeline run in Kubeflow under an experiment for each season per year. Compare the results across the experiments in the Kubeflow UI.

D.  

Store the performance statistics of each version of your models using seasons and years as events in Vertex ML Metadata. Compare the results across the slices.

Question # 5

You are training and deploying updated versions of a regression model with tabular data by using Vertex AI Pipelines, Vertex AI Training, Vertex AI Experiments, and Vertex AI Endpoints. The model is deployed in a Vertex AI endpoint, and your users call the model by using the Vertex AI endpoint. You want to receive an email when the feature data distribution changes significantly, so you can retrigger the training pipeline and deploy an updated version of your model. What should you do?

Options:

A.  

Use Vertex AI Model Monitoring. Enable prediction drift monitoring on the endpoint, and specify a notification email.

B.  

In Cloud Logging, create a logs-based alert using the logs in the Vertex AI endpoint. Configure Cloud Logging to send an email when the alert is triggered.

C.  

In Cloud Monitoring, create a logs-based metric and a threshold alert for the metric. Configure Cloud Monitoring to send an email when the alert is triggered.

D.  

Export the container logs of the endpoint to BigQuery. Create a Cloud Function to run a SQL query over the exported logs and send an email. Use Cloud Scheduler to trigger the Cloud Function.

Question # 6

You are creating a deep neural network classification model using a dataset with categorical input values. Certain columns have a cardinality greater than 10,000 unique values. How should you encode these categorical values as input into the model?

Options:

A.  

Convert each categorical value into an integer value.

B.  

Convert the categorical string data to one-hot hash buckets.

C.  

Map the categorical variables into a vector of boolean values.

D.  

Convert each categorical value into a run-length encoded string.

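Because the columns in this question have more than 10,000 unique values, the sketch below illustrates the hashed-bucket approach from option B with Keras preprocessing layers. It is a minimal illustration under assumptions: the number of hash bins, embedding size, and network head are arbitrary placeholders rather than values from the question.

import tensorflow as tf

# Hedged sketch: hash a high-cardinality string feature into buckets, then learn an embedding on top.
inputs = tf.keras.Input(shape=(1,), dtype=tf.string, name="category")
hashed = tf.keras.layers.Hashing(num_bins=5000)(inputs)              # bucket count is an assumption
embedded = tf.keras.layers.Embedding(input_dim=5000, output_dim=32)(hashed)
flat = tf.keras.layers.Flatten()(embedded)
output = tf.keras.layers.Dense(1, activation="sigmoid")(flat)

model = tf.keras.Model(inputs, output)
model.compile(optimizer="adam", loss="binary_crossentropy")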
Question # 7

Your team is building a convolutional neural network (CNN)-based architecture from scratch. The preliminary experiments running on your on-premises CPU-only infrastructure were encouraging, but have slow convergence. You have been asked to speed up model training to reduce time-to-market. You want to experiment with virtual machines (VMs) on Google Cloud to leverage more powerful hardware. Your code does not include any manual device placement and has not been wrapped in Estimator model-level abstraction. Which environment should you train your model on?

Options:

A.  

A VM on Compute Engine and 1 TPU with all dependencies installed manually.

B.  

A VM on Compute Engine and 8 GPUs with all dependencies installed manually.

C.  

A Deep Learning VM with an n1-standard-2 machine and 1 GPU with all libraries pre-installed.

D.  

A Deep Learning VM with more powerful CPU e2-highcpu-16 machines with all libraries pre-installed.

Question # 8

You have developed an AutoML tabular classification model that identifies high-value customers who interact with your organization's website. You plan to deploy the model to a new Vertex AI endpoint that will integrate with your website application. You expect higher traffic to the website during nights and weekends. You need to configure the model endpoint's deployment settings to minimize latency and cost. What should you do?

Options:

A.  

Configure the model deployment settings to use an n1-standard-32 machine type.

B.  

Configure the model deployment settings to use an n1-standard-4 machine type. Set the minReplicaCount value to 1 and the maxReplicaCount value to 8.

C.  

Configure the model deployment settings to use an n1-standard-4 machine type and a GPU accelerator. Set the minReplicaCount value to 1 and the maxReplicaCount value to 4.

D.  

Configure the model deployment settings to use an n1-standard-8 machine type and a GPU accelerator.

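As a point of reference for option B above, the sketch below shows how a deployment with autoscaling between one and eight replicas might look with the Vertex AI Python SDK. It is only an illustrative sketch: the project, region, and model resource name are hypothetical placeholders.

from google.cloud import aiplatform

# Hedged sketch: deploy a registered model with n1-standard-4 machines and autoscaling (1-8 replicas).
aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")
endpoint = model.deploy(
    machine_type="n1-standard-4",
    min_replica_count=1,   # keeps one replica warm to minimize latency
    max_replica_count=8,   # scales out for night and weekend traffic peaks
)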
Question # 9

You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model's final layer softmax threshold to increase precision?

Options:

A.  

Increase the recall

B.  

Decrease the recall.

C.  

Increase the number of false positives

D.  

Decrease the number of false negatives

Question # 10

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.  

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.  

Separate each data scientist’s work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.  

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.  

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using

Question # 11

You are developing an ML model to predict house prices. While preparing the data, you discover that an important predictor variable, distance from the closest school, is often missing and does not have high variance. Every instance (row) in your data is important. How should you handle the missing data?

Options:

A.  

Delete the rows that have missing values.

B.  

Apply feature crossing with another column that does not have missing values.

C.  

Predict the missing values using linear regression.

D.  

Replace the missing values with zeros.

Question # 12

You are developing a process for training and running your custom model in production. You need to be able to show lineage for your model and predictions. What should you do?

Options:

A.  

1. Create a Vertex AI managed dataset.

2. Use a Vertex AI training pipeline to train your model.

3. Generate batch predictions in Vertex AI.

B.  

1. Use a Vertex AI Pipelines custom training job component to train your model.

2. Generate predictions by using a Vertex AI Pipelines model batch predict component.

C.  

1. Upload your dataset to BigQuery.

2. Use a Vertex AI custom training job to train your model.

3. Generate predictions by using Vertex AI SDK custom prediction routines.

D.  

1. Use Vertex AI Experiments to train your model.

2. Register your model in Vertex AI Model Registry.

3. Generate batch predictions in Vertex AI.

Question # 13

You are a data scientist at an industrial equipment manufacturing company. You are developing a regression model to estimate the power consumption in the company's manufacturing plants based on sensor data collected from all of the plants. The sensors collect tens of millions of records every day. You need to schedule daily training runs for your model that use all the data collected up to the current date. You want your model to scale smoothly and require minimal development work. What should you do?

Options:

A.  

Develop a custom TensorFlow regression model, and optimize it using Vertex AI Training.

B.  

Develop a regression model using BigQuery ML.

C.  

Develop a custom scikit-learn regression model, and optimize it using Vertex AI Training.

D.  

Develop a custom PyTorch regression model, and optimize it using Vertex AI Training.

Question # 14

You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained. You will use Vertex AI Pipelines to run the pipeline. You need to decide which Google Cloud pipeline components to use. What components should you choose?

Options:

A.  

B.  

C.  

D.  

Question # 15

You work for a retail company that is using a regression model built with BigQuery ML to predict product sales. This model is being used to serve online predictions. Recently, you developed a new version of the model that uses a different architecture (custom model). Initial analysis revealed that both models are performing as expected. You want to deploy the new version of the model to production and monitor the performance over the next two months. You need to minimize the impact to the existing and future model users. How should you deploy the model?

Options:

A.  

Import the new model to the same Vertex AI Model Registry as a different version of the existing model. Deploy the new model to the same Vertex AI endpoint as the existing model, and use traffic splitting to route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

B.  

Import the new model to the same Vertex AI Model Registry as the existing model. Deploy the models to one Vertex AI endpoint. Route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

C.  

Import the new model to the same Vertex AI Model Registry as the existing model. Deploy each model to a separate Vertex AI endpoint.

D.  

Deploy the new model to a separate Vertex AI endpoint. Create a Cloud Run service that routes the prediction requests to the corresponding endpoints based on the input feature values.

Question # 16

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

Options:

A.  

Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

B.  

Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

C.  

Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables.

D.  

Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.

Question # 17

You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?

Options:

A.  

Randomly redistribute the data, with 70% for the training set and 30% for the test set

B.  

Use sparse representation in the test set

C.  

Apply one-hot encoding on the categorical variables in the test data.

D.  

Collect more data representing all categories

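A short sketch to illustrate the idea behind option C: fit the encoder on the training set only, then reuse it on the test set so both splits share the same feature space even when a category appears in only one of them. The toy color data and the use of scikit-learn's OneHotEncoder are assumptions for illustration.

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hedged sketch: encode train and test with the same fitted encoder.
train = np.array([["red"], ["blue"], ["green"]])
test = np.array([["blue"], ["yellow"]])  # "yellow" never appears in training

encoder = OneHotEncoder(handle_unknown="ignore")  # unseen categories become all-zero rows
encoder.fit(train)

print(encoder.transform(test).toarray())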
Question # 18

You have built a model that is trained on data stored in Parquet files. You access the data through a Hive table hosted on Google Cloud. You preprocessed these data with PySpark and exported it as a CSV file into Cloud Storage. After preprocessing, you execute additional steps to train and evaluate your model. You want to parametrize this model training in Kubeflow Pipelines. What should you do?

Options:

A.  

Remove the data transformation step from your pipeline.

B.  

Containerize the PySpark transformation step, and add it to your pipeline.

C.  

Add a ContainerOp to your pipeline that spins a Dataproc cluster, runs a transformation, and then saves the transformed data in Cloud Storage.

D.  

Deploy Apache Spark at a separate node pool in a Google Kubernetes Engine cluster. Add a ContainerOp to your pipeline that invokes a corresponding transformation job for this Spark instance.

Question # 19

You have recently used TensorFlow to train a classification model on tabular data. You have created a Dataflow pipeline that can transform several terabytes of data into training or prediction datasets consisting of TFRecords. You now need to productionize the model, and you want the predictions to be automatically uploaded to a BigQuery table on a weekly schedule. What should you do?

Options:

A.  

Import the model into Vertex AI and deploy it to a Vertex AI endpoint. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components.

B.  

Import the model into Vertex AI and deploy it to a Vertex AI endpoint. Create a Dataflow pipeline that reuses the data processing logic, sends requests to the endpoint, and then uploads predictions to a BigQuery table.

C.  

Import the model into Vertex AI. On Vertex AI Pipelines, create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components.

D.  

Import the model into BigQuery. Implement the data processing logic in a SQL query. On Vertex AI Pipelines, create a pipeline that uses the BigqueryQueryJobOp and the BigqueryPredictModelJobOp components.

Question # 20

You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?

Options:

A.  

Use Data Fusion's GUI to build the transformation pipelines, and then write the data into BigQuery

B.  

Convert your PySpark into SparkSQL queries to transform the data and then run your pipeline on Dataproc to write the data into BigQuery.

C.  

Ingest your data into Cloud SQL, convert your PySpark commands into SQL queries to transform the data, and then use federated queries from BigQuery for machine learning.

D.  

Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table

Question # 21

You work for a company that is developing an application to help users with meal planning. You want to use machine learning to scan a corpus of recipes and extract each ingredient (e.g., carrot, rice, pasta) and each kitchen cookware item (e.g., bowl, pot, spoon) mentioned. Each recipe is saved in an unstructured text file. What should you do?

Options:

A.  

Create a text dataset on Vertex AI for entity extraction. Create two entities called "ingredient" and "cookware", and label at least 200 examples of each entity. Train an AutoML entity extraction model to extract occurrences of these entity types. Evaluate performance on a holdout dataset.

B.  

Create a multi-label text classification dataset on Vertex AI. Create a test dataset, and label each recipe that corresponds to its ingredients and cookware. Train a multi-class classification model. Evaluate the model's performance on a holdout dataset.

C.  

Use the Entity Analysis method of the Natural Language API to extract the ingredients and cookware from each recipe. Evaluate the model's performance on a prelabeled dataset.

D.  

Create a text dataset on Vertex AI for entity extraction. Create as many entities as there are different ingredients and cookware. Train an AutoML entity extraction model to extract those entities. Evaluate the model's performance on a holdout dataset.

Question # 22

You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members on your team prefer a codeless interface for building Extract, Transform, Load (ETL) processes. Which service should you use?

Options:

A.  

Dataflow

B.  

Dataprep

C.  

Apache Flink

D.  

Cloud Data Fusion

Question # 23

You are training an ML model on a large dataset. You are using a TPU to accelerate the training process. You notice that the training process is taking longer than expected. You discover that the TPU is not reaching its full capacity. What should you do?

Options:

A.  

Increase the learning rate

B.  

Increase the number of epochs

C.  

Decrease the learning rate

D.  

Increase the batch size

Question # 24

You want to migrate a scikit-learn classifier model to TensorFlow. You plan to train the TensorFlow classifier model using the same training set that was used to train the scikit-learn model, and then compare the performances using a common test set. You want to use the Vertex AI Python SDK to manually log the evaluation metrics of each model and compare them based on their F1 scores and confusion matrices. How should you log the metrics?

Options:

A.  

B.  

C.  

D.  

Question # 25

You work for an online travel agency that also sells advertising placements on its website to other companies. You have been asked to predict the most relevant web banner that a user should see next. Security is important to your company. The model latency requirements are 300ms@p99, the inventory is thousands of web banners, and your exploratory analysis has shown that navigation context is a good predictor. You want to implement the simplest solution. How should you configure the prediction pipeline?

Options:

A.  

Embed the client on the website, and then deploy the model on AI Platform Prediction.

B.  

Embed the client on the website, deploy the gateway on App Engine, and then deploy the model on AI Platform Prediction.

C.  

Embed the client on the website, deploy the gateway on App Engine, deploy the database on Cloud Bigtable for writing and for reading the user's navigation context, and then deploy the model on AI Platform Prediction.

D.  

Embed the client on the website, deploy the gateway on App Engine, deploy the database on Memorystore for writing and for reading the user’s navigation context, and then deploy the model on Google Kubernetes Engine.

Question # 26

You need to train a natural language model to perform text classification on product descriptions that contain millions of examples and 100,000 unique words. You want to preprocess the words individually so that they can be fed into a recurrent neural network. What should you do?

Options:

A.  

Create a one-hot encoding of words, and feed the encodings into your model.

B.  

Identify word embeddings from a pre-trained model, and use the embeddings in your model.

C.  

Sort the words by frequency of occurrence, and use the frequencies as the encodings in your model.

D.  

Assign a numerical value to each word from 1 to 100,000 and feed the values as inputs in your model.

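To make option B above more concrete, here is a minimal sketch of wiring pre-trained word embeddings into a recurrent model with Keras. It is an illustration under assumptions: the vocabulary size matches the question, but the embedding dimension, sequence length, and the randomly generated stand-in for a real pre-trained matrix are hypothetical placeholders.

import numpy as np
import tensorflow as tf

# Hedged sketch: frozen pre-trained embeddings feeding an LSTM text classifier.
vocab_size, embedding_dim, max_len = 100_000, 300, 128
pretrained_matrix = np.random.rand(vocab_size, embedding_dim)  # stand-in for real pre-trained vectors

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        vocab_size,
        embedding_dim,
        embeddings_initializer=tf.keras.initializers.Constant(pretrained_matrix),
        trainable=False,  # keep the pre-trained vectors frozen
    ),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.build(input_shape=(None, max_len))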
Question # 27

Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function. You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?

Options:

A.  

Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available.

B.  

Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team's budget.

C.  

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected.

D.  

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected.

Question # 28

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the performance of your classifier?

Options:

A.  

Write your data in TFRecords.

B.  

Z-normalize all the numeric features.

C.  

Oversample the fraudulent transaction 10 times.

D.  

Use one-hot encoding on all categorical features.

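For option C above, the sketch below shows one simple way to oversample the minority class in pandas before training. It is a hedged illustration: the file name and the is_fraud column are hypothetical, and the 10x replication is just the factor named in the option.

import pandas as pd

# Hedged sketch: oversample the ~1% fraudulent rows roughly 10 times.
df = pd.read_csv("transactions.csv")  # assumed file with a binary "is_fraud" label column

fraud = df[df["is_fraud"] == 1]
non_fraud = df[df["is_fraud"] == 0]

oversampled = pd.concat([
    non_fraud,
    fraud.sample(frac=10, replace=True, random_state=42),  # draw each fraud row ~10 times
])
oversampled = oversampled.sample(frac=1, random_state=42)  # shuffle the combined training set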
Question # 29

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

Options:

A.  

A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM

B.  

A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM

C.  

A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM

D.  

A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM

Question # 30

You work for a bank. You have been asked to develop an ML model that will support loan application decisions. You need to determine which Vertex AI services to include in the workflow. You want to track the model's training parameters and the metrics per training epoch. You plan to compare the performance of each version of the model to determine the best model based on your chosen metrics. Which Vertex AI services should you use?

Options:

A.  

Vertex ML Metadata, Vertex AI Feature Store, and Vertex AI Vizier

B.  

Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Vizier

C.  

Vertex ML Metadata, Vertex AI Experiments, and Vertex AI TensorBoard

D.  

Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI TensorBoard

Question # 31

You built a deep learning-based image classification model by using on-premises data. You want to use Vertex AI to deploy the model to production. Due to security concerns, you cannot move your data to the cloud. You are aware that the input data distribution might change over time. You need to detect model performance changes in production. What should you do?

Options:

A.  

Use Vertex Explainable AI for model explainability. Configure feature-based explanations.

B.  

Use Vertex Explainable AI for model explainability. Configure example-based explanations.

C.  

Create a Vertex AI Model Monitoring job. Enable training-serving skew detection for your model.

D.  

Create a Vertex AI Model Monitoring job. Enable feature attribution skew and drift detection for your model.

Question # 32

You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:

• Optimizer: SGD

• Image shape = 224x224

• Batch size = 64

• Epochs = 10

• Verbose = 2

During training, you encounter the following error: ResourceExhaustedError: Out of memory (OOM) when allocating tensor. What should you do?

Options:

A.  

Change the optimizer

B.  

Reduce the batch size

C.  

Change the learning rate

D.  

Reduce the image shape

Question # 33

Your organization manages an online message board. A few months ago, you discovered an increase in toxic language and bullying on the message board. You deployed an automated text classifier that flags certain comments as toxic or harmful. Now some users are reporting that benign comments referencing their religion are being misclassified as abusive. Upon further inspection, you find that your classifier's false positive rate is higher for comments that reference certain underrepresented religious groups. Your team has a limited budget and is already overextended. What should you do?

Options:

A.  

Add synthetic training data where those phrases are used in non-toxic ways

B.  

Remove the model and replace it with human moderation.

C.  

Replace your model with a different text classifier.

D.  

Raise the threshold for comments to be considered toxic or harmful

Question # 34

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

Options:

A.  

Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

B.  

Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C.  

Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D.  

Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.

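For option D above, the sketch below shows what packaging the local script and submitting it to Vertex AI Training with a pre-built container and 4 V100 GPUs could look like in the Vertex AI Python SDK. It is only a hedged sketch: the project, region, staging bucket, script name, and container image tag are illustrative assumptions.

from google.cloud import aiplatform

# Hedged sketch: run a local PyTorch training script on Vertex AI Training with 4 V100 GPUs.
aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

job = aiplatform.CustomTrainingJob(
    display_name="resnet50-training",
    script_path="train.py",  # assumed local training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13:latest",  # illustrative pre-built image
    requirements=["torchvision"],
)

job.run(
    replica_count=1,
    machine_type="n1-standard-16",
    accelerator_type="NVIDIA_TESLA_V100",
    accelerator_count=4,
)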
Question # 35

You are experimenting with a built-in distributed XGBoost model in Vertex AI Workbench user-managed notebooks. You use BigQuery to split your data into training and validation sets using the following queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8, but after deploying the model to production, you notice that your model performance has dropped to an AUC ROC value of 0.65. What problem is most likely occurring?

Options:

A.  

There is training-serving skew in your production environment.

B.  

There is not a sufficient amount of training data.

C.  

The tables that you created to hold your training and validation records share some records, and you may not be using all the data in your initial table.

D.  

The RAND() function generated a number that is less than 0.2 in both instances, so every record in the validation table will also be in the training table.

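As a contrast to the overlapping RAND()-based queries in this question, the sketch below shows one common way to create a deterministic, disjoint 80/20 split with FARM_FINGERPRINT, submitted through the BigQuery Python client. It is a hedged illustration: it assumes the table has a unique id column, which is hypothetical and not part of the question.

from google.cloud import bigquery

# Hedged sketch: deterministic, non-overlapping 80/20 split keyed on an assumed unique "id" column.
client = bigquery.Client(project="myproject")

client.query("""
    CREATE OR REPLACE TABLE `myproject.mydataset.training` AS
    SELECT * FROM `myproject.mydataset.mytable`
    WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) < 8
""").result()

client.query("""
    CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS
    SELECT * FROM `myproject.mydataset.mytable`
    WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) >= 8
""").result()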
Question # 36

You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do?

Choose 2 answers

Options:

A.  

Include a comprehensive set of demographic features.

B.  

Include only the demographic groups that most frequently interact with advertisements.

C.  

Collect a random sample of production traffic to build the training dataset.

D.  

Collect a stratified sample of production traffic to build the training dataset.

E.  

Conduct fairness tests across sensitive categories and demographics on the trained model.

Question # 37

You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection, because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance?

Choose 2 answers

Options:

A.  

Increase the score threshold.

B.  

Decrease the score threshold.

C.  

Add more positive examples to the training set.

D.  

Add more negative examples to the training set.

E.  

Reduce the maximum number of node hours for training.

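To see why lowering the score threshold (option B) trades precision for recall, the short sketch below sweeps thresholds over held-out predictions with scikit-learn. The y_true labels and y_scores values are hypothetical stand-ins for your evaluation data and model scores.

from sklearn.metrics import precision_recall_curve

# Hedged sketch: inspect precision and recall at different score thresholds.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.55, 0.30, 0.90]

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")  # recall rises as the threshold drops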
Question # 38

Your company stores a large number of audio files of phone calls made to your customer call center in an on-premises database. Each audio file is in wav format and is approximately 5 minutes long. You need to analyze these audio files for customer sentiment. You plan to use the Speech-to-Text API. You want to use the most efficient approach. What should you do?

Options:

A.  

1. Upload the audio files to Cloud Storage.

2. Call the speech:longrunningrecognize API endpoint to generate transcriptions.

3. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions.

B.  

1. Upload the audio files to Cloud Storage.

2. Call the speech:longrunningrecognize API endpoint to generate transcriptions.

3. Create a Cloud Function that calls the Natural Language API by using the analyzeSentiment method.

C.  

1. Iterate over your local files in Python.

2. Use the Speech-to-Text Python library to create a speech.RecognitionAudio object, and set the content to the audio file data.

3. Call the speech:recognize API endpoint to generate transcriptions.

4. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions.

D.  

1. Iterate over your local files in Python.

2. Use the Speech-to-Text Python library to create a speech.RecognitionAudio object, and set the content to the audio file data.

3. Call the speech:longrunningrecognize API endpoint to generate transcriptions.

4. Call the Natural Language API by using the analyzeSentiment method.

Question # 39

You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?

Options:

A.  

Use Vertex AI manual split, using the store name feature to assign one store for each set.

B.  

Use Vertex AI default data split.

C.  

Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable.

D.  

Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set.

Question # 40

You need to use TensorFlow to train an image classification model. Your dataset is located in a Cloud Storage directory and contains millions of labeled images. Before training the model, you need to prepare the data. You want the data preprocessing and model training workflow to be as efficient, scalable, and low-maintenance as possible. What should you do?

Options:

A.  

1. Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory.

2. Reference tf.data.TFRecordDataset in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

B.  

1. Create a Dataflow job that moves the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

2. Reference tfds.folder_dataset.ImageFolder in the training script.

3. Train the model by using Vertex AI Training with a V100 GPU.

C.  

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that creates sharded TFRecord files in a directory inside the instance.

3. Reference tf.data.TFRecordDataset in the training script.

4. Train the model by using the Workbench instance.

D.  

1. Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex AI Workbench instance.

2. Write a Python script that copies the images into multiple Cloud Storage directories, where each directory is named according to the corresponding label.

3. Reference tfds.folder_dataset.ImageFolder in the training script.

4. Train the model by using the Workbench instance.

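To illustrate the training-side half of option A above, the sketch below reads sharded TFRecord files from Cloud Storage with tf.data. It is a hedged sketch: the bucket path, feature specification, image format, and batch size are assumptions for illustration, not details from the question.

import tensorflow as tf

# Hedged sketch: input pipeline over sharded TFRecords written to Cloud Storage.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),  # assumed JPEG-encoded bytes
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, parsed["label"]

files = tf.data.Dataset.list_files("gs://my-bucket/tfrecords/train-*.tfrecord")
dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(10_000)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)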
Question # 41

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?

Options:

A.  

• Validate the accuracy of the model that you trained on preprocessed data

• Create a new model that uses the raw data and is available in real time

• Deploy the new model onto AI Platform for online prediction

B.  

• Send incoming prediction requests to a Pub/Sub topic

• Transform the incoming data using a Dataflow job

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

C.  

• Stream incoming prediction request data into Cloud Spanner

• Create a view to abstract your preprocessing logic.

• Query the view every second for new records

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue.

D.  

• Send incoming prediction requests to a Pub/Sub topic

• Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic.

• Implement your preprocessing logic in the Cloud Function

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

Question # 42

You are developing an image recognition model using PyTorch based on the ResNet50 architecture. Your code is working fine on your local laptop on a small subsample. Your full dataset has 200k labeled images. You want to quickly scale your training workload while minimizing cost. You plan to use 4 V100 GPUs. What should you do?

Options:

A.  

Configure a Compute Engine VM with all the dependencies that launches the training. Train your model with Vertex AI using a custom tier that contains the required GPUs.

B.  

Package your code with Setuptools, and use a pre-built container. Train your model with Vertex AI using a custom tier that contains the required GPUs.

C.  

Create a Vertex AI Workbench user-managed notebooks instance with 4 V100 GPUs, and use it to train your model.

D.  

Create a Google Kubernetes Engine cluster with a node pool that has 4 V100 GPUs. Prepare and submit a TFJob operator to this node pool.

Question # 43

You are an ML engineer at a bank. You have developed a binary classification model using AutoML Tables to predict whether a customer will make loan payments on time. The output is used to approve or reject loan requests. One customer’s loan request has been rejected by your model, and the bank’s risks department is asking you to provide the reasons that contributed to the model’s decision. What should you do?

Options:

A.  

Use local feature importance from the predictions.

B.  

Use the correlation with target values in the data summary page.

C.  

Use the feature importance percentages in the model evaluation page.

D.  

Vary features independently to identify the threshold per feature that changes the classification.

Question # 44

Your company manages an application that aggregates news articles from many different online sources and sends them to users. You need to build a recommendation model that will suggest articles to readers that are similar to the articles they are currently reading. Which approach should you use?

Options:

A.  

Create a collaborative filtering system that recommends articles to a user based on the user’s past behavior.

B.  

Encode all articles into vectors using word2vec, and build a model that returns articles based on vector similarity.

C.  

Build a logistic regression model for each user that predicts whether an article should be recommended to a user.

D.  

Manually label a few hundred articles, and then train an SVM classifier based on the manually classified articles that categorizes additional articles into their respective categories.

Question # 45

You work for an online publisher that delivers news articles to over 50 million readers. You have built an AI model that recommends content for the company’s weekly newsletter. A recommendation is considered successful if the article is opened within two days of the newsletter’s published date and the user remains on the page for at least one minute.

All the information needed to compute the success metric is available in BigQuery and is updated hourly. The model is trained on eight weeks of data, on average its performance degrades below the acceptable baseline after five weeks, and training time is 12 hours. You want to ensure that the model’s performance is above the acceptable baseline while minimizing cost. How should you monitor the model to determine when retraining is necessary?

Options:

A.  

Use Vertex AI Model Monitoring to detect skew of the input features with a sample rate of 100% and a monitoring frequency of two days.

B.  

Schedule a cron job in Cloud Tasks to retrain the model every week before the newsletter is created.

C.  

Schedule a weekly query in BigQuery to compute the success metric.

D.  

Schedule a daily Dataflow job in Cloud Composer to compute the success metric.

Question # 46

You are an ML engineer on an agricultural research team working on a crop disease detection tool to detect leaf rust spots in images of crops to determine the presence of a disease. These spots, which can vary in shape and size, are correlated to the severity of the disease. You want to develop a solution that predicts the presence and severity of the disease with high accuracy. What should you do?

Options:

A.  

Create an object detection model that can localize the rust spots.

B.  

Develop an image segmentation ML model to locate the boundaries of the rust spots.

C.  

Develop a template matching algorithm using traditional computer vision libraries.

D.  

Develop an image classification ML model to predict the presence of the disease.

Question # 47

Your organization's call center has asked you to develop a model that analyzes customer sentiments in each call. The call center receives over one million calls daily, and data is stored in Cloud Storage. The data collected must not leave the region in which the call originated, and no Personally Identifiable Information (PII) can be stored or analyzed. The data science team has a third-party tool for visualization and access which requires a SQL ANSI-2011 compliant interface. You need to select components for data processing and for analytics. How should the data pipeline be designed?


Options:

A.  

1 = Dataflow, 2 = BigQuery

B.  

1 = Pub/Sub, 2 = Datastore

C.  

1 = Dataflow, 2 = Cloud SQL

D.  

1 = Cloud Function, 2 = Cloud SQL

Question # 48

You created a model that uses BigQuery ML to perform linear regression. You need to retrain the model on the cumulative data collected every week. You want to minimize the development effort and the scheduling cost. What should you do?

Options:

A.  

Use BigQuery's scheduling service to run the model retraining query periodically.

B.  

Create a pipeline in Vertex AI Pipelines that executes the retraining query, and use the Cloud Scheduler API to run the query weekly.

C.  

Use Cloud Scheduler to trigger a Cloud Function every week that runs the query for retraining the model.

D.  

Use the BigQuery API Connector and Cloud Scheduler to trigger Workflows every week to retrain the model.

Question # 49

You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?

Options:

A.  

Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job

B.  

Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.

C.  

Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository

D.  

Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.

Question # 50

You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?

Options:

A.  

Redaction, reproducibility, and explainability

B.  

Traceability, reproducibility, and explainability

C.  

Federated learning, reproducibility, and explainability

D.  

Differential privacy, federated learning, and explainability

Question # 51

You are profiling the performance of your TensorFlow model training time and notice a performance issue caused by inefficiencies in the input data pipeline for a single 5 terabyte CSV file dataset on Cloud Storage. You need to optimize the input pipeline performance. Which action should you try first to increase the efficiency of your pipeline?

Options:

A.  

Preprocess the input CSV file into a TFRecord file.

B.  

Randomly select a 10 gigabyte subset of the data to train your model.

C.  

Split into multiple CSV files and use a parallel interleave transformation.

D.  

Set the reshuffle_each_iteration parameter to true in the tf.data.Dataset.shuffle method.

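For option C above, the sketch below shows how a parallel interleave over multiple CSV shards typically looks in tf.data. It is a hedged illustration: the shard pattern, header handling, column defaults, and batch size are assumptions rather than details from the question.

import tensorflow as tf

# Hedged sketch: read many CSV shards concurrently with an interleave transformation.
files = tf.data.Dataset.list_files("gs://my-bucket/data/part-*.csv")

dataset = files.interleave(
    lambda path: tf.data.TextLineDataset(path).skip(1),  # skip each shard's header row
    cycle_length=16,                                     # number of shards read concurrently (assumption)
    num_parallel_calls=tf.data.AUTOTUNE,
)

dataset = (
    dataset.map(
        lambda line: tf.io.decode_csv(line, record_defaults=[0.0, 0.0, 0.0]),  # assumed 3 float columns
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .batch(1024)
    .prefetch(tf.data.AUTOTUNE)
)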
Question # 52

You are developing a training pipeline for a new XGBoost classification model based on tabular data. The data is stored in a BigQuery table. You need to complete the following steps:

1. Randomly split the data into training and evaluation datasets in a 65/35 ratio.

2. Conduct feature engineering.

3. Obtain metrics for the evaluation dataset.

4. Compare models trained in different pipeline executions.

How should you execute these steps?

Options:

A.  

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare pipeline runs in Vertex AI Experiments.

B.  

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare models using the artifact lineage in Vertex ML Metadata.

C.  

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use a SQL view to apply feature engineering, and train the model using the data in that view.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.

D.  

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use the TRANSFORM clause to specify the feature engineering transformations, and train the model using the data in the table.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.

Question # 53

You were asked to investigate failures of a production line component based on sensor readings. After receiving the dataset, you discover that less than 1% of the readings are positive examples representing failure incidents. You have tried to train several classification models, but none of them converge. How should you resolve the class imbalance problem?

Options:

A.  

Use the class distribution to generate 10% positive examples

B.  

Use a convolutional neural network with max pooling and softmax activation

C.  

Downsample the data with upweighting to create a sample with 10% positive examples

D.  

Remove negative examples until the numbers of positive and negative examples are equal

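For option C above, the sketch below shows downsampling with upweighting in pandas: keep roughly 10% of the negative readings so positives make up about 10% of the sample, and give the kept negatives a compensating example weight. The file name, column names, and the exact factor are hypothetical assumptions for illustration.

import pandas as pd

# Hedged sketch: downsample negatives and upweight the rows that are kept.
df = pd.read_csv("sensor_readings.csv")  # assumed file with a binary "failure" label column

positives = df[df["failure"] == 1]
negatives = df[df["failure"] == 0]

downsample_factor = 0.1  # keep ~10% of negatives
negatives_kept = negatives.sample(frac=downsample_factor, random_state=42)

train = pd.concat([positives, negatives_kept]).sample(frac=1, random_state=42)
train["example_weight"] = 1.0
train.loc[train["failure"] == 0, "example_weight"] = 1.0 / downsample_factor  # upweight kept negatives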
Question # 54

You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?

Options:

A.  

Weight pruning

B.  

Dynamic range quantization

C.  

Model distillation

D.  

Dimensionality reduction

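For option B above, the sketch below applies post-training dynamic range quantization with the TFLite converter, which shrinks the model and usually reduces on-device inference latency without retraining. The SavedModel directory and output file name are hypothetical placeholders.

import tensorflow as tf

# Hedged sketch: dynamic range quantization of an exported SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model")  # assumed local SavedModel directory
converter.optimizations = [tf.lite.Optimize.DEFAULT]                    # enables dynamic range quantization
tflite_model = converter.convert()

with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)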
Question # 55

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.  

Vertex AI Pipelines and App Engine

B.  

Vertex AI Pipelines and AI Platform Prediction

C.  

Cloud Composer, BigQuery ML, and AI Platform Prediction

D.  

Cloud Composer, AI Platform Training with custom containers, and App Engine

Question # 56

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.  

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.  

Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.  

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources

D.  

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Question # 57

You need to design an architecture that serves asynchronous predictions to determine whether a particular mission-critical machine part will fail. Your system collects data from multiple sensors from the machine. You want to build a model that will predict a failure in the next N minutes, given the average of each sensor’s data from the past 12 hours. How should you design the architecture?

Options:

A.  

1. HTTP requests are sent by the sensors to your ML model, which is deployed as a microservice and exposes a REST API for prediction

2. Your application queries a Vertex AI endpoint where you deployed your model.

3. Responses are received by the caller application as soon as the model produces the prediction.

B.  

1. Events are sent by the sensors to Pub/Sub, consumed in real time, and processed by a Dataflow stream processing pipeline.

2. The pipeline invokes the model for prediction and sends the predictions to another Pub/Sub topic.

3. Pub/Sub messages containing predictions are then consumed by a downstream system for monitoring.

C.  

1. Export your data to Cloud Storage using Dataflow.

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into Cloud SQL.

D.  

1. Export the data to Cloud Storage using the BigQuery command-line tool

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into BigQuery.

Question # 58

Your company manages a video sharing website where users can watch and upload videos. You need to create an ML model to predict which newly uploaded videos will be the most popular so that those videos can be prioritized on your company's website. Which result should you use to determine whether the model is successful?

Options:

A.  

The model predicts videos as popular if the user who uploads them has over 10,000 likes.

B.  

The model predicts 97.5% of the most popular clickbait videos measured by number of clicks.

C.  

The model predicts 95% of the most popular videos measured by watch time within 30 days of being uploaded.

D.  

The Pearson correlation coefficient between the log-transformed number of views after 7 days and 30 days after publication is equal to 0.

Question # 59

You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?

Options:

A.  

Store image files in Cloud Storage and access them directly.

B.  

Store image files in Cloud Storage and access them by using serialized records.

C.  

Store image files in Cloud Filestore, and access them by using serialized records.

D.  

Store image files in Cloud Filestore and access them directly by using an NFS mount point.

Question # 60

You work on a data science team at a bank and are creating an ML model to predict loan default risk. You have collected and cleaned hundreds of millions of records worth of training data in a BigQuery table, and you now want to develop and compare multiple models on this data using TensorFlow and Vertex AI. You want to minimize any bottlenecks during the data ingestion state while considering scalability. What should you do?

Options:

A.  

Use the BigQuery client library to load data into a dataframe, and use tf.data.Dataset.from_tensor_slices() to read it.

B.  

Export data to CSV files in Cloud Storage, and use tf.data.TextLineDataset() to read them.

C.  

Convert the data into TFRecords, and use tf.data.TFRecordDataset() to read them.

D.  

Use TensorFlow I/O’s BigQuery Reader to directly read the data.

Discussion 0
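For option D of Question # 60, the TensorFlow I/O BigQuery reader streams rows straight from BigQuery into a tf.data pipeline. A rough sketch with hypothetical project, dataset, and column names; the exact read_session arguments vary across tensorflow-io versions, so treat this as illustrative only:

import tensorflow as tf
from tensorflow_io.bigquery import BigQueryClient

PROJECT_ID = "my-project"        # hypothetical
DATASET_ID = "risk"              # hypothetical
TABLE_ID = "loan_training_data"  # hypothetical

client = BigQueryClient()
read_session = client.read_session(
    f"projects/{PROJECT_ID}",
    PROJECT_ID,
    TABLE_ID,
    DATASET_ID,
    selected_fields=["loan_amount", "credit_score", "defaulted"],
    output_types=[tf.float64, tf.int64, tf.int64],
    requested_streams=4,
)

# parallel_read_rows() returns a tf.data.Dataset of feature dictionaries.
dataset = read_session.parallel_read_rows().batch(1024).prefetch(tf.data.AUTOTUNE)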
Question # 61

You work as an ML engineer at a social media company, and you are developing a visual filter for users’ profile photos. This requires you to train an ML model to detect bounding boxes around human faces. You want to use this filter in your company’s iOS-based mobile phone application. You want to minimize code development and want the model to be optimized for inference on mobile phones. What should you do?

Options:

A.  

Train a model using AutoML Vision and use the “export for Core ML” option.

B.  

Train a model using AutoML Vision and use the “export for Coral” option.

C.  

Train a model using AutoML Vision and use the “export for TensorFlow.js” option.

D.  

Train a custom TensorFlow model and convert it to TensorFlow Lite (TFLite).

Discussion 0
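To illustrate the custom route in option D of Question # 61, converting a trained TensorFlow SavedModel to TensorFlow Lite takes only a few lines. The SavedModel path is hypothetical:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("face_detector/saved_model")  # hypothetical path
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training optimization
tflite_model = converter.convert()

with open("face_detector.tflite", "wb") as f:
    f.write(tflite_model)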
Question # 62

You created an ML pipeline with multiple input parameters. You want to investigate the tradeoffs between different parameter combinations. The parameter options are

• input dataset

• Max tree depth of the boosted tree regressor

• Optimizer learning rate

You need to compare the pipeline performance of the different parameter combinations measured in F1 score, time to train and model complexity. You want your approach to be reproducible and track all pipeline runs on the same platform. What should you do?

Options:

A.  

1. Use BigQuery ML to create a boosted tree regressor and use the hyperparameter tuning capability.

2. Configure the hyperparameter tuning syntax to select different input datasets, max tree depths, and optimizer learning rates. Choose the grid search option.

B.  

1. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

2. In the custom training step, use the Bayesian optimization method with F1 score as the target to maximize.

C.  

1. Create a Vertex AI Workbench notebook for each of the different input datasets.

2. In each notebook, run different local training jobs with different combinations of the max tree depth and optimizer learning rate parameters.

3. After each notebook finishes, append the results to a BigQuery table.

D.  

1. Create an experiment in Vertex AI Experiments.

2. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

3. Submit multiple runs to the same experiment using different values for the parameters.

Discussion 0
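A minimal sketch of the tracking part of option D in Question # 62: each parameter combination is logged as a run in one Vertex AI experiment so the pipeline results stay comparable on a single platform. Project, experiment, and parameter names are hypothetical, and the metric values are placeholders:

from google.cloud import aiplatform

aiplatform.init(
    project="my-project",              # hypothetical
    location="us-central1",
    experiment="boosted-tree-tuning",  # hypothetical experiment name
)

parameter_sets = [
    ("run-depth6-lr01", {"dataset": "sales_2023", "max_tree_depth": 6, "learning_rate": 0.1}),
    ("run-depth8-lr05", {"dataset": "sales_2023", "max_tree_depth": 8, "learning_rate": 0.05}),
]

for run_name, params in parameter_sets:
    aiplatform.start_run(run_name)
    aiplatform.log_params(params)
    # ... submit the Vertex AI pipeline with these parameter values here ...
    aiplatform.log_metrics({"f1_score": 0.0, "train_time_seconds": 0.0})  # placeholder metrics
    aiplatform.end_run()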
Question # 63

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while assuring that the performance is uniform across the various languages and without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

Options:

A.  

Add a regularization term such as the Min-Diff algorithm to the loss function.

B.  

Train a classifier using the chat messages in their original language.

C.  

Replace the in-house word2vec with GPT-3 or T5.

D.  

Remove moderation for languages for which the false positive rate is too high.

Discussion 0
Question # 64

Your data science team needs to rapidly experiment with various features, model architectures, and hyperparameters. They need to track the accuracy metrics for various experiments and use an API to query the metrics over time. What should they use to track and report their experiments while minimizing manual effort?

Options:

A.  

Use Kubeflow Pipelines to execute the experiments. Export the metrics file, and query the results using the Kubeflow Pipelines API.

B.  

Use AI Platform Training to execute the experiments. Write the accuracy metrics to BigQuery, and query the results using the BigQuery API.

C.  

Use AI Platform Training to execute the experiments. Write the accuracy metrics to Cloud Monitoring, and query the results using the Monitoring API.

D.  

Use AI Platform Notebooks to execute the experiments. Collect the results in a shared Google Sheets file, and query the results using the Google Sheets API.

Discussion 0
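For option A of Question # 64, the modern Kubeflow Pipelines (v2 SDK) equivalent of the exported metrics file is a Metrics output artifact that the KFP API and UI can later query. A minimal sketch with a placeholder accuracy value:

from kfp import dsl
from kfp.dsl import Metrics, Output


@dsl.component(base_image="python:3.10")
def train_and_report(metrics: Output[Metrics]):
    # ... run the experiment here ...
    accuracy = 0.93  # placeholder result
    metrics.log_metric("accuracy", accuracy)


@dsl.pipeline(name="experiment-tracking-sketch")
def experiment_pipeline():
    train_and_report()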
Question # 65

You are an ML engineer at a manufacturing company. You need to build a model that identifies defects in products based on images of the product taken at the end of the assembly line. You want your model to preprocess the images with lower computation to quickly extract features of defects in products. Which approach should you use to build the model?

Options:

A.  

Reinforcement learning

B.  

Recommender system

C.  

Recurrent Neural Networks (RNN)

D.  

Convolutional Neural Networks (CNN)

Discussion 0
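As a concrete illustration of option D in Question # 65, a small convolutional network extracts local defect features with modest compute before a simple classification head. A minimal Keras sketch (the input size and layer sizes are arbitrary):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # defect / no defect
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])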
Question # 66

You are implementing a batch inference ML pipeline in Google Cloud. The model was developed using TensorFlow and is stored in SavedModel format in Cloud Storage. You need to apply the model to a historical dataset containing 10 TB of data that is stored in a BigQuery table. How should you perform the inference?

Options:

A.  

Export the historical data to Cloud Storage in Avro format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

B.  

Import the TensorFlow model by using the CREATE MODEL statement in BigQuery ML. Apply the historical data to the TensorFlow model.

C.  

Export the historical data to Cloud Storage in CSV format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

D.  

Configure a Vertex AI batch prediction job to apply the model to the historical data in BigQuery.

Discussion 0
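To make option B of Question # 66 concrete: BigQuery ML can import a TensorFlow SavedModel with CREATE MODEL and then score the table in place with ML.PREDICT. A sketch run through the BigQuery Python client; dataset, table, and bucket names are hypothetical:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

client.query("""
CREATE OR REPLACE MODEL `my_dataset.tf_batch_model`
OPTIONS (model_type = 'TENSORFLOW',
         model_path = 'gs://my-bucket/saved_model/*')
""").result()

predictions = client.query("""
SELECT *
FROM ML.PREDICT(MODEL `my_dataset.tf_batch_model`,
                (SELECT * FROM `my_dataset.historical_data`))
""").result()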
Question # 67

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.  

Vertex AI Pipelines and App Engine

B.  

Vertex AI Pipelines, Vertex AI Prediction, and Vertex AI Model Monitoring

C.  

Cloud Composer, BigQuery ML, and Vertex AI Prediction

D.  

Cloud Composer, Vertex AI Training with custom containers, and App Engine

Discussion 0
Question # 68

You are building a predictive maintenance model to preemptively detect part defects in bridges. You plan to use high definition images of the bridges as model inputs. You need to explain the output of the model to the relevant stakeholders so they can take appropriate action. How should you build the model?

Options:

A.  

Use scikit-learn to build a tree-based model, and use SHAP values to explain the model output.

B.  

Use scikit-learn to build a tree-based model, and use partial dependence plots (PDP) to explain the model output.

C.  

Use TensorFlow to create a deep learning-based model and use Integrated Gradients to explain the model output.

D.  

Use TensorFlow to create a deep learning-based model and use the sampled Shapley method to explain the model output.

Discussion 0
Question # 69

You recently trained an XGBoost model on tabular data. You plan to expose the model for internal use as an HTTP microservice. After deployment, you expect a small number of incoming requests. You want to productionize the model with the least amount of effort and latency. What should you do?

Options:

A.  

Deploy the model to BigQuery ML by using the CREATE MODEL statement with the BOOSTED_TREE_REGRESSOR model type, and invoke the BigQuery API from the microservice.

B.  

Build a Flask-based app. Package the app in a custom container on Vertex AI, and deploy it to Vertex AI Endpoints.

C.  

Build a Flask-based app. Package the app in a Docker image, and deploy it to Google Kubernetes Engine in Autopilot mode.

D.  

Use a prebuilt XGBoost Vertex AI container to create a model and deploy it to Vertex AI Endpoints.

Discussion 0
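A sketch of option D in Question # 69: register the trained XGBoost artifact with a prebuilt Vertex AI prediction container and deploy it to an endpoint. The artifact URI, container tag, and feature vector are hypothetical; check the current list of prebuilt prediction images for your region:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # hypothetical

model = aiplatform.Model.upload(
    display_name="xgboost-tabular",
    artifact_uri="gs://my-bucket/xgboost-model/",  # folder containing the saved model file
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-7:latest"  # example tag
    ),
)

endpoint = model.deploy(machine_type="n1-standard-2")
prediction = endpoint.predict(instances=[[0.4, 12.0, 3.0]])  # hypothetical feature vector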
Question # 70

You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?

Options:

A.  

Significantly increase the max_batch_size TensorFlow Serving parameter

B.  

Switch to the tensorflow-model-server-universal version of TensorFlow Serving

C.  

Significantly increase the max_enqueued_batches TensorFlow Serving parameter

D.  

Recompile TensorFlow Serving from source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes.

Discussion 0
Question # 71

You are the Director of Data Science at a large company, and your Data Science team has recently begun using the Kubeflow Pipelines SDK to orchestrate their training pipelines. Your team is struggling to integrate their custom Python code into the Kubeflow Pipelines SDK. How should you instruct them to proceed in order to quickly integrate their code with the Kubeflow Pipelines SDK?

Options:

A.  

Use the func_to_container_op function to create custom components from the Python code.

B.  

Use the predefined components available in the Kubeflow Pipelines SDK to access Dataproc, and run the custom code there.

C.  

Package the custom Python code into Docker containers, and use the load_component_from_file function to import the containers into the pipeline.

D.  

Deploy the custom Python code to Cloud Functions, and use Kubeflow Pipelines to trigger the Cloud Function.

Discussion 0
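For option A of Question # 71, the Kubeflow Pipelines v1 SDK can wrap an existing Python function as a pipeline component with func_to_container_op. A minimal sketch; the function body and the input path are placeholders:

from kfp import dsl
from kfp.components import func_to_container_op


def normalize_features(input_csv: str) -> str:
    # The team's existing custom Python code would go here; this body is a placeholder.
    return input_csv


normalize_op = func_to_container_op(normalize_features, base_image="python:3.9")


@dsl.pipeline(name="custom-code-sketch")
def pipeline(input_csv: str = "gs://my-bucket/data.csv"):  # hypothetical path
    normalize_op(input_csv=input_csv)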
Question # 72

You need to build an ML model for a social media application to predict whether a user’s submitted profile photo meets the requirements. The application will inform the user if the picture meets the requirements. How should you build a model to ensure that the application does not falsely accept a non-compliant picture?

Options:

A.  

Use AutoML to optimize the model’s recall in order to minimize false negatives.

B.  

Use AutoML to optimize the model’s F1 score in order to balance the accuracy of false positives and false negatives.

C.  

Use Vertex AI Workbench user-managed notebooks to build a custom model that has three times as many examples of pictures that meet the profile photo requirements.

D.  

Use Vertex AI Workbench user-managed notebooks to build a custom model that has three times as many examples of pictures that do not meet the profile photo requirements.

Discussion 0
Question # 73

You recently deployed a model to a Vertex AI endpoint. Your data drifts frequently, so you have enabled request-response logging and created a Vertex AI Model Monitoring job. You have observed that your model is receiving higher traffic than expected. You need to reduce the model monitoring cost while continuing to quickly detect drift. What should you do?

Options:

A.  

Replace the monitoring job with a Dataflow pipeline that uses TensorFlow Data Validation (TFDV).

B.  

Replace the monitoring job with a custom SQL script to calculate statistics on the features and predictions in BigQuery.

C.  

Decrease the sample_rate parameter in the RandomSampleConfig of the monitoring job.

D.  

Increase the monitor_interval parameter in the ScheduleConfig of the monitoring job.

Discussion 0
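To illustrate the knobs referenced in options C and D of Question # 73, the Vertex AI SDK exposes the sampling rate and monitoring interval as small config objects that are passed when the monitoring job is created or updated. A sketch with hypothetical values; how you apply them depends on how the job was originally set up:

from google.cloud.aiplatform import model_monitoring

# Lower the fraction of logged requests that the monitoring job analyzes.
logging_sampling_strategy = model_monitoring.RandomSampleConfig(sample_rate=0.2)

# Or run the analysis less often (the interval is in hours).
schedule_config = model_monitoring.ScheduleConfig(monitor_interval=3)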
Question # 74

You are training an LSTM-based model on AI Platform to summarize text using a job submission script (not shown here).

You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What should you do?

Options:

A.  

Modify the 'epochs' parameter

B.  

Modify the 'scale-tier' parameter

C.  

Modify the 'batch size' parameter

D.  

Modify the 'learning rate' parameter

Discussion 0
Question # 75

You are developing models to classify customer support emails. You created models with TensorFlow Estimators using small datasets on your on-premises system, but you now need to train the models using large datasets to ensure high performance. You will port your models to Google Cloud and want to minimize code refactoring and infrastructure overhead for easier migration from on-prem to cloud. What should you do?

Options:

A.  

Use Vertex AI for distributed training

B.  

Create a cluster on Dataproc for training

C.  

Create a Managed Instance Group with autoscaling

D.  

Use Kubeflow Pipelines to train on a Google Kubernetes Engine cluster.

Discussion 0
Question # 76

Your team frequently creates new ML models and runs experiments. Your team pushes code to a single repository hosted on Cloud Source Repositories. You want to create a continuous integration pipeline that automatically retrains the models whenever there is any modification of the code. What should be your first step to set up the CI pipeline?

Options:

A.  

Configure a Cloud Build trigger with the event set as "Pull Request"

B.  

Configure a Cloud Build trigger with the event set as "Push to a branch"

C.  

Configure a Cloud Function that builds the repository each time there is a code change.

D.  

Configure a Cloud Function that builds the repository each time a new branch is created.

Discussion 0
Question # 77

You are developing an ML model to identify your company's products in images. You have access to over one million images in a Cloud Storage bucket. You plan to experiment with different TensorFlow models by using Vertex AI Training. You need to read images at scale during training while minimizing data I/O bottlenecks. What should you do?

Options:

A.  

Load the images directly into the Vertex AI compute nodes by using Cloud Storage FUSE. Read the images by using the tf.data.Dataset.from_tensor_slices function.

B.  

Create a Vertex AI managed dataset from your image data. Access the aip_training_data_uri environment variable to read the images by using the tf.data.Dataset.list_files function.

C.  

Convert the images to TFRecords and store them in a Cloud Storage bucket. Read the TFRecords by using the tf.data.TFRecordDataset function.

D.  

Store the URLs of the images in a CSV file. Read the file by using the tf.data.experimental.CsvDataset function.

Discussion 0
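A sketch of option C in Question # 77: reading TFRecord shards from Cloud Storage with interleaved parallel reads and prefetching so input I/O keeps up with training. The paths and the record schema are hypothetical:

import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

files = tf.data.Dataset.list_files("gs://my-bucket/tfrecords/train-*.tfrecord")
dataset = files.interleave(
    tf.data.TFRecordDataset,
    cycle_length=16,
    num_parallel_calls=AUTOTUNE,
)


def parse(record):
    features = tf.io.parse_single_example(record, {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    image = tf.image.resize(tf.io.decode_jpeg(features["image"], channels=3), [224, 224])
    return image, features["label"]


dataset = dataset.map(parse, num_parallel_calls=AUTOTUNE).batch(64).prefetch(AUTOTUNE)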
Question # 78

You work for a large retailer and you need to build a model to predict customer churn. The company has a dataset of historical customer data, including customer demographics, purchase history, and website activity. You need to create the model in BigQuery ML and thoroughly evaluate its performance. What should you do?

Options:

A.  

Create a linear regression model in BigQuery ML and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI.

B.  

Create a logistic regression model in BigQuery ML and register the model in Vertex AI Model Registry. Evaluate the model performance in Vertex AI.

C.  

Create a linear regression model in BigQuery ML. Use the ml.evaluate function to evaluate the model performance.

D.  

Create a logistic regression model in BigQuery ML. Use the ml.confusion_matrix function to evaluate the model performance.

Discussion 0
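To show the BigQuery ML evaluation functions mentioned in options C and D of Question # 78 in context, here is a sketch that trains a logistic regression churn model and inspects ML.EVALUATE and ML.CONFUSION_MATRIX through the BigQuery Python client. Dataset, table, and column names are hypothetical:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical

client.query("""
CREATE OR REPLACE MODEL `retail.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT * FROM `retail.customer_history`
""").result()

evaluation = client.query(
    "SELECT * FROM ML.EVALUATE(MODEL `retail.churn_model`)").result()
confusion = client.query(
    "SELECT * FROM ML.CONFUSION_MATRIX(MODEL `retail.churn_model`)").result()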
Question # 79

You work for a food product company. Your company's historical sales data is stored in BigQuery. You need to use Vertex AI's custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?

Options:

A.  

Write the transformations in Spark using the spark-bigquery-connector, and use Dataproc to preprocess the data.

B.  

Write SQL queries to transform the data in-place in BigQuery.

C.  

Add the transformations as a preprocessing layer in the TensorFlow models.

D.  

Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.

Discussion 0
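As one concrete reading of option C in Question # 79, min-max scaling and bucketing can be expressed as Keras preprocessing layers inside the model itself. A sketch for a single numeric feature; the feature name, value range, and bin boundaries are hypothetical and would normally be derived from the training data:

import tensorflow as tf

price_min, price_max = 0.0, 500.0  # hypothetical feature range

inputs = tf.keras.Input(shape=(1,), name="unit_price")
scaled = tf.keras.layers.Rescaling(
    scale=1.0 / (price_max - price_min),
    offset=-price_min / (price_max - price_min),
)(inputs)  # min-max scaling to [0, 1]
bucketed = tf.keras.layers.Discretization(bin_boundaries=[0.25, 0.5, 0.75])(scaled)
encoded = tf.keras.layers.CategoryEncoding(num_tokens=4, output_mode="one_hot")(bucketed)
outputs = tf.keras.layers.Dense(1)(encoded)

model = tf.keras.Model(inputs, outputs)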