Data-Engineer-Associate Practice Questions

AWS Certified Data Engineer - Associate (DEA-C01)

Last Update 4 days ago
Total Questions : 302

Dive into our fully updated and stable Data-Engineer-Associate practice test platform, featuring all the latest AWS Certified Data Engineer exam questions added this week. Our preparation tool is more than just an Amazon Web Services study aid; it's a strategic advantage.

Our free AWS Certified Data Engineer practice questions are crafted to reflect the domains and difficulty of the actual exam. The detailed rationales explain the 'why' behind each answer, reinforcing key concepts about Data-Engineer-Associate. Use this test to pinpoint which areas you need to focus your study on.

Question # 1

A company has an application that uses a microservice architecture. The company hosts the application on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.

The company wants to set up a robust monitoring system for the application. The company needs to analyze the logs from the EKS cluster and the application. The company needs to correlate the cluster's logs with the application's traces to identify points of failure in the whole application request flow.

Which combination of steps will meet these requirements with the LEAST development effort? (Select TWO.)

Options:

A. Use FluentBit to collect logs. Use OpenTelemetry to collect traces.

B. Use Amazon CloudWatch to collect logs. Use Amazon Kinesis to collect traces.

C. Use Amazon CloudWatch to collect logs. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to collect traces.

D. Use Amazon OpenSearch to correlate the logs and traces.

E. Use AWS Glue to correlate the logs and traces.

Question # 2

A company wants to migrate an application and an on-premises Apache Kafka server to AWS. The application processes incremental updates that an on-premises Oracle database sends to the Kafka server. The company wants to use the replatform migration strategy instead of the refactor strategy.

Which solution will meet these requirements with the LEAST management overhead?

Options:

A. Amazon Kinesis Data Streams

B. Amazon Managed Streaming for Apache Kafka (Amazon MSK) provisioned cluster

C. Amazon Data Firehose

D. Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless

Question # 3

A retail company is developing a data lake solution on Amazon S3 to analyze historical sales data. The solution needs to support frequent schema changes as new product attributes are added. The company must also be able to query point-in-time historical data snapshots for compliance reporting. The solution must provide atomicity, consistency, isolation, and durability (ACID) transaction guarantees for concurrent write operations.

Which solution will meet these requirements?

Options:

A. Create an AWS Glue Data Catalog table that uses CSV format. Schedule AWS Glue extract, transform, and load (ETL) jobs to transform the data into Parquet format and partition by date.

B. Create an AWS Glue Data Catalog table that uses Apache Iceberg table format. Set the format version to 2. Configure time travel retention policies in the table properties.

C. Enable Amazon S3 Versioning on the company's S3 bucket. Create an AWS Glue crawler to catalog the data. Use AWS Glue extract, transform, and load (ETL) jobs to read specific S3 version IDs.

D. Store the data in Amazon DynamoDB with a composite primary key that includes a timestamp. Use Amazon DynamoDB Streams to capture changes and replicate to Amazon S3 in Parquet format.

Question # 4

Files from multiple data sources arrive in an Amazon S3 bucket on a regular basis. A data engineer wants to ingest new files into Amazon Redshift in near real time when the new files arrive in the S3 bucket.

Which solution will meet these requirements?

Options:

A. Use the query editor v2 to schedule a COPY command to load new files into Amazon Redshift.

B. Use the zero-ETL integration between Amazon Aurora and Amazon Redshift to load new files into Amazon Redshift.

C. Use AWS Glue job bookmarks to extract, transform, and load (ETL) new files into Amazon Redshift.

D. Use S3 Event Notifications to invoke an AWS Lambda function that loads new files into Amazon Redshift.
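
To make the event-driven option concrete, here is a minimal sketch of what a Lambda handler behind an S3 Event Notification might do: read the bucket and key from each event record and build a Redshift COPY statement for that file. The table name, IAM role ARN, and CSV options are hypothetical placeholders, not values from the question, and the sketch stops short of submitting the SQL to a cluster.

```python
from urllib.parse import unquote_plus

# Hypothetical placeholders; none of these values come from the question.
TARGET_TABLE = "staging.incoming_files"
IAM_ROLE_ARN = "arn:aws:iam::123456789012:role/example-redshift-copy-role"


def build_copy_statements(event: dict) -> list[str]:
    """Build one COPY statement per S3 object listed in the event."""
    statements = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Note: a key containing a single quote would need escaping first.
        statements.append(
            f"COPY {TARGET_TABLE} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{IAM_ROLE_ARN}' FORMAT AS CSV IGNOREHEADER 1;"
        )
    return statements


def lambda_handler(event, context):
    statements = build_copy_statements(event)
    # A real function would submit each statement to Redshift at this point
    # (for example through the Redshift Data API); that call is omitted here.
    return {"statements": statements}
```

Because the function runs once per notification, each new file is loaded shortly after it lands rather than on a fixed schedule.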

Question # 5

A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of each S3 object is less than 100 MB.

Which solution will meet these requirements MOST cost-effectively?

Options:

A. Write a custom Python application. Host the application on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.

B. Write a PySpark ETL script. Host the script on an Amazon EMR cluster.

C. Write an AWS Glue PySpark job. Use Apache Spark to transform the data.

D. Write an AWS Glue Python shell job. Use pandas to transform the data.

Question # 6

A company uses an Amazon Redshift provisioned cluster as its database. The Redshift cluster has five reserved ra3.4xlarge nodes and uses key distribution.

A data engineer notices that one of the nodes frequently has a CPU load over 90%. SQL queries that run on the node are queued. The other four nodes usually have a CPU load under 15% during daily operations.

The data engineer wants to maintain the current number of compute nodes. The data engineer also wants to balance the load more evenly across all five compute nodes.

Which solution will meet these requirements?

Options:

A. Change the sort key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.

B. Change the distribution key to the table column that has the largest dimension.

C. Upgrade the reserved node from ra3.4xlarge to ra3.16xlarge.

D. Change the primary key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.
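
The scenario describes data skew under key distribution: rows are placed on nodes according to a hash of the distribution key, so a column dominated by one value sends most rows to one node. The sketch below simulates that effect. It uses CRC32 modulo the node count as a stand-in hash, which is an assumption for illustration and not Redshift's actual hash function, and the column values are made up.

```python
import zlib
from collections import Counter

NODES = 5  # matches the five compute nodes in the scenario


def node_for(value: str) -> int:
    """Map a distribution-key value to a node (stand-in hash, not Redshift's)."""
    return zlib.crc32(value.encode("utf-8")) % NODES


def max_share(values: list[str]) -> float:
    """Return the fraction of rows that land on the busiest node."""
    counts = Counter(node_for(v) for v in values)
    return max(counts.values()) / len(values)


# Hypothetical low-cardinality key: 90% of rows share a single value.
skewed_key = ["US"] * 9000 + [f"country-{i}" for i in range(1000)]
# Hypothetical high-cardinality key: every row has a distinct value.
even_key = [f"order-{i}" for i in range(10000)]

print(f"busiest node share, skewed key: {max_share(skewed_key):.2f}")
print(f"busiest node share, even key:   {max_share(even_key):.2f}")
```

With the skewed key, one node holds at least 90% of the rows no matter how the hash behaves, which mirrors the one hot node in the question; the high-cardinality key spreads rows far more evenly.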

Question # 7

A company is uploading log files from on-premises servers to an Amazon S3 bucket. The company needs to validate that the logs from the on-premises servers are the same as the logs that are stored in the S3 bucket.

Which solution will meet this requirement?

Options:

A. Use the AWS SDK to automatically compute CRC32 checksums during the upload. Store the checksums in S3 object metadata.

B. Create an AWS Lambda function to calculate SHA-256 checksums. Store the results in a separate metadata table. Validate the logs after the upload.

C. Enable S3 Object Lock in compliance mode on the S3 bucket. Upload the objects to the bucket.

D. After uploading the objects to the S3 bucket, enable S3 Object Lock in governance mode on the S3 objects.
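
Several of these options rest on checksum comparison, so a small sketch of the idea may help: compute a CRC32 over the local file's bytes and compare it with a checksum computed over the stored copy. The example below uses only the Python standard library and encodes the value as base64 over the 4-byte big-endian CRC, which is the form S3 uses for CRC32 checksums. The sample log lines are invented, and no AWS call is made.

```python
import base64
import zlib
from typing import Iterable


def crc32_base64(chunks: Iterable[bytes]) -> str:
    """Compute a CRC32 over a stream of byte chunks, base64-encoded."""
    checksum = 0
    for chunk in chunks:
        # zlib.crc32 accepts a running value, so large files can be streamed.
        checksum = zlib.crc32(chunk, checksum)
    return base64.b64encode(checksum.to_bytes(4, "big")).decode("ascii")


# Hypothetical log content standing in for an on-premises file.
local_log = [b"2024-01-01 INFO start\n", b"2024-01-01 INFO done\n"]
# The same bytes as they would be read back after upload.
stored_copy = [b"".join(local_log)]
# A copy that was altered in transit.
corrupted_copy = [b"2024-01-01 INFO start\n", b"2024-01-01 INFO d0ne\n"]

print(crc32_base64(local_log) == crc32_base64(stored_copy))
print(crc32_base64(local_log) == crc32_base64(corrupted_copy))
```

Matching checksums indicate that the uploaded bytes are identical to the source; a mismatch flags the object for re-upload.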

Question # 8

A data engineer is implementing model governance for machine learning (ML) workflows on AWS. The data engineer needs a solution that can track the complete lifecycle of the ML models, including data preparation, model training, and deployment stages. The solution must ensure reproducibility and audit compliance.

Options:

A. Use Amazon SageMaker Debugger to capture metrics. Create associations between datasets and training jobs by monitoring training jobs.

B. Use Amazon SageMaker ML Lineage Tracking to create associations between artifacts, training jobs, and datasets by recording metadata.

C. Use Amazon SageMaker Model Monitor to create associations between artifacts and training jobs by tracking model performance.

D. Use Amazon SageMaker Experiments to create associations between datasets and artifacts by tracking hyperparameters and metrics.

Question # 9

A company wants to migrate data from an Amazon RDS for PostgreSQL DB instance in the eu-east-1 Region of an AWS account named Account_A. The company will migrate the data to an Amazon Redshift cluster in the eu-west-1 Region of an AWS account named Account_B.

Which solution will give AWS Database Migration Service (AWS DMS) the ability to replicate data between two data stores?

Options:

A. Set up an AWS DMS replication instance in Account_B in eu-west-1.

B. Set up an AWS DMS replication instance in Account_B in eu-east-1.

C. Set up an AWS DMS replication instance in a new AWS account in eu-west-1.

D. Set up an AWS DMS replication instance in Account_A in eu-east-1.

Question # 10

A university is developing an educational application that analyzes student essays. The application provides personalized feedback with accurate citations to the university's textbooks. The application needs to process essays in multiple languages. Application responses must include direct references to specific sections in the course materials and must be in the student's selected language.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A. Build a custom vector database by using Amazon OpenSearch Serverless. Store textbook content as multilingual embeddings. Create an AWS Lambda function that queries the database when generating responses with Amazon Bedrock.

B. Create a knowledge base in Amazon Bedrock Knowledge Bases with the university's textbooks. Configure a multilingual model to generate responses with source citations.

C. Use Amazon Comprehend to detect the language and key topics in the essays. Use Amazon Kendra to search for relevant textbook passages. Create an AWS Lambda function that formats the textbook passages into feedback.

D. Use Amazon SageMaker to host a custom-trained large language model (LLM) that has been fine-tuned on the university's textbooks to generate personalized feedback with citations.
