Data-Engineer-Associate Practice Questions

AWS Certified Data Engineer - Associate (DEA-C01)

Last Update 4 days ago
Total Questions : 302

Dive into our fully updated and stable Data-Engineer-Associate practice test platform, featuring all the latest AWS Certified Data Engineer exam questions added this week. Our preparation tool is more than just an Amazon Web Services study aid; it's a strategic advantage.

Our free AWS Certified Data Engineer practice questions are crafted to reflect the domains and difficulty of the actual exam. The detailed rationales explain the 'why' behind each answer, reinforcing key Data-Engineer-Associate concepts. Use this test to pinpoint the areas where you need to focus your study.

Question # 71

A data engineer must implement Amazon Redshift Serverless as a data warehouse for a company. The data engineer needs to integrate multiple Amazon Aurora MySQL databases into Amazon Redshift. The solution must maintain near real-time latency and minimize infrastructure management as much as possible.

Which solution will meet these requirements?

Options:

A.  

Use AWS Database Migration Service (AWS DMS) Serverless to ingest data into Amazon Redshift.

B.  

Create a Python module for an AWS Glue job to standardize the data ingestion from Aurora MySQL into Amazon Redshift.

C.  

Create an AWS Lambda function to ingest data into Amazon Redshift.

D.  

Set up a zero-ETL integration between the Aurora MySQL databases and Amazon Redshift Serverless.

Question # 72

A company hosts its applications on Amazon EC2 instances. The company must use SSL/TLS connections that encrypt data in transit to communicate securely with AWS infrastructure that is managed by a customer.

A data engineer needs to implement a solution to simplify the generation, distribution, and rotation of digital certificates. The solution must automatically renew and deploy SSL/TLS certificates.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.  

Store self-managed certificates on the EC2 instances.

B.  

Use AWS Certificate Manager (ACM).

C.  

Implement custom automation scripts in AWS Secrets Manager.

D.  

Use Amazon Elastic Container Service (Amazon ECS) Service Connect.

Question # 73

An ecommerce company processes millions of orders each day. The company uses AWS Glue ETL to collect data from multiple sources, clean the data, and store the data in an Amazon S3 bucket in CSV format by using the S3 Standard storage class. The company uses the stored data to conduct daily analysis.

The company wants to optimize costs for data storage and retrieval.

Which solution will meet this requirement?

Options:

A.  

Transition the data to Amazon S3 Glacier Flexible Retrieval.

B.  

Transition the data from Amazon S3 to an Amazon Aurora cluster.

C.  

Configure AWS Glue ETL to transform the incoming data to Apache Parquet format.

D.  

Configure AWS Glue ETL to use Amazon EMR to process incoming data in parallel.

Question # 74

A company saves customer data to an Amazon S3 bucket. The company uses server-side encryption with AWS KMS keys (SSE-KMS) to encrypt the bucket. The dataset includes personally identifiable information (PII) such as social security numbers and account details.

Data that is tagged as PII must be masked before the company uses customer data for analysis. Some users must have secure access to the PII data during the preprocessing phase. The company needs a low-maintenance solution to mask and secure the PII data throughout the entire engineering pipeline.

Which combination of solutions will meet these requirements? (Select TWO.)

Options:

A.  

Use AWS Glue DataBrew to perform extract, transform, and load (ETL) tasks that mask the PII data before analysis.

B.  

Use Amazon GuardDuty to monitor access patterns for the PII data that is used in the engineering pipeline.

C.  

Configure an Amazon Macie discovery job for the S3 bucket.

D.  

Use AWS Identity and Access Management (IAM) to manage permissions and to control access to the PII data.

E.  

Write custom scripts in an application to mask the PII data and to control access.

Question # 75

A company is developing machine learning (ML) models. A data engineer needs to apply data quality rules to training data. The company stores the training data in an Amazon S3 bucket.

Options:

A.  

Create an AWS Lambda function to check data quality and to raise exceptions in the code.

B.  

Create an AWS Glue DataBrew project for the data in the S3 bucket. Create a ruleset for the data quality rules. Create a profile job to run the data quality rules. Use Amazon EventBridge to run the profile job when data is added to the S3 bucket.

C.  

Create an Amazon EMR provisioned cluster. Add a Python data quality package.

D.  

Create AWS Lambda functions to evaluate data quality rules and orchestrate with AWS Step Functions.

Question # 76

A data engineer configured an AWS Glue Data Catalog for data that is stored in Amazon S3 buckets. The data engineer needs to configure the Data Catalog to receive incremental updates.

The data engineer sets up event notifications for the S3 bucket and creates an Amazon Simple Queue Service (Amazon SQS) queue to receive the S3 events.

Which combination of steps should the data engineer take to meet these requirements with LEAST operational overhead? (Select TWO.)

Options:

A.  

Create an S3 event-based AWS Glue crawler to consume events from the SQS queue.

B.  

Define a time-based schedule to run the AWS Glue crawler, and perform incremental updates to the Data Catalog.

C.  

Use an AWS Lambda function to directly update the Data Catalog based on S3 events that the SQS queue receives.

D.  

Manually initiate the AWS Glue crawler to perform updates to the Data Catalog when there is a change in the S3 bucket.

E.  

Use AWS Step Functions to orchestrate the process of updating the Data Catalog based on S3 events that the SQS queue receives.
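
For readers unfamiliar with what the SQS queue in this scenario actually receives, the following is a minimal Python sketch of reading an S3 event notification out of one SQS message body. It is not tied to any of the options above. The function name, bucket name, and object key are hypothetical; the `Records`, `eventName`, and `s3.bucket.name` / `s3.object.key` fields follow the S3 event notification message structure.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str) -> list[tuple[str, str, str]]:
    """Return (event_name, bucket, key) for each S3 record in one SQS message body.

    Hypothetical helper for illustration only.
    """
    payload = json.loads(body)
    results = []
    # Default to an empty list so a body without "Records" yields no results.
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 notifications (a space arrives as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append((record["eventName"], bucket, key))
    return results


# Sample body with made-up bucket and key values.
sample_body = json.dumps(
    {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": "example-bucket"},
                    "object": {"key": "sales/year%3D2024/new+file.csv"},
                },
            }
        ]
    }
)

print(s3_objects_from_sqs_body(sample_body))
```

Each tuple identifies one changed object, which is the granularity at which an incremental catalog update would operate.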

Question # 77

A global ecommerce company processes customer transactions, inventory updates, and user activity logs across multiple AWS services. The company needs a scalable, fully managed, and event-driven orchestration solution to coordinate complex extract, transform, and load (ETL) workflows. The solution must use AWS Glue and Amazon EMR to process data. The data will be stored in Amazon Redshift and Amazon S3. The solution must support dependency management, automated retries, and data pipeline monitoring.

Which solution will meet these requirements?

Options:

A.  

Use AWS Step Functions to define an express workflow that invokes the data transformation and loading tasks across Amazon EMR and AWS Glue.

B.  

Create AWS Lambda functions for each step of the workflow. Configure Amazon EventBridge to invoke AWS Glue jobs. Configure the Lambda functions to process and move data through the pipeline.

C.  

Use Apache Airflow on Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to create Directed Acyclic Graphs (DAGs) to manage ETL workflows.

D.  

Create an AWS Lambda function that runs each step of the workflow. Create an Amazon EventBridge scheduled rule to invoke the function every day.

Question # 78

A company wants to use Apache Spark jobs that run on an Amazon EMR cluster to process streaming data. The Spark jobs will transform and store the data in an Amazon S3 bucket. The company will use Amazon Athena to perform analysis.

The company needs to optimize the data format for analytical queries.

Which solutions will meet these requirements with the SHORTEST query times? (Select TWO.)

Options:

A.  

Use Avro format. Use AWS Glue Data Catalog to track schema changes.

B.  

Use ORC format. Use AWS Glue Data Catalog to track schema changes.

C.  

Use Apache Parquet format. Use an external Amazon DynamoDB table to track schema changes.

D.  

Use Apache Parquet format. Use AWS Glue Data Catalog to track schema changes.

E.  

Use ORC format. Store schema definitions in separate files in Amazon S3.

Question # 79

A data engineer is launching an Amazon EMR cluster. The data that the data engineer needs to load into the new cluster is currently in an Amazon S3 bucket. The data engineer needs to ensure that data is encrypted both at rest and in transit.

The data that is in the S3 bucket is encrypted by an AWS Key Management Service (AWS KMS) key. The data engineer has an Amazon S3 path that has a Privacy Enhanced Mail (PEM) file.

Which solution will meet these requirements?

Options:

A.  

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Create a second security configuration. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach both security configurations to the cluster.

B.  

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for local disk encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.

C.  

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.

D.  

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach the security configuration to the cluster.
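
As background for the terms used in these options, the sketch below builds, in Python, one shape an Amazon EMR security configuration document could take when it carries both at-rest settings for S3 and in-transit settings backed by PEM certificates. The field names are written from memory of the EMR security configuration JSON format and should be checked against the current documentation. The function name, KMS key ARN, and S3 path are hypothetical placeholders.

```python
import json


def build_security_configuration(kms_key_arn: str, pem_s3_path: str) -> dict:
    """Assemble an EMR security configuration document (illustrative sketch)."""
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # Covers data that EMRFS reads from and writes to Amazon S3.
                "S3EncryptionConfiguration": {
                    "EncryptionMode": "SSE-KMS",
                    "AwsKmsKey": kms_key_arn,
                }
            },
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    # S3 location of the certificate artifact.
                    "S3Object": pem_s3_path,
                }
            },
        }
    }


# Placeholder values; not taken from the question.
config = build_security_configuration(
    "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
    "s3://example-bucket/certs/my-certs.zip",
)
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of document that is supplied when a security configuration is created; the configuration is then referenced by name when the cluster is launched.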

Question # 80

A company plans to use Amazon Kinesis Data Firehose to store data in Amazon S3. The source data consists of 2 MB .csv files. The company must convert the .csv files to JSON format. The company must store the files in Apache Parquet format.

Which solution will meet these requirements with the LEAST development effort?

Options:

A.  

Use Kinesis Data Firehose to convert the .csv files to JSON. Use an AWS Lambda function to store the files in Parquet format.

B.  

Use Kinesis Data Firehose to convert the .csv files to JSON and to store the files in Parquet format.

C.  

Use Kinesis Data Firehose to invoke an AWS Lambda function that transforms the .csv files to JSON and stores the files in Parquet format.

D.  

Use Kinesis Data Firehose to invoke an AWS Lambda function that transforms the .csv files to JSON. Use Kinesis Data Firehose to store the files in Parquet format.
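
Several options above mention a Lambda function that transforms .csv data to JSON. As an illustration of what such a function involves, here is a minimal Python sketch of a Firehose data-transformation handler. It assumes each incoming record holds CSV text with the hypothetical columns `order_id`, `customer_id`, and `amount`, which are not from the question. The `records`, `recordId`, `data`, and `result` fields follow the Firehose transformation request and response contract.

```python
import base64
import csv
import io
import json

# Hypothetical column names; the question does not describe the CSV schema.
COLUMNS = ["order_id", "customer_id", "amount"]


def csv_to_json_lines(csv_text: str) -> str:
    """Convert CSV rows to newline-delimited JSON objects keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(csv_text))
    return "".join(json.dumps(dict(zip(COLUMNS, row))) + "\n" for row in reader if row)


def lambda_handler(event, context):
    """Transform each Firehose record from CSV text to JSON lines."""
    output = []
    for record in event["records"]:
        try:
            text = base64.b64decode(record["data"]).decode("utf-8")
            data = base64.b64encode(csv_to_json_lines(text).encode("utf-8")).decode("ascii")
            result = "Ok"
        except ValueError:
            # Covers invalid base64 and invalid UTF-8; return the record unchanged.
            data, result = record["data"], "ProcessingFailed"
        output.append({"recordId": record["recordId"], "result": result, "data": data})
    return {"records": output}


# Local usage example with a made-up record.
sample_event = {
    "records": [
        {
            "recordId": "rec-1",
            "data": base64.b64encode(b"1,42,9.99\n2,43,5.00\n").decode("ascii"),
        }
    ]
}
response = lambda_handler(sample_event, None)
print(base64.b64decode(response["records"][0]["data"]).decode("utf-8"))
```

All values stay strings because the `csv` module does not infer types; a real pipeline would cast them to match the target schema.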
