On this site, you can take the practice test first to gauge your current strength, and you can also download the PDF version to study offline. In addition, we provide the complete set of Amazon das-c01 exam questions and answers at https://www.leads4pass.com/das-c01.html. The complete exam questions are verified by Amazon AWS Certified Specialty experts to ensure that every question and answer is valid. Below, I share a selection of exam practice questions.
Amazon das-c01 free exam PDF download online
Google Drive: https://drive.google.com/file/d/1UGIiWRMaMCKbj5oE9zch0yZwX-Hk8zsv/view?usp=sharing
Amazon das-c01 exam practice test
All answers are provided at the end of the article
QUESTION 1
A company has a business unit uploading .csv files to an Amazon S3 bucket. The company's data platform team has
set up an AWS Glue crawler to do discovery, and create tables and schemas. An AWS Glue job writes processed data
from the created tables to an Amazon Redshift database. The AWS Glue job handles column mapping and creating the
Amazon Redshift table appropriately. When the AWS Glue job is rerun for any reason in a day, duplicate records are
introduced into the Amazon Redshift table.
Which solution will update the Redshift table without duplicates when jobs are rerun?
A. Modify the AWS Glue job to copy the rows into a staging table. Add SQL commands to replace the existing rows in
the main table as postactions in the DynamicFrameWriter class.
B. Load the previously inserted data into a MySQL database in the AWS Glue job. Perform an upsert operation in
MySQL, and copy the results to the Amazon Redshift table.
C. Use Apache Spark's DataFrame dropDuplicates() API to eliminate duplicates and then write the data to Amazon
Redshift.
D. Use the AWS Glue ResolveChoice built-in transform to select the most recent value of the column.
Reference: https://towardsdatascience.com/update-and-insert-upsert-data-from-aws-glue-698ac582e562
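The reference above walks through the staging-table upsert pattern mentioned in option A: load the processed rows into a staging table, then replace the matching rows in the main table with SQL that runs as postactions. A minimal PySpark sketch of that pattern, intended to run inside a Glue job, with hypothetical table names (public.events and public.events_stage), a hypothetical Glue connection called redshift-connection, and an assumed S3 temp path:

```python
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Small example frame standing in for the processed data (hypothetical schema).
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
dyf = DynamicFrame.fromDF(df, glue_context, "processed")

# After the staging load, replace matching rows in the main table.
post_actions = (
    "BEGIN;"
    "DELETE FROM public.events USING public.events_stage "
    "WHERE public.events.id = public.events_stage.id;"
    "INSERT INTO public.events SELECT * FROM public.events_stage;"
    "DROP TABLE public.events_stage;"
    "END;"
)

glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dyf,
    catalog_connection="redshift-connection",          # assumed Glue connection name
    connection_options={
        "preactions": "DROP TABLE IF EXISTS public.events_stage;",
        "dbtable": "public.events_stage",               # load into the staging table first
        "database": "dev",
        "postactions": post_actions,
    },
    redshift_tmp_dir="s3://my-temp-bucket/glue-temp/",  # assumed temp location
)
```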
QUESTION 2
A company has developed an Apache Hive script to batch process data stored in Amazon S3. The script needs to run
once every day and store the output in Amazon S3. The company tested the script, and it completes within 30 minutes
on a small local three-node cluster.
Which solution is the MOST cost-effective for scheduling and executing the script?
A. Create an AWS Lambda function to spin up an Amazon EMR cluster with a Hive execution step. Set
KeepJobFlowAliveWhenNoSteps to false and disable the termination protection flag. Use Amazon CloudWatch Events
to schedule the Lambda function to run daily.
B. Use the AWS Management Console to spin up an Amazon EMR cluster with Python, Hue, Hive, and Apache Oozie.
Set the termination protection flag to true and use Spot Instances for the core nodes of the cluster. Configure an Oozie
workflow in the cluster to invoke the Hive script daily.
C. Create an AWS Glue job with the Hive script to perform the batch operation. Configure the job to run once a day
using a time-based schedule.
D. Use AWS Lambda layers and load the Hive runtime to AWS Lambda and copy the Hive script. Schedule the Lambda
function to run daily by creating a workflow using AWS Step Functions.
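Option A's transient-cluster approach can be sketched with boto3: a Lambda function launches an EMR cluster with a single Hive step, and KeepJobFlowAliveWhenNoSteps=False plus TerminationProtected=False let the cluster shut itself down when the step finishes. Instance types, roles, and the script location below are hypothetical:

```python
import boto3

emr = boto3.client("emr")

def lambda_handler(event, context):
    # Scheduled daily by a CloudWatch Events rule.
    response = emr.run_job_flow(
        Name="daily-hive-batch",
        ReleaseLabel="emr-6.9.0",
        Applications=[{"Name": "Hive"}],
        Instances={
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate once the step completes
            "TerminationProtected": False,
        },
        Steps=[{
            "Name": "run-hive-script",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["hive-script", "--run-hive-script", "--args",
                         "-f", "s3://my-scripts/daily_batch.hql"],  # assumed script location
            },
        }],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    return response["JobFlowId"]
```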
QUESTION 3
An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities. Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an application running on Amazon EC2 processes the data and makes search options and reports available for
visualization by editors and marketers. The company wants to make website clicks and aggregated data available to
editors and marketers in minutes to enable them to connect with users more effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)
A. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch
Service.
B. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon
Elasticsearch Service from Amazon S3.
C. Use Amazon Elasticsearch Service deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh
content performance dashboards in near-real time.
D. Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content
performance dashboards in near-real time.
E. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams
consumer to send records to Amazon Elasticsearch Service.
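Several options route clickstream records through Amazon Kinesis Data Firehose on their way to Amazon Elasticsearch Service. On the producer side that amounts to little more than PutRecordBatch calls; Firehose handles buffering, batching, and delivery. A minimal sketch with a hypothetical delivery stream named clickstream-to-es:

```python
import json

import boto3

firehose = boto3.client("firehose")

def send_clicks(click_events):
    """Batch clickstream events into the Firehose delivery stream."""
    records = [
        {"Data": (json.dumps(event) + "\n").encode("utf-8")}
        for event in click_events
    ]
    # put_record_batch accepts up to 500 records per call.
    return firehose.put_record_batch(
        DeliveryStreamName="clickstream-to-es",  # assumed delivery stream name
        Records=records,
    )

# Example usage with a hypothetical click event.
send_clicks([{"user_id": "u-123", "page": "/home", "ts": 1700000000}])
```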
QUESTION 4
A company owns facilities with IoT devices installed across the world. The company is using Amazon Kinesis Data
Streams to stream data from the devices to Amazon S3. The company's operations team wants to get insights from
the IoT data to monitor data quality at ingestion. The insights need to be derived in near-real time, and the output must be logged to Amazon DynamoDB for further analysis.
Which solution meets these requirements?
A. Connect Amazon Kinesis Data Analytics to analyze the stream data. Save the output to DynamoDB by using the
default output from Kinesis Data Analytics.
B. Connect Amazon Kinesis Data Analytics to analyze the stream data. Save the output to DynamoDB by using an AWS
Lambda function.
C. Connect Amazon Kinesis Data Firehose to analyze the stream data by using an AWS Lambda function. Save the
output to DynamoDB by using the default output from Kinesis Data Firehose.
D. Connect Amazon Kinesis Data Firehose to analyze the stream data by using an AWS Lambda function. Save the
data to Amazon S3. Then run an AWS Glue job on schedule to ingest the data into DynamoDB.
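When a Lambda function is used as the output destination of a Kinesis Data Analytics application (as in option B), it receives batches of analytics records and must report a per-record result back to the service. A minimal sketch, assuming a hypothetical DynamoDB table named iot-quality-metrics:

```python
import base64
import json
from decimal import Decimal

import boto3

# Hypothetical table name; DynamoDB requires Decimal rather than float for numbers.
table = boto3.resource("dynamodb").Table("iot-quality-metrics")

def lambda_handler(event, context):
    results = []
    for record in event["records"]:
        # Kinesis Data Analytics delivers each output record base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]), parse_float=Decimal)
        try:
            table.put_item(Item=payload)
            results.append({"recordId": record["recordId"], "result": "Ok"})
        except Exception:
            results.append({"recordId": record["recordId"], "result": "DeliveryFailed"})
    return {"records": results}
```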
QUESTION 5
A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst
triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs
from the job run show no error codes. The data analyst wants to improve the job execution time without
overprovisioning.
Which actions should the data analyst take?
A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the executor-cores job parameter.
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the maximum capacity job parameter.
C. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the num-executors job parameter.
Reference: https://docs.aws.amazon.com/glue/latest/dg/monitor-debug-capacity.html
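The "maximum capacity" parameter mentioned in option B is the DPU ceiling for the job. As a rough illustration (the job name and capacity value are hypothetical), job metrics and a higher maximum capacity can be set on an existing job with boto3:

```python
import boto3

glue = boto3.client("glue")

JOB_NAME = "cleanse-200gb-dataset"  # hypothetical job name

# Role and Command are required fields in JobUpdate, so copy them from the job.
job = glue.get_job(JobName=JOB_NAME)["Job"]

glue.update_job(
    JobName=JOB_NAME,
    JobUpdate={
        "Role": job["Role"],
        "Command": job["Command"],
        "DefaultArguments": {
            **job.get("DefaultArguments", {}),
            "--enable-metrics": "",   # publish job and executor metrics to CloudWatch
        },
        "MaxCapacity": 20.0,          # raise the DPU ceiling based on profiled metrics
    },
)
```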
QUESTION 6
A company wants to enrich application logs in near-real-time and use the enriched dataset for further analysis. The
application is running on Amazon EC2 instances across multiple Availability Zones and storing its logs using Amazon
CloudWatch Logs. The enrichment source is stored in an Amazon DynamoDB table.
Which solution meets the requirements for the event collection and enrichment?
A. Use a CloudWatch Logs subscription to send the data to Amazon Kinesis Data Firehose. Use AWS Lambda to
transform the data in the Kinesis Data Firehose delivery stream and enrich it with the data in the DynamoDB table.
Configure Amazon S3 as the Kinesis Data Firehose delivery destination.
B. Export the raw logs to Amazon S3 on an hourly basis using the AWS CLI. Use AWS Glue crawlers to catalog the
logs. Set up an AWS Glue connection for the DynamoDB table and set up an AWS Glue ETL job to enrich the data.
Store the enriched data in Amazon S3.
C. Configure the application to write the logs locally and use Amazon Kinesis Agent to send the data to Amazon Kinesis Data Streams. Configure a Kinesis Data Analytics SQL application with the Kinesis data stream as the source. Join the SQL application input stream with DynamoDB records, and then store the enriched output stream in Amazon S3 using Amazon Kinesis Data Firehose.
D. Export the raw logs to Amazon S3 on an hourly basis using the AWS CLI. Use Apache Spark SQL on Amazon EMR
to read the logs from Amazon S3 and enrich the records with the data from DynamoDB. Store the enriched data in
Amazon S3.
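Option A relies on a Firehose data-transformation Lambda function. The sketch below, assuming a hypothetical DynamoDB table named app-metadata keyed by a hypothetical log_group attribute, shows the general shape of such a function: CloudWatch Logs subscription data arrives base64-encoded and gzip-compressed, each log event is enriched from DynamoDB, and the transformed records are handed back to Firehose for delivery to S3.

```python
import base64
import gzip
import json

import boto3

# Hypothetical enrichment table, keyed by "log_group".
table = boto3.resource("dynamodb").Table("app-metadata")

def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))
        if payload.get("messageType") == "CONTROL_MESSAGE":
            # CloudWatch Logs sends control messages to verify the subscription.
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue
        item = table.get_item(Key={"log_group": payload["logGroup"]}).get("Item", {})
        enriched = [
            {"message": e["message"], "metadata": item}
            for e in payload.get("logEvents", [])
        ]
        data = "\n".join(json.dumps(e, default=str) for e in enriched) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```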
QUESTION 7
A media company has been performing analytics on log data generated by its applications. There has been a recent
increase in the number of concurrent analytics jobs running, and the overall performance of existing jobs is decreasing
as the number of new jobs is increasing. The partitioned data is stored in Amazon S3 One Zone-Infrequent Access (S3
One Zone-IA) and the analytic processing is performed on Amazon EMR clusters using the EMR File System (EMRFS)
with consistent view enabled. A data analyst has determined that it is taking longer for the EMR task nodes to list
objects in Amazon S3.
Which action would MOST likely increase the performance of accessing log data in Amazon S3?
A. Use a hash function to create a random string and add that to the beginning of the object prefixes when storing the
log data in Amazon S3.
B. Use a lifecycle policy to change the S3 storage class to S3 Standard for the log data.
C. Increase the read capacity units (RCUs) for the shared Amazon DynamoDB table.
D. Redeploy the EMR clusters that are running slowly to a different Availability Zone.
QUESTION 8
A company is migrating its existing on-premises ETL jobs to Amazon EMR. The code consists of a series of jobs written
in Java. The company needs to reduce overhead for the system administrators without changing the underlying code.
Due to the sensitivity of the data, compliance requires that the company use root device volume encryption on all nodes in the cluster. Corporate standards require that environments be provisioned though AWS CloudFormation when possible.
Which solution satisfies these requirements?
A. Install open-source Hadoop on Amazon EC2 instances with encrypted root device volumes. Configure the cluster in
the CloudFormation template.
B. Use a CloudFormation template to launch an EMR cluster. In the configuration section of the cluster, define a
bootstrap action to enable TLS.
C. Create a custom AMI with encrypted root device volumes. Configure Amazon EMR to use the custom AMI using the
CustomAmiId property in the CloudFormation template.
D. Use a CloudFormation template to launch an EMR cluster. In the configuration section of the cluster, define a
bootstrap action to encrypt the root device volume of every node.
Reference: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html
QUESTION 9
A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense
storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into
Amazon Redshift. The engineering team will combine all the user data and apply complex calculations that require I/O intensive resources. The company needs to adjust the cluster's capacity to support the change in analytical and
storage requirements.
Which solution meets these requirements?
A. Resize the cluster using elastic resize with dense compute nodes.
B. Resize the cluster using classic resize with dense compute nodes.
C. Resize the cluster using elastic resize with dense storage nodes.
D. Resize the cluster using classic resize with dense storage nodes.
Reference: https://aws.amazon.com/redshift/pricing/
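Whichever node type and resize mode are chosen, the change is applied through the Redshift ResizeCluster API. A minimal boto3 sketch with a hypothetical cluster name and target configuration:

```python
import boto3

redshift = boto3.client("redshift")

redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # hypothetical cluster name
    ClusterType="multi-node",
    NodeType="dc2.8xlarge",                 # hypothetical target node type (dense compute)
    NumberOfNodes=4,
    Classic=True,                           # True = classic resize; omit/False = elastic resize
)
```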
QUESTION 10
An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon
Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8
hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data
using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say
that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data.
Which solution meets these requirements?
A. Create an external schema based on the AWS Glue Data Catalog on the existing Amazon Redshift cluster to query
new data in Amazon S3 with Amazon Redshift Spectrum.
B. Use Amazon CloudWatch Events with the rate(1 hour) expression to execute the AWS Glue crawler every hour.
C. Using the AWS CLI, modify the execution schedule of the AWS Glue crawler from 8 hours to 1 minute.
D. Run the AWS Glue crawler from an AWS Lambda function triggered by an S3:ObjectCreated:* event notification on
the S3 bucket.
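Option D's event-driven approach pairs an S3:ObjectCreated:* notification with a small Lambda function that starts the crawler. A minimal sketch, assuming a hypothetical crawler named raw-json-crawler:

```python
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # Triggered by S3:ObjectCreated:* notifications on the raw-data bucket.
    try:
        glue.start_crawler(Name="raw-json-crawler")  # hypothetical crawler name
    except glue.exceptions.CrawlerRunningException:
        # A crawl kicked off by an earlier object is still running; skip.
        pass
```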
QUESTION 11
A streaming application is reading data from Amazon Kinesis Data Streams and immediately writing the data to an
Amazon S3 bucket every 10 seconds. The application is reading data from hundreds of shards. The batch interval
cannot be changed due to a separate requirement. The data is being accessed by Amazon Athena. Users are seeing
degradation in query performance as time progresses.
Which action can help improve query performance?
A. Merge the files in Amazon S3 to form larger files.
B. Increase the number of shards in Kinesis Data Streams.
C. Add more memory and CPU capacity to the streaming application.
D. Write the files to multiple S3 buckets.
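Compacting the many small objects into larger files, as option A suggests, can be done in several ways; one common approach is an Athena CTAS query that rewrites the data as a smaller number of larger Parquet files. A minimal sketch with hypothetical database, table, and bucket names:

```python
import boto3

athena = boto3.client("athena")

# Rewrite the raw data into larger Parquet files at a new S3 location.
ctas = """
CREATE TABLE clickstream_compacted
WITH (
    external_location = 's3://my-analytics-bucket/clickstream_compacted/',
    format = 'PARQUET'
) AS
SELECT * FROM clickstream_raw
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "analytics_db"},  # hypothetical database
    ResultConfiguration={
        "OutputLocation": "s3://my-analytics-bucket/athena-query-results/"
    },
)
```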
QUESTION 12
A manufacturing company has been collecting IoT sensor data from devices on its factory floor for a year and is storing
the data in Amazon Redshift for daily analysis. A data analyst has determined that, at an expected ingestion rate of
about 2 TB per day, the cluster will be undersized in less than 4 months. A long-term solution is needed. The data
analyst has indicated that most queries only reference the most recent 13 months of data, yet there are also quarterly
reports that need to query all the data generated from the past 7 years. The chief technology officer (CTO) is concerned about the costs, administrative effort, and performance of a long-term solution.
Which solution should the data analyst use to meet these requirements?
A. Create a daily job in AWS Glue to UNLOAD records older than 13 months to Amazon S3 and delete those records
from Amazon Redshift. Create an external table in Amazon Redshift to point to the S3 location. Use Amazon Redshift
Spectrum to join to data that is older than 13 months.
B. Take a snapshot of the Amazon Redshift cluster. Restore the cluster to a new cluster using dense storage nodes with
additional storage capacity.
C. Execute a CREATE TABLE AS SELECT (CTAS) statement to move records that are older than 13 months to
quarterly partitioned data in Amazon Redshift Spectrum backed by Amazon S3.
D. Unload all the tables in Amazon Redshift to an Amazon S3 bucket using S3 Intelligent-Tiering. Use AWS Glue to
crawl the S3 bucket location to create external tables in an AWS Glue Data Catalog. Create an Amazon EMR cluster
using Auto Scaling for any daily analytics needs, and use Amazon Athena for the quarterly reports, with both using the
same AWS Glue Data Catalog.
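Option A's UNLOAD-and-Spectrum approach can be scripted against the cluster. The sketch below uses the Redshift Data API, with hypothetical table, bucket, IAM role, and cluster names, to unload records older than 13 months to S3; an external table then lets Redshift Spectrum query them:

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical table, bucket, IAM role, and cluster names throughout.
unload_sql = """
UNLOAD ('SELECT * FROM sensor_readings
         WHERE reading_ts < DATEADD(month, -13, CURRENT_DATE)')
TO 's3://my-archive-bucket/sensor_readings/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-role'
FORMAT AS PARQUET
"""

redshift_data.execute_statement(
    ClusterIdentifier="iot-analytics-cluster",
    Database="dev",
    DbUser="analytics_user",
    Sql=unload_sql,
)
```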
QUESTION 13
A large university has adopted a strategic goal of increasing diversity among enrolled students. The data analytics team
is creating a dashboard with data visualizations to enable stakeholders to view historical trends. All access must be
authenticated using Microsoft Active Directory. All data in transit and at rest must be encrypted.
Which solution meets these requirements?
A. Amazon QuickSight Standard edition configured to perform identity federation using SAML 2.0 and the default
encryption settings.
B. Amazon QuickSight Enterprise edition configured to perform identity federation using SAML 2.0 and the default
encryption settings.
C. Amazon QuickSight Standard edition using AD Connector to authenticate using Active Directory. Configure Amazon
QuickSight to use customer-provided keys imported into AWS KMS.
D. Amazon QuickSight Enterprise edition using AD Connector to authenticate using Active Directory. Configure Amazon QuickSight to use customer-provided keys imported into AWS KMS.
Reference: https://docs.aws.amazon.com/quicksight/latest/user/WhatsNew.html
Verify the answers
Q1: B | Q2: C | Q3: CE | Q4: C | Q5: B | Q6: C | Q7: D | Q8: C | Q9: C | Q10: A | Q11: C | Q12: B | Q13: D
This article shares the latest updated Amazon das-c01 exam dumps, practice questions, exam PDF, and exam tips to help you gauge your current strength and keep making progress!
The complete leads4pass das-c01 exam questions at https://www.leads4pass.com/das-c01.html are verified by our Amazon AWS Certified Specialty experts as valid exam dumps and can help you pass the exam on your first attempt!
Awsexamdumps shares free Amazon exam questions and answers throughout the year. If you find this helpful, please bookmark and share. Thanks!