[2021.8 Updated] The latest actually updated Amazon DAS-C01 exam dumps | Lead4Pass PDF and VCE
The latest updated Amazon DAS-C01 brain dumps come from Lead4Pass! Amazon DAS-C01 exam questions are updated throughout the year to ensure that they remain valid!
Welcome to download the latest Lead4Pass Amazon DAS-C01 dumps with PDF and VCE: https://www.lead4pass.com/das-c01.html (111 Q&A)
[Lead4Pass DAS-C01 PDF] Amazon DAS-C01 exam PDF uploaded to Google Drive, with online download provided by the latest update of Lead4Pass:
[Lead4pass DAS-C01 practice test] Latest update Amazon DAS-C01 exam questions and answers online practice test
A company wants to provide its data analysts with uninterrupted access to the data in its Amazon Redshift cluster. All
data is streamed to an Amazon S3 bucket with Amazon Kinesis Data Firehose. An AWS Glue job that is scheduled to
run every 5 minutes issues a COPY command to move the data into Amazon Redshift.
The amount of data delivered is uneven throughout the day, and cluster utilization is high during certain periods. The
COPY command usually completes within a couple of seconds. However, when a load spike occurs, locks can occur and
data can be missed. Currently, the AWS Glue job is configured to run without retries, with a timeout of 5 minutes and
concurrency at 1.
How should a data analytics specialist configure the AWS Glue job to optimize fault tolerance and improve data
availability in the Amazon Redshift cluster?
A. Increase the number of retries. Decrease the timeout value. Increase the job concurrency.
B. Keep the number of retries at 0. Decrease the timeout value. Increase the job concurrency.
C. Keep the number of retries at 0. Decrease the timeout value. Keep the job concurrency at 1.
D. Keep the number of retries at 0. Increase the timeout value. Keep the job concurrency at 1.
Correct Answer: B
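Answer B's settings can be expressed programmatically. Below is a minimal, hedged sketch using boto3's glue.update_job; the job name, IAM role ARN, and script location are hypothetical placeholders, and since UpdateJob resets unspecified fields to defaults, a real call should carry the full job definition.

```python
# Hedged sketch (not the exam's official setup): the Glue job settings that
# answer B describes, expressed as a boto3 update_job call. The job name,
# role ARN, and script location are hypothetical placeholders.

job_update = {
    "Role": "arn:aws:iam::123456789012:role/GlueCopyJobRole",         # hypothetical
    "Command": {"Name": "glueetl",
                "ScriptLocation": "s3://example-bucket/copy_job.py"},  # hypothetical
    "MaxRetries": 0,                                # keep retries at 0
    "Timeout": 3,                                   # minutes; decreased from 5
    "ExecutionProperty": {"MaxConcurrentRuns": 3},  # allow overlapping runs
}

def apply_job_update(job_name, update, client=None):
    """Apply the settings with glue.update_job; the client is injectable for tests."""
    if client is None:
        import boto3  # imported lazily so the sketch loads without AWS installed
        client = boto3.client("glue")
    return client.update_job(JobName=job_name, JobUpdate=update)

# apply_job_update("redshift-copy-job", job_update)  # requires AWS credentials
```

With concurrency above 1, a run delayed by a load spike no longer blocks the next scheduled run, and a shorter timeout frees locks sooner.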
A large company has a central data lake to run analytics across different departments. Each department uses a
separate AWS account and stores its data in an Amazon S3 bucket in that account. Each AWS account uses the AWS
Glue Data Catalog as its data catalog. There are different data lake access requirements based on roles. Associate
analysts should only have read access to their departmental data. Senior data analysts can have access to data in multiple
departments, including their own, but only for a subset of columns.
Which solution achieves these required access patterns to minimize costs and administrative tasks?
A. Consolidate all AWS accounts into one account. Create different S3 buckets for each department and move all the
data from every account to the central data lake account. Migrate the individual data catalogs into a central data catalog
and apply fine-grained permissions to give each user the required access to tables and databases in AWS Glue and Amazon S3.
B. Keep the account structure and the individual AWS Glue catalogs on each account. Add a central data lake account
and use AWS Glue to catalog data from various accounts. Configure cross-account access for AWS Glue crawlers to
scan the data in each departmental S3 bucket to identify the schema and populate the catalog. Add the senior data
analysts into the central account and apply highly detailed access controls in the Data Catalog and Amazon S3.
C. Set up an individual AWS account for the central data lake. Use AWS Lake Formation to catalog the cross-account
locations. On each individual S3 bucket, modify the bucket policy to grant S3 permissions to the Lake Formation service-linked role. Use Lake Formation permissions to add fine-grained access controls to allow senior analysts to view specific
tables and columns.
D. Set up an individual AWS account for the central data lake and configure a central S3 bucket. Use an AWS Lake
Formation blueprint to move the data from the various buckets into the central S3 bucket. On each individual bucket,
modify the bucket policy to grant S3 permissions to the Lake Formation service-linked role. Use Lake Formation
permissions to add fine-grained access controls for both associate and senior analysts to view specific tables and columns.
Correct Answer: C
A company wants to collect and process events data from different departments in near-real time. Before storing the
data in Amazon S3, the company needs to clean the data by standardizing the format of the address and timestamp
columns. The data varies in size based on the overall load at each particular point in time. A single data record can be
100 KB-10 MB.
How should a data analytics specialist design the solution for data ingestion?
A. Use Amazon Kinesis Data Streams. Configure a stream for the raw data. Use a Kinesis Agent to write data to the
stream. Create an Amazon Kinesis Data Analytics application that reads data from the raw stream, cleanses it, and
stores the output to Amazon S3.
B. Use Amazon Kinesis Data Firehose. Configure a Firehose delivery stream with a preprocessing AWS Lambda
function for data cleansing. Use a Kinesis Agent to write data to the delivery stream. Configure Kinesis Data Firehose to
deliver the data to Amazon S3.
C. Use Amazon Managed Streaming for Apache Kafka. Configure a topic for the raw data. Use a Kafka producer to
write data to the topic. Create an application on Amazon EC2 that reads data from the topic by using the Apache Kafka
consumer API, cleanses the data, and writes to Amazon S3.
D. Use Amazon Simple Queue Service (Amazon SQS). Configure an AWS Lambda function to read events from the
SQS queue and upload the events to Amazon S3.
Correct Answer: B
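Answer B's preprocessing Lambda receives batches of base64-encoded Firehose records and must return each one with a result status. A minimal sketch, assuming hypothetical "timestamp" and "address" fields and hypothetical standardization rules:

```python
# Sketch: a Kinesis Data Firehose transformation Lambda (answer B). Firehose
# invokes it with base64-encoded records; each must come back with a result
# status. The "timestamp" and "address" fields are hypothetical examples of
# the columns being standardized.
import base64
import json

def _cleanse(payload):
    # Hypothetical standardization rules: dashes in timestamps; upper-cased,
    # whitespace-normalized addresses.
    if "timestamp" in payload:
        payload["timestamp"] = payload["timestamp"].replace("/", "-")
    if "address" in payload:
        payload["address"] = " ".join(payload["address"].upper().split())
    return payload

def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        cleansed = _cleanse(payload)
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",  # "Dropped" and "ProcessingFailed" are the alternatives
            "data": base64.b64encode(json.dumps(cleansed).encode()).decode(),
        })
    return {"records": output}
```

Firehose buffers the cleansed output and delivers it to S3, so no consumer application has to be managed, which is what makes option B simpler than A or C.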
A company owns facilities with IoT devices installed across the world. The company is using Amazon Kinesis Data
Streams to stream data from the devices to Amazon S3. The company's operations team wants to get insights from
the IoT data to monitor data quality at ingestion. The insights need to be derived in near-real time, and the output must
be logged to Amazon DynamoDB for further analysis.
Which solution meets these requirements?
A. Connect Amazon Kinesis Data Analytics to analyze the stream data. Save the output to DynamoDB by using the
default output from Kinesis Data Analytics.
B. Connect Amazon Kinesis Data Analytics to analyze the stream data. Save the output to DynamoDB by using an AWS Lambda function.
C. Connect Amazon Kinesis Data Firehose to analyze the stream data by using an AWS Lambda function. Save the
output to DynamoDB by using the default output from Kinesis Data Firehose.
D. Connect Amazon Kinesis Data Firehose to analyze the stream data by using an AWS Lambda function. Save the
data to Amazon S3. Then run an AWS Glue job on schedule to ingest the data into DynamoDB.
Correct Answer: B
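For option B, Kinesis Data Analytics can invoke an AWS Lambda function as its output destination. A hedged sketch of such a function logging each analytics result to DynamoDB (the table name and item fields are hypothetical):

```python
# Sketch: an AWS Lambda function configured as the Kinesis Data Analytics
# output (option B). It decodes each analytics result and writes it to a
# hypothetical DynamoDB table. The extra `table` parameter exists only to
# make the sketch testable without AWS.
import base64
import json

def lambda_handler(event, context, table=None):
    if table is None:
        import boto3  # imported lazily so the sketch loads without AWS installed
        table = boto3.resource("dynamodb").Table("iot-quality-metrics")  # hypothetical
    results = []
    for record in event["records"]:
        item = json.loads(base64.b64decode(record["data"]))
        table.put_item(Item=item)  # log the insight for further analysis
        results.append({"recordId": record["recordId"], "result": "Ok"})
    return {"records": results}
```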
A bank operates in a regulated environment. The compliance requirements for the country in which the bank operates
say that customer data for each state should only be accessible by the bank's employees located in the same state.
Bank employees in one state should NOT be able to access data for customers who have provided a home address in a different state.
The bank's marketing team has hired a data analyst to gather insights from customer data for a new campaign being
launched in certain states. Currently, data linking each customer account to its home state is stored in a tabular .csv file
within a single Amazon S3 folder in a private S3 bucket. The total size of the S3 folder is 2 GB uncompressed. Due to
the country's compliance requirements, the marketing team is not able to access this folder.
The data analyst is responsible for ensuring that the marketing team gets one-time access to customer data for their
campaign analytics project, while being subject to all the compliance requirements and controls.
Which solution should the data analyst implement to meet the desired requirements with the LEAST amount of setup?
A. Re-arrange data in Amazon S3 to store customer data about each state in a different S3 folder within the same
bucket. Set up S3 bucket policies to provide marketing employees with appropriate data access under compliance
controls. Delete the bucket policies after the project.
B. Load tabular data from Amazon S3 to an Amazon EMR cluster using s3DistCp. Implement a custom Hadoop-based
row-level security solution on the Hadoop Distributed File System (HDFS) to provide marketing employees with
appropriate data access under compliance controls. Terminate the EMR cluster after the project.
C. Load tabular data from Amazon S3 to Amazon Redshift with the COPY command. Use the built-in row-level security
feature in Amazon Redshift to provide marketing employees with appropriate data access under compliance controls.
Delete the Amazon Redshift tables after the project.
D. Load tabular data from Amazon S3 to Amazon QuickSight Enterprise edition by directly importing it as a data source.
Use the built-in row-level security feature in Amazon QuickSight to provide marketing employees with appropriate data
access under compliance controls. Delete Amazon QuickSight data sources after the project is complete.
Correct Answer: D
A company has developed several AWS Glue jobs to validate and transform its data from Amazon S3 and load it into
Amazon RDS for MySQL in batches once every day. The ETL jobs read the S3 data using a DynamicFrame. Currently,
the ETL developers are experiencing challenges in processing only the incremental data on every run, as the AWS Glue
job processes all the S3 input data on each run.
Which approach would allow the developers to solve the issue with minimal coding effort?
A. Have the ETL jobs read the data from Amazon S3 using a DataFrame.
B. Enable job bookmarks on the AWS Glue jobs.
C. Create custom logic on the ETL jobs to track the processed S3 objects.
D. Have the ETL jobs delete the processed objects or data from Amazon S3 after each run.
Correct Answer: B
A data analytics specialist is building an automated ETL ingestion pipeline using AWS Glue to ingest compressed files
that have been uploaded to an Amazon S3 bucket. The ingestion pipeline should support incremental data processing.
Which AWS Glue feature should the data analytics specialist use to meet this requirement?
C. Job bookmarks
Correct Answer: C
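Both of the questions above hinge on AWS Glue job bookmarks. A hedged sketch of how they are enabled, assuming a hypothetical job name; note that each source in the Glue script also needs a transformation_ctx so the bookmark can track per-source state:

```python
# Sketch: enabling AWS Glue job bookmarks so each run processes only new
# S3 input. The job name below is a hypothetical placeholder.

def bookmark_arguments():
    # Passed as DefaultArguments on create_job/update_job, or per run as
    # Arguments on start_job_run.
    return {"--job-bookmark-option": "job-bookmark-enable"}

def start_bookmarked_run(job_name, client=None):
    if client is None:
        import boto3  # imported lazily so the sketch loads without AWS installed
        client = boto3.client("glue")
    return client.start_job_run(JobName=job_name, Arguments=bookmark_arguments())

# Inside the Glue script, each source needs a transformation_ctx so the
# bookmark can track state, e.g.:
#   glueContext.create_dynamic_frame.from_options(
#       connection_type="s3",
#       connection_options={"paths": ["s3://example-bucket/input/"]},
#       format="json",
#       transformation_ctx="input_frame")
```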
An online retail company is migrating its reporting system to AWS. The company's legacy system runs data processing
on online transactions using a complex series of nested Apache Hive queries. Transactional data is exported from the
online system to the reporting system several times a day. Schemas in the files are stable between updates.
A data analyst wants to quickly migrate the data processing to AWS, so any code changes should be minimized. To
keep storage costs low, the data analyst decides to store the data in Amazon S3. It is vital that the data from the reports
and associated analytics is completely up to date based on the data in Amazon S3.
Which solution meets these requirements?
A. Create an AWS Glue Data Catalog to manage the Hive metadata. Create an AWS Glue crawler over Amazon S3 that
runs when data is refreshed to ensure that data changes are updated. Create an Amazon EMR cluster and use the
metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.
B. Create an AWS Glue Data Catalog to manage the Hive metadata. Create an Amazon EMR cluster with consistent
view enabled. Run emrfs sync before each analytics step to ensure data changes are updated. Create an EMR cluster
and use the metadata in the AWS Glue Data Catalog to run Hive processing queries in Amazon EMR.
C. Create an Amazon Athena table with CREATE TABLE AS SELECT (CTAS) to ensure data is refreshed from
underlying queries against the raw dataset. Create an AWS Glue Data Catalog to manage the Hive metadata over the
CTAS table. Create an Amazon EMR cluster and use the metadata in the AWS Glue Data Catalog to run Hive
processing queries in Amazon EMR.
D. Use an S3 Select query to ensure that the data is properly updated. Create an AWS Glue Data Catalog to manage
the Hive metadata over the S3 Select table. Create an Amazon EMR cluster and use the metadata in the AWS Glue
Data Catalog to run Hive processing queries in Amazon EMR.
Correct Answer: A
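Answer A's EMR cluster reads Hive metadata from the AWS Glue Data Catalog via an EMR configuration classification. A minimal sketch (the release label and application list are illustrative assumptions):

```python
# Sketch: EMR configuration (answer A) that makes Hive use the AWS Glue Data
# Catalog as its metastore, so EMR Hive queries see the tables the Glue
# crawler maintains. Release label and cluster details are illustrative.

GLUE_CATALOG_HIVE_CONF = [{
    "Classification": "hive-site",
    "Properties": {
        "hive.metastore.client.factory.class":
            "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    },
}]

def run_job_flow_kwargs(cluster_name):
    # Hypothetical minimal arguments; pass these to
    # boto3.client("emr").run_job_flow(**run_job_flow_kwargs("reporting")).
    return {
        "Name": cluster_name,
        "ReleaseLabel": "emr-6.4.0",
        "Applications": [{"Name": "Hive"}],
        "Configurations": GLUE_CATALOG_HIVE_CONF,
    }
```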
A company has an encrypted Amazon Redshift cluster. The company recently enabled Amazon Redshift audit logs and
needs to ensure that the audit logs are also encrypted at rest. The logs are retained for 1 year. The auditor queries the
logs once a month.
What is the MOST cost-effective way to meet these requirements?
A. Encrypt the Amazon S3 bucket where the logs are stored by using AWS Key Management Service (AWS KMS).
Copy the data into the Amazon Redshift cluster from Amazon S3 on a daily basis. Query the data as required.
B. Disable encryption on the Amazon Redshift cluster, configure audit logging, and encrypt the Amazon Redshift cluster.
Use Amazon Redshift Spectrum to query the data as required.
C. Enable default encryption on the Amazon S3 bucket where the logs are stored by using AES-256 encryption. Copy
the data into the Amazon Redshift cluster from Amazon S3 on a daily basis. Query the data as required.
D. Enable default encryption on the Amazon S3 bucket where the logs are stored by using AES-256 encryption. Use
Amazon Redshift Spectrum to query the data as required.
Correct Answer: D
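Options C and D both rely on S3 default encryption with SSE-S3 (AES-256). A hedged sketch of that bucket configuration with boto3 (the bucket name is hypothetical):

```python
# Sketch: default SSE-S3 (AES-256) encryption on the audit-log bucket, as
# described in options C and D. The bucket name is a hypothetical placeholder.

def sse_s3_encryption_config():
    return {
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}
        }]
    }

def enable_default_encryption(bucket, client=None):
    if client is None:
        import boto3  # imported lazily so the sketch loads without AWS installed
        client = boto3.client("s3")
    return client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=sse_s3_encryption_config(),
    )

# enable_default_encryption("example-redshift-audit-logs")  # requires AWS credentials
```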
A media company wants to perform machine learning and analytics on the data residing in its Amazon S3 data lake.
There are two data transformation requirements that will enable the consumers within the company to create reports:
Daily transformations of 300 GB of data with different file formats landing in Amazon S3 at a scheduled time.
One-time transformations of terabytes of archived data residing in the S3 data lake.
Which combination of solutions cost-effectively meets the company's requirements for transforming the data? (Choose three.)
A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
B. For daily incoming data, use Amazon Athena to scan and identify the schema.
C. For daily incoming data, use Amazon Redshift to perform transformations.
D. For daily incoming data, use AWS Glue workflows with AWS Glue jobs to perform transformations.
E. For archived data, use Amazon EMR to perform data transformations.
F. For archived data, use Amazon SageMaker to perform data transformations.
Correct Answer: ADE
A retail company has 15 stores across 6 cities in the United States. Once a month, the sales team requests a
visualization in Amazon QuickSight that provides the ability to easily identify revenue trends across cities and stores.
The visualization also helps identify outliers that need to be examined with further analysis.
Which visual type in QuickSight meets the sales team's requirements?
A. Geospatial chart
B. Line chart
C. Heat map
D. Tree map
Correct Answer: A
An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities.
Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an
application running on Amazon EC2 processes the data and makes search options and reports available for
visualization by editors and marketers. The company wants to make website clicks and aggregated data available to
editors and marketers in minutes to enable them to connect with users more effectively.
Which options will help meet these requirements in the MOST efficient way? (Choose two.)
A. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon Elasticsearch Service.
B. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon
Elasticsearch Service from Amazon S3.
C. Use Amazon Elasticsearch Service deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh
content performance dashboards in near-real time.
D. Use Kibana to aggregate, filter, and visualize the data stored in Amazon Elasticsearch Service. Refresh content
performance dashboards in near-real time.
E. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams
consumer to send records to Amazon Elasticsearch Service.
Correct Answer: AD
A company wants to research user turnover by analyzing the past 3 months of user activities. With millions of users, 1.5
TB of uncompressed data is generated each day. A 30-node Amazon Redshift cluster with
2.56 TB of solid state drive (SSD) storage for each node is required to meet the query performance goals.
The company wants to run an additional analysis on a year's worth of historical data to examine trends indicating
which features are most popular. This analysis will be done once a week.
What is the MOST cost-effective solution?
A. Increase the size of the Amazon Redshift cluster to 120 nodes so it has enough storage capacity to hold 1 year of
data. Then use Amazon Redshift for the additional analysis.
B. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in
Apache Parquet format partitioned by date. Then use Amazon Redshift Spectrum for the additional analysis.
C. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in
Apache Parquet format partitioned by date. Then provision a persistent Amazon EMR cluster and use Apache Presto for
the additional analysis.
D. Resize the cluster node type to the dense storage node type (DS2) for an additional 16 TB storage capacity on each
individual node in the Amazon Redshift cluster. Then use Amazon Redshift for the additional analysis.
Correct Answer: B
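Answer B's archival step can be done with Redshift's UNLOAD command, writing Parquet partitioned by date. A hedged sketch that builds the SQL (the table, S3 prefix, IAM role, and event_date column are hypothetical):

```python
# Sketch: building the Amazon Redshift UNLOAD statement that archives data
# older than 90 days to S3 as Parquet partitioned by date (answer B). The
# table, bucket, IAM role, and event_date column are hypothetical.

def build_unload_sql(table, s3_prefix, iam_role):
    inner = (f"SELECT * FROM {table} "
             "WHERE event_date < dateadd(day, -90, current_date)")
    return (
        f"UNLOAD ('{inner}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS PARQUET "
        "PARTITION BY (event_date)"
    )

sql = build_unload_sql(
    "user_activity",
    "s3://example-archive/activity/",
    "arn:aws:iam::123456789012:role/RedshiftUnloadRole",
)
```

Redshift Spectrum can then query the partitioned Parquet via an external table, keeping only 90 days of hot data in the cluster.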
Get the latest DAS-C01 exam PDF and DAS-C01 test questions and answers, plus the complete DAS-C01 exam dump, from Lead4Pass.
Please visit: https://www.lead4pass.com/das-c01.html (PDF + VCE) 100% guaranteed! Pass the exam easily!
P.S. Get the free Amazon DAS-C01 dumps PDF online: https://drive.google.com/file/d/1kwmHfAiDmzP_MkwwGSyvCZM-YX_H6yU0/