leads4pass shares the latest valid DAS-C01 dumps that meet the requirements for passing the AWS Certified Data Analytics – Specialty (DAS-C01) certification exam!
leads4pass DAS-C01 dumps come in two learning formats, PDF and VCE, to help candidates experience realistic simulated exam scenarios. Get the latest leads4pass DAS-C01 dumps with PDF and VCE:
https://www.leads4pass.com/das-c01.html (285 Q&A)
| From | Exam name | Free Share | Last updated |
| --- | --- | --- | --- |
| leads4pass | AWS Certified Data Analytics – Specialty (DAS-C01) | Q14-Q28 | DAS-C01 dumps (Q1-Q13) |
New Q14:
A company is designing a support ticketing system for its employees. The company has a flattened LDAP dataset that contains employee data. The data includes ticket categories that the employees can access, relevant ticket metadata stored in Amazon S3, and the business unit of each employee.
The company uses Amazon QuickSight to visualize the data. The company needs an automated solution to apply row-level data restriction within the QuickSight group for each business unit. The solution must grant access to an employee when an employee is added to a business unit and must deny access to an employee when an employee is removed from a business unit.
Which solution will meet these requirements?
A. Load the dataset into SPICE from Amazon S3. Create a SPICE query that contains the dataset rules for row-level security. Upload separate .csv files to Amazon S3 for adding and removing users from a group. Apply the permissions dataset on the existing QuickSight users. Create an AWS Lambda function that will run periodically to refresh the direct query cache based on the changes to the .csv file.
B. Load the dataset into SPICE from Amazon S3. Create an AWS Lambda function that will run each time the direct query cache is refreshed. Configure the Lambda function to apply a permissions file to the dataset that is loaded into SPICE. Configure the addition and removal of groups and users by creating a QuickSight IAM policy.
C. Load the dataset into SPICE from Amazon S3. Apply a permissions file to the dataset to dictate which group has access to the dataset. Upload separate .csv files to Amazon S3 for adding and removing groups and users under the path that QuickSight is reading from. Create an AWS Lambda function that will run when a particular object is uploaded to Amazon S3. Configure the Lambda function to make API calls to QuickSight to add or remove users or a group.
D. Move the data from Amazon S3 into Amazon Redshift. Load the dataset into SPICE from Amazon Redshift. Create an AWS Lambda function that will run each time the direct query cache is refreshed. Configure the Lambda function to apply a permissions file to the dataset that is loaded into SPICE.
Correct Answer: C
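For reference, the S3-triggered Lambda function that option C describes would call the QuickSight group-membership APIs. A minimal sketch, assuming a hypothetical .csv layout of action,group_name,user_name per row and a placeholder account ID:

```python
import csv
import boto3

quicksight = boto3.client("quicksight")
s3 = boto3.client("s3")

ACCOUNT_ID = "111122223333"  # placeholder AWS account ID
NAMESPACE = "default"

def handler(event, context):
    """Triggered by S3 PUT events on the membership .csv files."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        for action, group, user in csv.reader(body.splitlines()):
            if action == "add":
                quicksight.create_group_membership(
                    MemberName=user, GroupName=group,
                    AwsAccountId=ACCOUNT_ID, Namespace=NAMESPACE)
            elif action == "remove":
                quicksight.delete_group_membership(
                    MemberName=user, GroupName=group,
                    AwsAccountId=ACCOUNT_ID, Namespace=NAMESPACE)
```

Adding or removing a user from a QuickSight group in this way is what makes the attached row-level security rules take effect or stop applying for that employee.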
New Q15:
An e-commerce company extracts a large volume of data from Amazon S3, relational databases, non-relational databases, and custom data stores. A data analytics team wants to analyze the data to discover patterns by using SQL-like queries with a high degree of complexity and store the results in Amazon S3 for further analysis.
How can the data analytics team meet these requirements with the LEAST operational overhead?
A. Query the datasets with Presto running on Amazon EMR.
B. Query the datasets by using Apache Spark SQL running on Amazon EMR.
C. Use AWS Glue jobs to ETL data from various data sources to Amazon S3 and query the data with Amazon Athena.
D. Use federated query functionality in Amazon Athena.
Correct Answer: B
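To illustrate the Spark SQL approach in option B, a PySpark job on EMR can register each source as a temporary view, run complex SQL across them, and write the results back to Amazon S3. A sketch only; the JDBC endpoint, table, and bucket names are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pattern-discovery").getOrCreate()

# Load one relational source over JDBC and one S3 dataset (placeholders).
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://example-host:5432/sales")
          .option("dbtable", "public.orders")
          .option("user", "analyst").option("password", "***")
          .load())
clicks = spark.read.parquet("s3://example-raw-bucket/clickstream/")

orders.createOrReplaceTempView("orders")
clicks.createOrReplaceTempView("clicks")

# Complex SQL-like analysis joining the sources.
result = spark.sql("""
    SELECT o.customer_id, COUNT(*) AS sessions, SUM(o.amount) AS revenue
    FROM clicks c JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY o.customer_id
""")

result.write.mode("overwrite").parquet("s3://example-results-bucket/patterns/")
```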
New Q16:
A data analyst is designing an Amazon QuickSight dashboard using centralized sales data that resides in Amazon Redshift. The dashboard must be restricted so that a salesperson in Sydney, Australia, can see only the Australian view and that a salesperson in New York can see only United States (US) data.
What should the data analyst do to ensure the appropriate data security is in place?
A. Place the data sources for Australia and the US into separate SPICE capacity pools.
B. Set up an Amazon Redshift VPC security group for Australia and the US.
C. Deploy QuickSight Enterprise edition to implement row-level security (RLS) to the sales table.
D. Deploy QuickSight Enterprise edition and set up different VPC security groups for Australia and the US.
Correct Answer: D
Reference: https://docs.aws.amazon.com/quicksight/latest/user/working-with-aws-vpc.html
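For context on the row-level security (RLS) mentioned in option C, QuickSight Enterprise edition lets you attach a permissions dataset to the sales dataset. A minimal boto3 sketch; the account ID, dataset IDs, ARNs, and column names are placeholders:

```python
import boto3

quicksight = boto3.client("quicksight")

quicksight.update_data_set(
    AwsAccountId="111122223333",
    DataSetId="sales-dataset",
    Name="sales",
    ImportMode="DIRECT_QUERY",
    PhysicalTableMap={
        "sales-table": {
            "RelationalTable": {
                "DataSourceArn": "arn:aws:quicksight:us-east-1:111122223333:datasource/redshift-src",
                "Schema": "public",
                "Name": "sales",
                "InputColumns": [
                    {"Name": "region", "Type": "STRING"},
                    {"Name": "amount", "Type": "DECIMAL"},
                ],
            }
        }
    },
    RowLevelPermissionDataSet={
        "Namespace": "default",
        # The permissions dataset maps users/groups to the region values they may see.
        "Arn": "arn:aws:quicksight:us-east-1:111122223333:dataset/rls-rules",
        "PermissionPolicy": "GRANT_ACCESS",
    },
)
```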
New Q17:
A hospital uses wearable medical sensor devices to collect data from patients. The hospital is architecting a near-real-time solution that can ingest the data securely at scale. The solution should also be able to remove the patient's protected health information (PHI) from the streaming data and store the data in durable storage.
Which solution meets these requirements with the least operational overhead?
A. Ingest the data using Amazon Kinesis Data Streams, which invokes an AWS Lambda function using Kinesis Client Library (KCL) to remove all PHI. Write the data in Amazon S3.
B. Ingest the data using Amazon Kinesis Data Firehose to write the data to Amazon S3. Have Amazon S3 trigger an AWS Lambda function that parses the sensor data to remove all PHI in Amazon S3.
C. Ingest the data using Amazon Kinesis Data Streams to write the data to Amazon S3. Have the data stream launch an AWS Lambda function that parses the sensor data and removes all PHI in Amazon S3.
D. Ingest the data using Amazon Kinesis Data Firehose to write the data to Amazon S3. Implement a transformation AWS Lambda function that parses the sensor data to remove all PHI.
Correct Answer: C
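To illustrate the PHI-scrubbing step that these options describe, here is a hedged sketch of a Lambda function that consumes the Kinesis stream, drops PHI fields, and writes de-identified records to Amazon S3. The field names and bucket are assumptions:

```python
import base64
import json
import uuid
import boto3

s3 = boto3.client("s3")
PHI_FIELDS = {"patient_name", "ssn", "date_of_birth"}  # assumed PHI attributes
BUCKET = "example-deidentified-bucket"                  # placeholder bucket

def handler(event, context):
    """Invoked with batches of Kinesis Data Streams records."""
    cleaned = []
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        for field in PHI_FIELDS:
            payload.pop(field, None)  # remove PHI before persisting
        cleaned.append(payload)
    key = f"sensor-data/{uuid.uuid4()}.json"
    s3.put_object(Bucket=BUCKET, Key=key,
                  Body="\n".join(json.dumps(r) for r in cleaned).encode("utf-8"))
```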
New Q18:
A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning.
Which actions should the data analyst take?
A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
C. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.
Correct Answer: B
Reference: https://docs.aws.amazon.com/glue/latest/dg/monitor-debug-capacity.html
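For reference, job metrics and the maximum capacity setting mentioned in option B can be supplied when a run is started. A hedged boto3 sketch; the job name and DPU count are placeholders:

```python
import boto3

glue = boto3.client("glue")

# Start a run with CloudWatch job metrics enabled so DPU usage can be profiled,
# then raise MaxCapacity (DPUs) based on what the metrics show.
glue.start_job_run(
    JobName="format-200gb-dataset",          # placeholder job name
    Arguments={"--enable-metrics": "true"},  # enables AWS Glue job metrics
    MaxCapacity=20.0,                        # DPUs for Standard worker type jobs
)
```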
New Q19:
A marketing company wants to improve its reporting and business intelligence capabilities. During the planning phase, the company interviewed the relevant stakeholders and discovered that:
1. The operations team reports are run hourly for the current month's data.
2. The sales team wants to use multiple Amazon QuickSight dashboards to show a rolling view of the last 30 days based on several categories. The sales team also wants to view the data as soon as it reaches the reporting backend.
3. The finance team's reports are run daily for last month's data and once a month for the last 24 months of data.
Currently, there is 400 TB of data in the system with an expected additional 100 TB added every month. The company is looking for a solution that is as cost-effective as possible.
Which solution meets the company's requirements?
A. Store the last 24 months of data in Amazon Redshift. Configure Amazon QuickSight with Amazon Redshift as the data source.
B. Store the last 2 months of data in Amazon Redshift and the rest of the months in Amazon S3. Set up an external schema and table for Amazon Redshift Spectrum. Configure Amazon QuickSight with Amazon Redshift as the data source.
C. Store the last 24 months of data in Amazon S3 and query it using Amazon Redshift Spectrum. Configure Amazon QuickSight with Amazon Redshift Spectrum as the data source.
D. Store the last 2 months of data in Amazon Redshift and the rest of the months in Amazon S3. Use a long-running Amazon EMR with Apache Spark cluster to query the data as needed. Configure Amazon QuickSight with Amazon EMR as the data source.
Correct Answer: B
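The Redshift Spectrum layer in option B is set up by creating an external schema over the data cataloged in S3. A sketch using the Redshift Data API; the cluster, database, role, and catalog names are placeholders:

```python
import boto3

redshift_data = boto3.client("redshift-data")

redshift_data.execute_statement(
    ClusterIdentifier="reporting-cluster",  # placeholder cluster
    Database="analytics",
    DbUser="admin",
    Sql="""
        CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum_sales
        FROM DATA CATALOG DATABASE 'historical_sales'
        IAM_ROLE 'arn:aws:iam::111122223333:role/SpectrumRole'
        CREATE EXTERNAL DATABASE IF NOT EXISTS;
    """,
)
```

With this in place, QuickSight queries against Redshift can join the hot 2 months of local data with the older months that stay in S3.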
New Q20:
A company's data analyst needs to ensure that queries run in Amazon Athena cannot scan more than a prescribed amount of data for cost control purposes. Queries that exceed the prescribed threshold must be canceled immediately.
What should the data analyst do to achieve this?
A. Configure Athena to invoke an AWS Lambda function that terminates queries when the prescribed threshold is crossed.
B. For each workgroup, set the control limit for each query to the prescribed threshold.
C. Enforce the prescribed threshold on all Amazon S3 bucket policies.
D. For each workgroup, set the workgroup-wide data usage control limit to the prescribed threshold.
Correct Answer: B
Reference: https://docs.aws.amazon.com/athena/latest/ug/workgroups-setting-control-limits-cloudwatch.html
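The per-query control limit in option B maps to the workgroup's BytesScannedCutoffPerQuery setting. A minimal sketch; the workgroup name and the 1 TiB threshold are placeholders:

```python
import boto3

athena = boto3.client("athena")

# Cancel any query in this workgroup that scans more than ~1 TiB of data.
athena.update_work_group(
    WorkGroup="analytics-team",  # placeholder workgroup
    ConfigurationUpdates={
        "BytesScannedCutoffPerQuery": 1_099_511_627_776,  # 1 TiB in bytes
    },
)
```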
New Q21:
A company wants to provide its data analysts with uninterrupted access to the data in its Amazon Redshift cluster. All data is streamed to an Amazon S3 bucket with Amazon Kinesis Data Firehose. An AWS Glue job that is scheduled to run every 5 minutes issues a COPY command to move the data into Amazon Redshift.
The amount of data delivered is uneven throughout the day, and cluster utilization is high during certain periods. The COPY command usually completes within a couple of seconds. However, when a load spike occurs, locks can exist and data can be missed. Currently, the AWS Glue job is configured to run without retries, with a timeout at 5 minutes and concurrency at 1.
How should a data analytics specialist configure the AWS Glue job to optimize fault tolerance and improve data availability in the Amazon Redshift cluster?
A. Increase the number of retries. Decrease the timeout value. Increase the job concurrency.
B. Keep the number of retries at 0. Decrease the timeout value. Increase the job concurrency.
C. Keep the number of retries at 0. Decrease the timeout value. Keep the job concurrency at 1.
D. Keep the number of retries at 0. Increase the timeout value. Keep the job concurrency at 1.
Correct Answer: B
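The retry, timeout, and concurrency settings that the options adjust all live on the Glue job definition. A hedged boto3 sketch with values matching answer B; the job name, role, and script location are placeholders, and note that UpdateJob overwrites unspecified fields with defaults:

```python
import boto3

glue = boto3.client("glue")

glue.update_job(
    JobName="redshift-copy-job",  # placeholder job name
    JobUpdate={
        # UpdateJob replaces the definition, so core fields must be restated.
        "Role": "arn:aws:iam::111122223333:role/GlueCopyRole",
        "Command": {"Name": "glueetl",
                    "ScriptLocation": "s3://example-scripts/copy_to_redshift.py"},
        "MaxRetries": 0,                               # keep retries at 0
        "Timeout": 3,                                  # minutes; fail fast on lock waits
        "ExecutionProperty": {"MaxConcurrentRuns": 5}, # allow overlapping 5-minute runs
    },
)
```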
New Q22:
A large company has a central data lake to run analytics across different departments. Each department uses a separate AWS account and stores its data in an Amazon S3 bucket in that account. Each AWS account uses the AWS Glue Data Catalog as its data catalog. There are different data lake access requirements based on roles. Associate analysts should only have read access to their departmental data. Senior data analysts can have access to multiple departments including theirs, but for a subset of columns only.
Which solution achieves these required access patterns to minimize costs and administrative tasks?
A. Consolidate all AWS accounts into one account. Create different S3 buckets for each department and move all the data from every account to the central data lake account. Migrate the individual data catalogs into a central data catalog and apply fine-grained permissions to give each user the required access to tables and databases in AWS Glue and Amazon S3.
B. Keep the account structure and the individual AWS Glue catalogs on each account. Add a central data lake account and use AWS Glue to catalog data from various accounts. Configure cross-account access for AWS Glue crawlers to scan the data in each departmental S3 bucket to identify the schema and populate the catalog. Add the senior data analysts to the central account and apply highly detailed access controls in the Data Catalog and Amazon S3.
C. Set up an individual AWS account for the central data lake. Use AWS Lake Formation to catalog the cross-account locations. On each individual S3 bucket, modify the bucket policy to grant S3 permissions to the Lake Formation service-linked role. Use Lake Formation permissions to add fine-grained access controls to allow senior analysts to view specific tables and columns.
D. Set up an individual AWS account for the central data lake and configure a central S3 bucket. Use an AWS Lake Formation blueprint to move the data from the various buckets into the central S3 bucket. On each individual bucket, modify the bucket policy to grant S3 permissions to the Lake Formation service-linked role. Use Lake Formation permissions to add fine-grained access controls for both associate and senior analysts to view specific tables and columns.
Correct Answer: B
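For context, the column-level access that several options refer to is expressed in Lake Formation as a TableWithColumns grant. A minimal cross-account sketch; the account ID, role, database, table, and column names are placeholders:

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Grant a senior-analyst role in another account SELECT on a subset of columns.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier":
               "arn:aws:iam::444455556666:role/SeniorDataAnalyst"},  # placeholder
    Resource={
        "TableWithColumns": {
            "DatabaseName": "marketing_db",  # placeholder database
            "Name": "campaign_results",      # placeholder table
            "ColumnNames": ["campaign_id", "spend", "impressions"],
        }
    },
    Permissions=["SELECT"],
)
```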
New Q23:
A retail company wants to use Amazon QuickSight to generate dashboards for web and in-store sales. A group of 50 business intelligence professionals will develop and use the dashboards. Once ready, the dashboards will be shared with a group of 1,000 users.
The sales data comes from different stores and is uploaded to Amazon S3 every 24 hours. The data is partitioned by year and month and is stored in Apache Parquet format. The company is using the AWS Glue Data Catalog as its main data catalog and Amazon Athena for querying. The total size of the uncompressed data that the dashboards query from at any point is 200 GB.
Which configuration will provide the MOST cost-effective solution that meets these requirements?
A. Load the data into an Amazon Redshift cluster by using the COPY command. Configure 50 author users and 1,000 reader users. Use QuickSight Enterprise edition. Configure an Amazon Redshift data source with a direct query option.
B. Use QuickSight Standard edition. Configure 50 author users and 1,000 reader users. Configure an Athena data source with a direct query option.
C. Use QuickSight Enterprise edition. Configure 50 author users and 1,000 reader users. Configure an Athena data source and import the data into SPICE. Automatically refresh every 24 hours.
D. Use QuickSight Enterprise edition. Configure 1 administrator and 1,000 reader users. Configure an S3 data source and import the data into SPICE. Automatically refresh every 24 hours.
Correct Answer: C
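As a side note on the SPICE refresh in option C, a scheduled refresh can also be triggered programmatically by creating an ingestion, for example from a daily scheduled job. A minimal sketch; the account ID and dataset ID are placeholders:

```python
import uuid
import boto3

quicksight = boto3.client("quicksight")

# Kick off a full SPICE refresh of the Athena-backed dataset so the dashboards
# pick up the Parquet data that landed in S3 in the last 24 hours.
quicksight.create_ingestion(
    AwsAccountId="111122223333",          # placeholder account
    DataSetId="sales-dashboard-dataset",  # placeholder dataset ID
    IngestionId=str(uuid.uuid4()),
    IngestionType="FULL_REFRESH",
)
```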
New Q24:
An online retail company uses Amazon Redshift to store historical sales transactions. The company is required to encrypt data at rest in the clusters to comply with the Payment Card Industry Data Security Standard (PCI DSS). A corporate governance policy mandates the management of encryption keys using an on-premises hardware security module (HSM).
Which solution meets these requirements?
A. Create and manage encryption keys using AWS CloudHSM Classic. Launch an Amazon Redshift cluster in a VPC with the option to use CloudHSM Classic for key management.
B. Create a VPC and establish a VPN connection between the VPC and the on-premises network. Create an HSM connection and client certificate for the on-premises HSM. Launch a cluster in the VPC with the option to use the on-premises HSM to store keys.
C. Create an HSM connection and client certificate for the on-premises HSM. Enable HSM encryption on the existing unencrypted cluster by modifying the cluster. Connect to the VPC where the Amazon Redshift cluster resides from the on-premises network using a VPN.
D. Create a replica of the on-premises HSM in AWS CloudHSM. Launch a cluster in a VPC with the option to use CloudHSM to store keys.
Correct Answer: B
Reference: https://docs.aws.amazon.com/redshift/latest/mgmt/security-key-management.html
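The on-premises HSM setup in answer B corresponds to the Redshift HSM configuration APIs. A hedged sketch; the identifiers, IP address, partition, certificate, and subnet group are placeholders:

```python
import boto3

redshift = boto3.client("redshift")

# Register the client certificate and the on-premises HSM connection details.
redshift.create_hsm_client_certificate(
    HsmClientCertificateIdentifier="onprem-hsm-client-cert")

redshift.create_hsm_configuration(
    HsmConfigurationIdentifier="onprem-hsm",
    Description="On-premises HSM reachable over the VPN",
    HsmIpAddress="10.0.100.5",              # placeholder on-premises address
    HsmPartitionName="redshift-partition",  # placeholder partition
    HsmPartitionPassword="***",
    HsmServerPublicCertificate="-----BEGIN CERTIFICATE-----...",
)

# Launch the cluster in the VPC with HSM-managed encryption keys.
redshift.create_cluster(
    ClusterIdentifier="sales-history",
    NodeType="ra3.4xlarge",
    MasterUsername="admin",
    MasterUserPassword="***",
    ClusterSubnetGroupName="private-subnets",  # placeholder VPC subnet group
    Encrypted=True,
    HsmClientCertificateIdentifier="onprem-hsm-client-cert",
    HsmConfigurationIdentifier="onprem-hsm",
)
```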
New Q25:
A banking company plans to build a data warehouse solution on AWS to run join queries on 20 TB of data. These queries will be complex and analytical. About 10% of the data is from the past 3 months. Data older than 3 months needs to be accessed occasionally to run queries.
Which solution MOST cost-effectively meets these requirements?
A. Use Amazon S3 as the data store and use Amazon Athena for the queries. Use Amazon S3 Glacier Flexible Retrieval for storing data older than 3 months by using S3 lifecycle policies.
B. Use Amazon Redshift to build a data warehouse solution. Create an AWS Lambda function that is orchestrated by AWS Step Functions to run the UNLOAD command on data older than 3 months from the Redshift database to Amazon S3. Use Amazon Redshift Spectrum to query the data in Amazon S3.
C. Use Amazon Redshift to build a data warehouse solution. Use RA3 instances for the Redshift cluster so that data requested for a query is stored in a solid-state drive (SSD) for fast local storage and Amazon S3 for longer-term durable storage.
D. Use Amazon Elastic File System (Amazon EFS) to build a data warehouse solution for data storage. Use Amazon EFS lifecycle management to retire data older than 3 months to the S3 Standard-Infrequent Access (S3 Standard-IA) class. Use Apache Presto on an Amazon EMR cluster to query the data interactively.
Correct Answer: C
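For reference, the RA3 managed-storage approach in option C is simply a node-type choice at cluster creation or resize. A minimal sketch; the identifiers and sizing are placeholders:

```python
import boto3

redshift = boto3.client("redshift")

# RA3 nodes keep hot blocks on local SSD and offload colder data to Redshift
# managed storage on S3, which suits the 3-month hot / older-cold access pattern.
redshift.create_cluster(
    ClusterIdentifier="banking-dw",  # placeholder identifier
    NodeType="ra3.4xlarge",
    NumberOfNodes=4,                 # sized for ~20 TB; adjust as needed
    MasterUsername="admin",
    MasterUserPassword="***",
    DBName="analytics",
)
```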
New Q26:
An online food delivery company wants to optimize its storage costs. The company has been collecting operational data for the last 10 years in a data lake that was built on Amazon S3 by using a Standard storage class. The company does not keep data that is older than 7 years. The data analytics team frequently uses data from the past 6 months for reporting and runs queries on data from the last 2 years about once a month. Data that is more than 2 years old is rarely accessed and is only used for audit purposes.
Which combination of solutions will optimize the company's storage costs? (Choose two.)
A. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 Standard-Infrequent Access (S3 Standard-IA) storage class. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Deep Archive storage class.
B. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 One Zone-Infrequent Access (S3 One Zone-IA) storage class. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Flexible Retrieval storage class.
C. Use the S3 Intelligent-Tiering storage class to store data instead of the S3 Standard storage class.
D. Create an S3 Lifecycle expiration rule to delete data that is older than 7 years.
E. Create an S3 Lifecycle configuration rule to transition data that is older than 7 years to the S3 Glacier Deep Archive storage class.
Correct Answer: CE
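The storage-class changes in answers C and E can be expressed as a single lifecycle rule. A hedged sketch; the bucket name is a placeholder, and the day-0 transition is one way to move existing Standard objects into Intelligent-Tiering (new objects could also be written to that class directly):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-datalake-bucket",  # placeholder bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-then-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [
                    # Let S3 move objects between access tiers automatically.
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"},
                    # Archive audit-only data that is older than ~7 years.
                    {"Days": 2555, "StorageClass": "DEEP_ARCHIVE"},
                ],
            },
        ]
    },
)
```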
New Q27:
A large company has several independent business units. Each business unit is responsible for its own data but needs to share data with other units for collaboration. Each unit stores data in an Amazon S3 data lake created with AWS Lake Formation. To create dashboard reports, the marketing team wants to join its data stored in an Amazon Redshift cluster with the sales team customer table stored in the data lake. The sales team has a large number of tables and schemas, but the marketing team should only have access to the customer table. The solution must be secure and scalable.
Which set of actions meets these requirements?
A. The sales team shares the AWS Glue Data Catalog customer table with the marketing team in read-only mode using the named resource method. The marketing team accepts the data shared using AWS Resource Access Manager (AWS RAM) and creates a resource link to the shared customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.
B. The marketing team creates an S3 cross-account replication between the sales team's S3 bucket as the source and the marketing team's S3 bucket as the destination. The marketing team runs an AWS Glue crawler on the replicated data in its AWS account to create an AWS Glue Data Catalog customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.
C. The marketing team creates an AWS Lambda function in the sales team's account to replicate data between the sales team's S3 bucket as the source and the marketing team's S3 bucket as the destination. The marketing team runs an AWS Glue crawler on the replicated data in its AWS account to create an AWS Glue Data Catalog customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.
D. The sales team shares the AWS Glue Data Catalog customer table with the marketing team in read-only mode using the Lake Formation tag-based access control (LF-TBAC) method. The sales team updates the AWS Glue Data Catalog resource policy to add relevant permissions for the marketing team. The marketing team creates a resource link to the shared customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.
Correct Answer: B
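For context, the resource link mentioned in options A and D is created in the consumer account's Data Catalog and points at the shared table. A minimal sketch; the account IDs and database/table names are placeholders:

```python
import boto3

glue = boto3.client("glue")

# In the marketing account, create a resource link pointing to the sales
# account's shared customer table so it can be queried locally (for example,
# through Redshift Spectrum over an external schema).
glue.create_table(
    DatabaseName="marketing_db",          # local database in the marketing account
    TableInput={
        "Name": "customer_link",
        "TargetTable": {
            "CatalogId": "999988887777",  # placeholder sales account ID
            "DatabaseName": "sales_db",   # shared database name
            "Name": "customer",
        },
    },
)
```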
New Q28:
A data analytics specialist is setting up workload management in manual mode for an Amazon Redshift environment. The data analytics specialist is defining query monitoring rules to manage system performance and user experience of an Amazon Redshift cluster.
Which elements must each query monitoring rule include?
A. A unique rule name, a query runtime condition, and an AWS Lambda function to resubmit any failed queries in off-hours
B. A queue name, a unique rule name, and a predicate-based stop condition
C. A unique rule name, one to three predicates, and an action
D. A workload name, a unique rule name, and a query runtime-based condition
Correct Answer: C
Reference: https://docs.aws.amazon.com/redshift/latest/dg/cm-c-wlm-query-monitoring-rules.html
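The rule structure from the reference, a unique name, one to three predicates, and an action, appears inside the cluster's WLM JSON configuration. A hedged sketch applied through a parameter group; the queue layout, metric thresholds, and names are placeholders:

```python
import json
import boto3

redshift = boto3.client("redshift")

wlm_config = [
    {
        "user_group": ["analysts"],
        "query_concurrency": 5,
        "rules": [
            {
                "rule_name": "abort_long_scans",  # unique rule name
                "predicate": [                    # one to three predicates
                    {"metric_name": "query_execution_time", "operator": ">", "value": 120},
                    {"metric_name": "scan_row_count", "operator": ">", "value": 1000000000},
                ],
                "action": "abort",                # log | hop | abort
            }
        ],
    },
    {"query_concurrency": 5},  # default queue
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="custom-wlm",  # placeholder parameter group
    Parameters=[{"ParameterName": "wlm_json_configuration",
                 "ParameterValue": json.dumps(wlm_config)}],
)
```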
…
Download the latest leads4pass DAS-C01 dumps with PDF and VCE: https://www.leads4pass.com/das-c01.html (285 Q&A)
Read DAS-C01 exam questions (Q1-Q13): https://awsexamdumps.com/das-c01-exam-dumps-v16-02-2022-for-passing-aws-certified-data-analytics-specialty/