Objectives
• Understand how to implement data warehouses using the AWS Lake Formation service
• Use S3 through the AWS Management Console
• Understand the architecture of the Snowflake data platform
• Use the Snowflake Web UI (a.k.a. Web Portal, Snowflake Manager, and Snowflake Console)
• Create databases, tables, and warehouses in the Snowflake Web UI
• Understand how to use Amazon QuickSight to build visualizations, perform ad hoc analysis, and gain business insights
• Explore the main capabilities of AWS Glue
• Create a Glue crawler that uses a custom classifier to infer the schemas of a collection of CSV files
• Create and run an AWS Glue ETL job
Audience
Data and business analysts, IT architects, developers, and technical managers
Prerequisites
Participants are expected to have a general knowledge of programming
Duration
2 days
Outline for Data Analytics on AWS Training
Chapter 1 - The AWS Lake Formation Service
First, What is a Data Lake?
Data Lakes vs Traditional Data Warehouses
Characteristics of Data Warehouses and Data Lakes
Now, What is AWS Lake Formation?
What are the Benefits of Using Lake Formation?
How Lake Formation Works
The Lake Formation Dashboard
AWS Lake Formation Pricing
Chapter 2 - AWS Simple Storage Service
What is AWS Simple Storage Service (S3)?
AWS Storage
Regions
S3 Regions
Getting Started with S3
Using BitTorrent
More on Buckets
Bucket Configurable Properties
Advanced S3 Bucket Properties
The Bucket Creation Dialog in the Management Console
Bucket Permissions
Bucket-level Operations
Authorization of REST Requests
Adding Cross-Origin Resource Sharing Configuration
Event Notifications
The Requester Pays Option
The Object Key
Object Versioning
Example of Object Properties
Object Storage Class Levels
Object-level Operations
Object Lifecycle Configuration
Amazon S3 Data Consistency Model
Observable Data Consistency Behaviors
Eventually Consistent Reads vs Consistent Reads
Amazon S3 Security
S3 Use Case: Backup and Archiving
Another S3 Use Case: Static Web Hosting
More on Static Web Hosting
S3 Static Website Hosting Dialog in Management Console
S3 Use Case: Disaster Recovery
AWS S3 Pricing
Storage Pricing
Request Pricing
Data Transfer Pricing
Amazon S3 Transfer Acceleration
How to Enable Transfer Acceleration
Enabling Transfer Acceleration in the Management Console
Amazon S3 SLA Definitions
Amazon S3 SLA Service Commitment
S3 CLI
Summary
Chapter 3 - Redshift
Overview
Terms
Data Warehouse
Traditional Extract Transform Load (ETL)
Data Lake
Database
Redshift Features
High Performance
Simple Management
Cost-effective
Elasticity
Query Your Data Lake
Security
Partners Offer Certified Solutions
Data Warehouse Challenges
Redshift Spectrum
As Is/Where Is Data
Scalability with Redshift Spectrum
Combine the Data Warehouse and the Data Lake
SQL Queries Across S3 and Redshift
Redshift Query
Summary
Chapter 4 - Redshift Optimizations
Queues
Superuser Queue
Default Queues
User Defined Queues
Workload Management (WLM)
Concurrency Level
User Groups
Query Groups
Memory %
Timeout
Sort Keys
Compound Sort Keys
Interleaved Sort Keys
Cleaning Up Data
Summary
Chapter 5 - Visualization and Reporting
Amazon QuickSight
SPICE
Data Analyses
Visuals
Sheets
Stories
Dashboards
Typical Amazon QuickSight Workflow
Create a Data Set
Create an Analysis
Create a Visual Manually
Amazon Athena
Amazon Athena and AWS Data Catalog
Query Data Using Amazon Athena
Create a Report Using Tableau
Create a Report Using Tableau (Cont'd)
Summary
Chapter 6 - Introduction to AWS Glue
What is AWS Glue?
AWS Glue Components
AWS Glue Components (Cont'd)
AWS Glue Components (Cont'd)
Managing Notebooks
AWS Glue Components (Cont'd)
Putting it Together: The AWS Glue Environment Architecture
AWS Glue Main Activities
Additional Glue Services
When To Use AWS Glue?
Integration with Other AWS Services
Summary
Chapter 7 - AWS Glue PySpark Extensions
AWS Glue and Spark
The DynamicFrame Object
The DynamicFrame API
The GlueContext Object
Glue Transforms
A Sample Glue PySpark Script
Using PySpark
AWS Glue PySpark SDK
Summary