Duration: 3 Days
Overview
This AWS machine learning course, developed by Web Age Solutions, covers the topics that appear on the AWS Machine Learning certification exam but are not covered in the official Machine Learning Pipeline on AWS course. It goes beyond the official AWS curriculum to give you the knowledge and skills needed for success on the certification exam.
If you are pursuing AWS ML certification, see our AWS ML Certification Roadmap for the training and support you need to prepare for the exam and become certified.
Objectives
By the end of this course, students should be proficient in the following areas:
- AWS Glue and Glue PySpark extensions
- Repairing and normalizing data
- Data Visualization
- Machine Learning Fundamentals
- Machine Learning at Scale
- Working in AWS SageMaker
- AutoML
- Production and Deployment
Audience
This course is best suited for the following student profiles:
- Data Scientists
- Data Engineers
- Software Developers
- Solution Architects
Prerequisites
Students should arrive at class with a working understanding of the following topics:
- Basic knowledge of Python
- Basic understanding of AWS Cloud infrastructure (Amazon S3 and Amazon CloudWatch)
- Basic experience working in a Jupyter Notebook environment
Outline for AWS ML Mastery: Bridging the Gap Training
Introduction to AWS Glue
- What is AWS Glue?
- AWS Glue Components
- Managing Notebooks
- Putting it Together: The AWS Glue Environment Architecture
- AWS Glue Main Activities
- Additional Glue Services
- When To Use AWS Glue?
- Integration with other AWS Services
- Summary
AWS Glue PySpark Extensions
- AWS Glue and Spark
- The DynamicFrame Object
- The DynamicFrame API
- The GlueContext Object
- Glue Transforms
- A Sample Glue PySpark Script
- Using PySpark
- AWS Glue PySpark SDK
- Summary
Repairing and Normalizing Data
- Repairing and Normalizing Data
- Dealing with Missing Data
- Sample Data Set
- Getting Info on Null Data
- Dropping a Column
- Interpolating Missing Data in pandas
- Replacing the Missing Values with the Mean Value
- Scaling (Normalizing) the Data
- Data Preprocessing with scikit-learn
- Scaling with the scale() Function
- The MinMaxScaler Object
- Summary
Data Visualization in Python
- Data Visualization
- Data Visualization in Python
- Matplotlib
- Getting Started with matplotlib
- The matplotlib.pyplot.plot() Function
- The matplotlib.pyplot.bar() Function
- The matplotlib.pyplot.pie() Function
- Subplots
- Using the matplotlib.gridspec.GridSpec Object
- The matplotlib.pyplot.subplot() Function
- Figures
- Saving Figures to a File
- Seaborn
- Getting Started with seaborn
- Histograms and KDE
- Plotting Bivariate Distributions
- Scatter Plots in seaborn
- Pair Plots in seaborn
- Heatmaps
- ggplot
- Summary
ML Fundamentals and Review
- Exploratory Data Analysis (EDA)
- Machine Learning Fundamentals
- Building ML Models for Inference
- Machine Learning Interpretability
Machine Learning at Scale - Hyperparameter Tuning and Model Selection at Scale
- Hyperparameter Tuning at Scale
- Hyperparameter Tuning Challenges
- Distributed Hyperparameter Tuning
- Bayesian Optimization
- Spark Based Tools
- TensorFlowOnSpark
- Advantages of TensorFlowOnSpark
- BigDL
- Advantages of BigDL
- Horovod
- Advantages of Horovod
- H2O Sparkling Water
- Advantages of Sparkling Water over H2O
Working in AWS SageMaker
- Introduction to SageMaker
- Setting up a SageMaker Environment
- Training and Evaluating ML Models Using SageMaker's Built-in Algorithms
Automated Machine Learning
- Overview of AutoML Tools and Techniques
- Automated Machine Learning with auto-sklearn, H2O, and AutoKeras
- AWS AutoML
Production and Deployment
- Deploying to an Environment
- Deploying a Model
- Maintaining Models
- DEMO
Labs
Lab 1. AWS Glue Overview
Lab 2. AWS Glue Crawlers and Classifiers
Lab 3. Creating an S3 Bucket for AWS Glue ETL Script Output
Lab 4. Creating and Working with Glue Scripts
Lab 5. Repairing and Normalizing Data
Lab 6. Data Visualization in Python
Lab 7. Data Visualization in Python Project
Lab 8. Data Visualization and EDA in PySpark (Project)
Lab 9. Jupyter Notebook: Section 1.1. Exploratory Data Analysis (EDA)
Lab 10. Jupyter Notebook: Section 1.4. Machine Learning Interpretability
Lab 11. Jupyter Notebook: Section 2.1. Introduction to SageMaker
Lab 12. Jupyter Notebook: Section 2.2. Setting up a SageMaker Environment