Anyone who either has obtained the Azure Data Engineer Associate Certificate or is working towards it. 


DP-900. Having equivalent knowledge of DP-300 and DP-050 is very helpful.


Five Days

Outline for Implementing Data Engineering on Azure Training

Chapter 1. Understanding Data Engineering

Overview Of Traditional Database Engineer And BI Developer Roles

When Relational Databases And SQL Aren’t Enough

Transitioning From The Traditional Database Engineer Role To The Data Engineer Role

The Modern Data Sources (Relational, Non-Relational, Real-Time & Streaming)

Chapter 2. Azure Data Lake Overview

What Is Azure Data Lake

Importance Of Data Lake

Gen1 Vs. Gen2

Hierarchical Namespace In Gen2

Chapter 3. Using U-SQL

What Is The U-SQL Language

Write, Run, And Manage Analytics Jobs

Extend U-SQL Using Python

Chapter 4. Monitoring And Optimizing U-SQL Jobs

Schedule U-SQL Jobs

Manage U-SQL Jobs

Troubleshoot U-SQL Jobs

Performance Optimization

Chapter 5. Cosmos DB

MOC 20777 content (there are 6 modules in the course)

Chapter 6. Introduction To Big Data Formats

Why Different Formats Emerged

The Evolution Of Data Formats

Use Cases For Different Formats

Understanding Avro

Understanding Parquet

Understanding Optimized Row Columnar (ORC)

Challenges Involved In Converting Formats

Chapter 7. Azure Data Factory Overview

What Is Azure Data Factory

Understanding Automated Data Pipelines

Understanding Data Sets

Understanding Activities

Chapter 8. Developing Azure Data Factory Pipelines

Understanding Options For Developing Data Factory Pipelines

Setup Source

Setup Sink

Setup Mappings

Validate, Publish, And Test

Developing Data Factory Pipelines Using Python

Chapter 9. Managing Azure Data Factory Jobs

Scheduling Data Factory Jobs

Executing Data Factory Jobs

Monitoring Data Factory Jobs

Understanding Tumbling Windows

Understanding Concurrency

Understanding Dependency

Understanding Troubleshooting

Chapter 10. Getting Started on Microsoft HDInsight

Introduction to Hadoop

Working with MapReduce Function

Introduction to HDInsight

Understanding HDInsight Cluster Types

Deploying HDInsight Clusters

Chapter 11. Applying Data Engineering on Microsoft HDInsight

Understanding Various Data Loading Tools

Loading Data into HDInsight

Understanding Apache Hive Solutions

HDInsight Data Queries using Hive and Pig

Chapter 12. Putting It Together

Combining Azure Data Factory With Azure HDInsight

Combining Azure Data Factory With Azure Machine Learning

Transforming And Processing Raw Data Into Predictions And Insights

Chapter 13. Implementing Streaming Solutions with Kafka and HBase

Introduction to Kafka and HBase

Deploying a Kafka Cluster

Publishing, Consuming, and Processing Data

Storing Data to HBase

Querying Data in HBase

Chapter 14. Introduction to Streaming Data using Apache Spark

Exploring Sources and Sinks

Understanding Streaming Data Frames

Understanding Window Operations on Frames

Introduction to Streaming Joins

Monitoring Streaming Queries

Chapter 15. Implementing Streaming Solutions with Databricks

Introduction to Structured Streaming on Azure Databricks

Setting up Azure Databricks

Configuring Source and Sink

Building Streaming Pipeline on Azure Databricks

Working with Timestamps and Windows

Understanding Stateful Operations

Handling Multiple Streams and Datasets

Optimizing Streaming Pipeline for Production Use

Chapter 16. Implementing Real-time Processing Solutions with Apache Storm

How to Persist Long-term Data

Streaming Data with Apache Storm

Understanding Apache Storm Topologies

Creating Apache Storm Topologies

Configuring Apache Storm