Audience
Developers, Architects, Process Automation and Data Management Practitioners
Prerequisites
Participants should be familiar with Python syntax (or have a background in programming)
Duration
Two days
Outline for Workflow Management with Apache Airflow Training
Chapter 1. Apache Airflow Introduction
- A Traditional ETL Approach
- Apache Airflow Defined
- Airflow Core Components
- The Component Collaboration Diagram
- Workflow Building Blocks and Concepts
- Airflow CLI
- Main Configuration File
- Extending Airflow
- Jinja Templates
- Variables and Macros
- Summary
Chapter 2. Apache Airflow Web UI
- Web UI - the Landing (DAGs) Page
- Web UI - the DAG Graph View
- Run Status Legends
- The Pause Button (Trigger Latch)
- The DAG Triggering/Job Checking Sequence
- The Control Panel for a Task
- Sample Log File Messages (Abridged for Space)
- Summary
Chapter 3. Anatomy of a DAG and Scheduling
- What is a DAG?
- Scheduled and Manually Triggered DAG Runs
- The DAG Object
- Tasks
- Task Lifecycle
- Operators
- Idempotent Operators
- Operator Types
- Airflow Common Operators
- Specifying Dependencies
- Associating Operators with a DAG
- Associating Operators Using the "with DAG" Statement: An Example
- Associating Operators with a DAG Using the Operator's Constructor
- The default_args Parameter
- Passing DAG Parameters Through Web UI
- DAG Run Scheduling
- Examples of the schedule_interval Parameter
- DAG Scheduling Nuances
- Understanding the Backfill Process
- Killing/Stopping DAG Runs
- An XCom Messaging Example
- Summary
Lab Exercises
Lab 1 - Learning the Lab Environment
Lab 2 - Learning the Airflow Working Environment
Lab 3 - Airflow DAG – the First Cut
Lab 4 - Scheduling Jobs
Lab 5 - Backfilling
Lab 6 - Passing Parameters
Lab 7 - XCom Messaging
Lab 8 - Task Branching
Lab 9 - Understanding Retries
Lab 10 - Using SimpleHttpOperator
Lab 11 - The Task Branching Project
Lab 12 - Backfilling Project