WA2922
Data Engineering for Managers Training
Course Details
Duration
0.5 days
Course Outline
- Defining Data Engineering
- Data is King
- What is Data Engineering?
- The Data-Related Roles
- The Data Engineer Role
- Core Skills and Competencies
- What is Data Wrangling (Munging)?
- Typical Data Processing Pipeline
- Data Discovery Phase
- Data Harvesting Phase
- Data Priming Phase
- Exploratory Data Analysis
- Model Planning Phase
- Model Building Phase
- Communicating the Results
- Production Roll-out
- Data Logistics and Data Governance
- Data Processing Workflow Engines
- Data Lineage and Provenance
- The Traditional Client–Server Processing Pattern
- Enter Distributed Computing
- Data Physics
- Data Locality (Distributed Computing Economics)
- The CAP Theorem
- Mechanisms to Guarantee a Single CAP Property
- Eventual Consistency
- What is Apache Spark?
- The Spark Platform
- Languages Supported by Spark
- Running Spark on a Cluster
- The Resilient Distributed Dataset (RDD)
- The Lineage Concept
- Datasets and DataFrames
- Data Partitioning
- Data Partitioning Diagram
- Python's Value
- Python on AWS
- What is Serverless Computing?
- How Functions Work
- What is AWS Glue?