04/17/2023 - 04/17/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $680.00
05/08/2023 - 05/08/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $680.00
06/12/2023 - 06/12/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $680.00

Duration

1/2 Day

Outline for Data Engineering for Managers Training

Chapter 1. Defining Data Engineering

  • Data is King
  • What is Data Engineering?
  • The Data-Related Roles
  • The Data Engineer Role
  • Core Skills and Competencies
  • What is Data Wrangling (Munging)?
  • Typical Data Processing Pipeline
  • Data Discovery Phase
  • Data Harvesting Phase
  • Data Priming Phase
  • Exploratory Data Analysis
  • Model Planning Phase
  • Model Building Phase
  • Communicating the Results
  • Production Roll-out
  • Data Logistics and Data Governance
  • Data Processing Workflow Engines
  • Data Lineage and Provenance
  • The Traditional Client–Server Processing Pattern
  • Enter Distributed Computing
  • Data Physics
  • Data Locality (Distributed Computing Economics)
  • The CAP Theorem
  • Mechanisms to Guarantee a Single CAP Property
  • Eventual Consistency
  • What is Apache Spark?
  • The Spark Platform
  • Languages Supported by Spark
  • Running Spark on a Cluster
  • The Resilient Distributed Dataset (RDD)
  • The Lineage Concept
  • Datasets and DataFrames
  • Data Partitioning
  • Data Partitioning Diagram
  • Python's Value
  • Python on AWS
  • What is Serverless Computing?
  • How Functions Work
  • What is AWS Glue?
  • Summary