In this data science and data engineering course, you will investigate how your organization can take advantage of the latest data trends and opportunities from both technical and management perspectives.

Topics

• The Rise of Data

• Tools and Scripting

• Data Management and Technical Tools

• Future Trends

Audience

Business Analysts, Developers, IT Architects, and Technical Managers

Prerequisites

Basic awareness of computing and Internet concepts, and an interest in extracting insights from data.

Duration

1 day

 

Outline for Data Science and Data Engineering in 2022 Training

Chapter 1. The Rise of Data

  • Data Collection
  • Different Types of Data
  • Gartner's Definition of Big Data
  • Veracity
  • A Practical Definition of Big Data
  • Challenges Posed by Big Data
  • Enter Distributed Computing
  • What is Data Engineering?
  • The Team Players
  • The Data Scientist
  • The Data Engineer
  • Data-Related Roles
  • Knowledge Requirements
  • The Overlapping Data Fields
  • The Data Engineer (DE) Role
  • The Data Scientist (DS) Role
  • DE/DS Core Skills and Competencies
  • An Example of a Data Product
  • What is Data Wrangling (Munging)?
  • The Data Exchange Interoperability Options
  • APIs
  • Summary

Chapter 2. Tools and Scripting

  • Data Storage
  • Data Storage Technologies
  • Apache Hadoop
  • Hadoop Ecosystem Projects
  • Storing Raw Data in HDFS and Schema-on-Demand
  • Apache Spark
  • Storing Data in Spark
  • Apache Cassandra
  • MongoDB
  • Talend Data Fabric
  • Databricks
  • Amazon SageMaker
  • Snowflake
  • Qubole
  • Apache Kafka
  • Apache Airflow
  • Apache Storm
  • TensorFlow
  • Terraform by HashiCorp
  • PySpark
  • AWS Cloud Development Kit (CDK)
  • Summary

Chapter 3. Data Management and Technical Tools

  • DAMA International
  • Data Governance
  • Data Governance Strategy
  • Data Governance Framework
  • Master Data
  • Data Governance Models
  • Data Governance Tools
  • Comparison Table for Data Governance Tools
  • Data Quality
  • The 6C Data Quality Framework
  • Data Integrity
  • Disparate Databases Integration
  • Business Intelligence
  • The Five Key Stages of Business Intelligence
  • The Five Key Stages of Business Intelligence (cont.)
  • Tasks Where BI is Used
  • Analytics Software
  • Analytics Software (cont.)
  • Four Types of Data Analysis
  • Four Types of Data Analysis (cont.)
  • What is DataOps?
  • Summary

Chapter 4. Future Trends

  • Big Trends in 2021
  • MLOps
  • DataOps
  • Computer Vision (CV)
  • Natural Language Processing (NLP)
  • Trends in 2022
  • DataOps 2.0
  • Data Fabric
  • Cloud-Native Platforms
  • Hybrid Forms of Automation
  • AI-as-a-Service Platforms
  • Augmented Data Management
  • Business Intelligence for Performance
  • Data Mesh
  • AI Engineering
  • Machine Learning Services
  • NLP for Smaller Languages
  • Fairness and Privacy as a Megatrend
  • Computer Vision - 3D
  • MLOps in 2022
  • Summary