In this data science and data engineering course, you will investigate how your organization can take advantage of the latest trends in data from both technical and management perspectives.
Topics
• The Rise of Data
• Tools and Scripting
• Data Management and Technical Tools
• Future Trends
Audience
Business Analysts, Developers, IT Architects, and Technical Managers
Prerequisites
Basic awareness of computing and Internet concepts, and an interest in extracting insights from data.
Duration
1 day
Outline for Data Science and Data Engineering in 2022 Training
Chapter 1. The Rise of Data
- Data Collection
- Different Types of Data
- Gartner's Definition of Big Data
- Veracity
- A Practical Definition of Big Data
- Challenges Posed by Big Data
- Enter Distributed Computing
- What is Data Engineering?
- The Team Players
- The Data Scientist
- The Data Engineer
- Data-Related Roles
- Knowledge Requirements
- The Overlapping Data Fields
- The Data Engineer (DE) Role
- The Data Scientist (DS) Role
- DE/DS Core Skills and Competencies
- An Example of a Data Product
- What is Data Wrangling (Munging)?
- The Data Exchange Interoperability Options
- APIs
- Summary
Chapter 2. Tools and Scripting
- Data Storage
- Data Storage Technologies
- Apache Hadoop
- Hadoop Ecosystem Projects
- Storing Raw Data in HDFS and Schema-on-Demand
- Apache Spark
- Storing Data in Spark
- Apache Cassandra
- MongoDB
- Talend Data Fabric
- Databricks
- Amazon SageMaker
- Snowflake
- Qubole
- Apache Kafka
- Apache Airflow
- Apache Storm
- TensorFlow
- Terraform by HashiCorp
- PySpark
- AWS Cloud Development Kit (CDK)
- Summary
Chapter 3. Data Management and Technical Tools
- DAMA International
- Data Governance
- Data Governance Strategy
- Data Governance Framework
- Master Data
- Data Governance Models
- Data Governance Tools
- Comparison Table for Data Governance Tools
- Data Quality
- The 6C Data Quality Framework
- Data Integrity
- Disparate Databases Integration
- Business Intelligence
- The Five Key Stages of Business Intelligence
- The Five Key Stages of Business Intelligence (cont.)
- Tasks Where BI is Used
- Analytics Software
- Analytics Software (cont.)
- Four Types of Data Analysis
- Four Types of Data Analysis (cont.)
- What is DataOps?
- Summary
Chapter 4. Future Trends
- Big Trends in 2021
- MLOps
- DataOps
- Computer Vision (CV)
- Natural Language Processing (NLP)
- Trends in 2022
- DataOps 2.0
- Data Fabric
- Cloud-Native Platforms
- Hybrid Forms of Automation
- AI-as-a-Service Platforms
- Augmented Data Management
- Business Intelligence for Performance
- Data Mesh
- AI Engineering
- Machine Learning Services
- NLP for Smaller Languages
- Fairness and Privacy as a Mega-Trend
- Computer Vision - 3D
- MLOps in 2022
- Summary