04/24/2023 - 04/24/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $730.00
05/15/2023 - 05/15/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $730.00
06/19/2023 - 06/19/2023
10:00 AM - 06:00 PM
Online Virtual Class
USD $730.00

Topics

  • Defining Big Data
  • Big Data Stores Overview
  • NoSQL
  • Big Data Business Intelligence and Analytics
  • Real World Case Studies
  • Adopting NoSQL

Audience

General audience including business and technology team leadership

Prerequisites

Basic programming skills, some knowledge of SQL

Duration

1 day

Outline for Introduction to Big Data and NoSQL Training

Chapter 1. Introduction to NoSQL Systems

  • Gartner's Definition of Big Data
  • The V³ Properties
  • Limitations of Relational Databases
  • Limitations of Relational Databases (Cont'd)
  • What are NoSQL (Not Only SQL) Databases?
  • What are NoSQL Databases?
  • The Past and Present of the NoSQL World
  • NoSQL Database Properties
  • NoSQL Benefits
  • Use Cases for NoSQL Database Systems
  • NoSQL Database Storage Types
  • The CAP Theorem
  • Mechanisms to Guarantee a Single CAP Property
  • NoSQL Systems CAP Triangle
  • Limitations of NoSQL Databases
  • Mix-and-Match Approach
  • Big Data Sharding
  • Sharding Example
  • Google BigTable
  • BigTable-based Applications
  • BigTable Design
  • Barriers to Adoption
  • Dismantling Barriers to Adoption
  • Industry Trends
  • NoSQL Technology Adoption Action Plan
  • Quiz
  • Quiz Answers
  • Summary

Chapter 2. Introduction to Hadoop

  • The Client-Server Processing Pattern
  • Apache Hadoop
  • Apache Hadoop Logo
  • Typical Hadoop Applications
  • Hadoop Clusters
  • Hadoop Distributions
  • Hadoop's Main Components
  • Hadoop Distributed File System (HDFS)
  • HDFS Considerations
  • Data Blocks
  • HDFS NameNode Directory Diagram
  • HDFS Balancing
  • Accessing HDFS
  • Examples of HDFS Commands
  • Other Supported File Systems
  • YARN
  • Hadoop-based Systems for Data Analysis
  • MapReduce
  • Similarity with SQL Aggregation Operations
  • MapReduce Word Count Example
  • Distributed Computing Economics
  • Discussion: Divide and Conquer
  • Apache Pig
  • Pig Latin
  • Running Pig
  • Pig Latin Script Example
  • What is Hive?
  • Hive's Value Proposition
  • Who uses Hive?
  • What Hive Does Not Have
  • HiveQL
  • Working with Hive Tables
  • Summary

Chapter 3. Apache HBase

  • What is HBase?
  • HBase Design
  • HBase Master (HMaster)
  • Sparse Data Sets
  • Regions and Region Servers
  • HBase Features
  • HBase High Availability
  • The Write-Ahead Log (WAL) and MemStore
  • HBase vs RDBMS
  • HBase vs RDBMS (Cont'd)
  • Interfacing with HBase
  • HBase Thrift and REST Gateway
  • HBase Table Design
  • Column Families
  • A Cell's Value Versioning
  • Timestamps
  • Accessing Cells
  • HBase Table Design Digest
  • The Conceptual View of an HBase Table
  • HBase Compaction
  • Loading Data in HBase
  • Column Families Notes
  • Cardinality of Column Families
  • Hotspotting
  • Rowkey Design Notes
  • Security
  • HBase Shell
  • HBase Shell Command Groups
  • Creating and Populating a Table Using HBase Shell
  • Getting a Cell's Value
  • Counting Rows in an HBase Table
  • HBase Java Client
  • HBase Scanners
  • The Scan Class
  • The KeyValue Class
  • The Result Class
  • Getting Versions of Cell Values Example
  • The Cell Interface
  • HBase Java Client Example
  • Scanning the Table Rows
  • Dropping a Table
  • The Bytes Utility Class
  • Table Schema Main Rules to Follow
  • Good Use Cases for HBase
  • Not Good Use Cases for HBase
  • Business Continuity Caveats
  • Summary

Chapter 4. Apache Cassandra

  • What is Apache Cassandra?
  • Main Features
  • Peer-to-Peer (No Master)
  • Wide Column Store NoSQL Databases
  • Cassandra Model vs Relational Model
  • Column Families
  • Columns
  • Simplified Data Model
  • Data Model
  • The CAP Placement
  • CQL
  • CQL Simple Examples
  • The Update Statement
  • Update Caveats
  • Update Statement with TTL and TIMESTAMP Examples
  • Collections
  • Example of Using a Set Collection
  • Using the List Collection
  • Data Replication
  • Visualizing Data Replication
  • The Write Path
  • Sequential Data Storage Engine
  • Java Client Code Example
  • Data Distribution
  • Native Aggregate Functions
  • Creating UDFs
  • HBase vs Apache Cassandra
  • Cassandra vs MongoDB
  • Security
  • WAN-Wide High Availability
  • Summary

Lab Exercises

Lab 1. Learning the Lab Environment
Lab 2. The Hadoop Distributed File System
Lab 3. Using HBase Shell
Lab 4. Comparing NoSQL Systems