Apache Spark Training: Introduction to Apache Spark

Course #: TP2455

The Introduction to Apache Spark training course teaches developers the fundamental Spark APIs. This four-day course uses lectures and hands-on exercises to get developers up to speed using Spark for data exploration and analysis.

The course explores the core concepts of Apache Spark development. Lectures and hands-on labs introduce students to the Hadoop Distributed File System (HDFS), Resilient Distributed Datasets (RDDs), parallel programming, application programming techniques, and common Spark algorithms.


Duration: 4 days

Outline of Apache Spark Training: Introduction to Apache Spark

1. Spark Basics

  • What is Apache Spark?
  • Using the Spark Shell
  • Resilient Distributed Datasets (RDDs)
  • Functional Programming with Spark

2. The Hadoop Distributed File System

  • Why HDFS?
  • HDFS Architecture
  • Using HDFS

3. Spark and Hadoop

  • Spark and the Hadoop Ecosystem
  • Spark and MapReduce

4. RDDs

  • RDD Operations
  • Key-Value Pairs

5. Pair RDDs

  • MapReduce and Pair RDD Operations

6. Running Spark on a Cluster

  • Standalone Cluster
  • The Spark Standalone Web UI

7. Parallel Programming with Spark

  • RDD Partitions and HDFS Data Locality
  • Working With Partitions
  • Executing Parallel Operations

8. Caching and Persistence

  • Distributed Persistence
  • Caching

9. Writing Spark Applications

  • SparkContext
  • Spark Properties
  • Building and Running a Spark Application
  • Logging

10. Spark Streaming

  • Streaming Overview
  • Sliding Window Operations
  • Spark Streaming Applications

11. Common Spark Algorithms

  • Iterative Algorithms
  • Graph Analysis
  • Machine Learning

12. Improving Spark Performance

  • Shared Variables: Broadcast Variables
  • Shared Variables: Accumulators
  • Common Performance Issues

We regularly offer classes in these and other cities: Atlanta, Austin, Baltimore, Calgary, Chicago, Cleveland, Dallas, Denver, Detroit, Houston, Jacksonville, Miami, Montreal, New York City, Orlando, Ottawa, Philadelphia, Phoenix, Pittsburgh, Seattle, Toronto, Vancouver, and Washington DC.