In modern applications, real-time information is continuously generated by applications (publishers/producers) and routed to other applications (subscribers/consumers). Apache Kafka is an open-source, distributed publish-subscribe messaging system. Kafka delivers high throughput and is built to scale out across multiple servers. Kafka persists messages on disk, so the same data can serve both batch consumption and real-time applications.
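The publish-subscribe model described above can be sketched with a toy, in-memory log. This is an illustrative model only, not Kafka's client API: the `ToyTopic` class, its `publish` and `poll` methods, and the group names are all hypothetical, and a real Kafka topic is partitioned, replicated, and stored on disk. The sketch shows one idea: because messages are retained and each consumer group tracks its own read position, a real-time reader and a batch reader can consume the same data independently.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a single-partition topic (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._log = []                    # append-only list of retained messages
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, group, max_messages=None):
        """Return unread messages for `group` and advance that group's offset."""
        start = self._offsets[group]
        end = len(self._log)
        if max_messages is not None:
            end = min(end, start + max_messages)
        self._offsets[group] = end
        return self._log[start:end]


topic = ToyTopic()
for reading in ("t=20.1", "t=20.4", "t=20.9"):
    topic.publish(reading)

# A "real-time" consumer group reads one message at a time.
realtime = topic.poll("dashboard", max_messages=1)

# A "batch" consumer group reads everything retained so far,
# unaffected by what the other group has already consumed.
batch = topic.poll("nightly-report")
```

Reading does not remove messages from the log, which is the property that lets batch and real-time consumers coexist on one topic.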
Objectives
- Understand the use of Kafka for high-performance messaging
- Identify the uses of Kafka in microservices
- Explain the benefits of Kafka patterns
- Differentiate between messaging and message brokers
- Describe Kafka messaging environments
- Develop producers and consumers for Kafka
- Recognize how Kafka enables cloud-native applications
- Summarize the characteristics and architecture of Kafka
- Demonstrate how to process messages with Kafka
- Design distributed, high-throughput systems based on Kafka
- Describe the built-in partitioning, replication, and inherent fault tolerance of Kafka
Topics
- Introduction to Kafka
- Using Apache Kafka
- Building Data Pipelines
- Integrating Kafka with Other Systems
- Kafka and Schema Management
- Kafka Streams and KSQL
- KSQL UDF and Deployment
Audience
This introductory course is intended for developers, architects, system integrators, security administrators, network administrators, software engineers, technical support staff, technology leaders and managers, and consultants who are responsible for messaging in data collection, transformation, and integration within their organization, in support of Application Modernization, Cloud-Native Development, and the Digital Data Supply Chain (Big Data/IoT/AI/Machine Learning/Advanced Analytics/Business Intelligence).
Prerequisites
A basic understanding of messaging, cloud, development, architecture, and virtualization is beneficial.
Duration
2 days
Outline for Kafka for Application Developers Training
Chapter 1 - Introduction to Kafka
- Messaging Architectures – What is Messaging?
- Messaging Architectures – Steps to Messaging
- Messaging Architectures – Messaging Models
- What is Kafka?
- What is Kafka? (Contd.)
- Kafka Overview
- Kafka Overview (Contd.)
- Need for Kafka
- When to Use Kafka?
- Kafka Architecture
- Core Concepts in Kafka
- Kafka Topic
- Kafka Partitions
- Kafka Producer
- Kafka Consumer
- Kafka Broker
- Kafka Cluster
- Why Kafka Cluster?
- Sample Multi-Broker Cluster
- Overview of ZooKeeper
- Kafka Cluster & ZooKeeper
- Who Uses Kafka?
- Summary
Chapter 2 - Using Apache Kafka
- Installing Apache Kafka
- Configuration Files
- Starting Kafka
- Using Kafka Command Line Client Tools
- Setting up a Multi-Broker Cluster
- Using Multi-Broker Cluster
- Kafka Cluster Planning
- Kafka Cluster Planning – Producer/Consumer Throughput
- Kafka Cluster Planning – Number of Brokers (and ZooKeepers)
- Kafka Cluster Planning – Sizing for Topics and Partitions
- Kafka Cluster Planning – Sizing for Storage
- Kafka Connect
- Kafka Connect – Configuration Files
- Using Kafka Connect to Import/Export Data
- Creating a Spring Boot Producer
- Adding Kafka dependency to pom.xml
- Defining a Spring Boot Service to Send Message(s)
- Defining a Spring Boot Controller
- Testing the Spring Boot Producer
- Creating a Node.js Consumer
- Summary
Chapter 3 - Building Data Pipelines
- Building Data Pipelines
- What to Consider When Building Data Pipelines
- Timeliness
- Reliability
- High and Varying Throughput
- High and Varying Throughput (Contd.)
- Data Formats
- Data Formats (Contd.)
- Transformations
- Transformations – ELT
- Security
- Failure Handling
- Agility and Coupling
- Ad-hoc Pipelines
- Metadata Loss
- Extreme Processing
- Kafka Connect vs. Producer and Consumer
- Kafka Connect vs. Producer and Consumer (Contd.)
- Summary
Chapter 4 - Integrating Kafka with Other Systems
- Introduction to Kafka Integration
- Kafka Connect
- Kafka Connect (Contd.)
- Running Kafka Connect Operating Modes
- Key Configurations for Connect Workers
- Kafka Connect API
- Kafka Connect Example – File Source
- Kafka Connect Example – File Sink
- Kafka Connector Example – MySQL to Elasticsearch
- Kafka Connector Example – MySQL to Elasticsearch (Contd.)
- Write the data to Elasticsearch
- Building Custom Connectors
- Kafka Connect – Connectors
- Kafka Connect – Tasks
- Kafka Connect – Workers
- Kafka Connect – Offset Management
- Alternatives to Kafka Connect
- Introduction to Storm
- Integrating Storm with Kafka
- Integrating Storm with Kafka – Sample Code
- Integrating Storm with Kafka
- Integrating Hadoop with Kafka
- Hadoop Consumers
- Hadoop Consumers (Contd.)
- Hadoop Consumers – Produce Topic
- Hadoop Consumers – Fetch Generated Topic
- Kafka at Uber
- Kafka at Uber (Contd.)
- Kafka at LinkedIn
- Kafka at LinkedIn – Core Kafka Services
- Kafka at LinkedIn – Core Kafka Services (Contd.)
- Kafka at LinkedIn – Libraries
- Kafka at LinkedIn – Monitoring and Stream Processing
- Summary
Chapter 5 - Kafka and Schema Management
- Evolving Schema
- Protobuf (Protocol Buffers) Overview
- Avro Overview
- Managing Data Evolution Using Schemas
- Confluent Platform
- Confluent Schema Registry
- Schema Change and Backward Compatibility
- Collaborating over Schema Change
- Handling Unreadable Messages
- Deleting Data
- Segregating Public and Private Topics
- Summary
Chapter 6 - Kafka Streams and KSQL
- What Can Kafka Be Used For?
- What Can Kafka Be Used For? (Contd.)
- What Exactly is Kafka?
- The APIs for Stream Processing
- Kafka: A Streaming Platform
- What is KSQL?
- What is KSQL? (Contd.)
- Starting KSQL
- Using the KSQL CLI
- KSQL Data Types
- Review the Structure of an Existing STREAM
- Query the STREAM
- KSQL Functions
- Writing to a Topic
- KSQL Table vs. Stream
- KSQL JOIN
- Windows in KSQL Queries
- Miscellaneous KSQL Commands
- Summary
Chapter 7 - KSQL UDF and Deployment
- KSQL Custom Functions
- KSQL UDF/UDAF
- Implement a Custom Function
- Creating UDF and UDAF
- Creating UDF and UDAF (Contd.)
- UDFs and Null Handling
- UDFs and Null Handling (Contd.)
- Sample UDF Class
- Build Engine
- UDAF
- UDAF Sample Class
- Supported Types
- Deploying Custom Functions
- Using Custom Functions
- Summary
Lab Exercises
Lab 1. Kafka Basics
Lab 2. Kafka Multiple Brokers and Import/Export Messages
Lab 3. Apache Kafka with Java
Lab 4. Apache Kafka with Node.js
Lab 5. Kafka Integration With Spark
Lab 6. KSQL Basics
Lab 7. KSQL Create and Deploy UDF