WA2708
Kafka for Application Developers Training
In this course, you will learn how to use Kafka to modernize your applications.
In modern systems, real-time information is continuously generated by some applications (publishers/producers) and routed to others (subscribers/consumers). Apache Kafka is an open-source, distributed publish-subscribe messaging system. It offers high throughput and is built to scale out across multiple servers. Kafka persists messages on disk, so it can serve batch consumers as well as real-time applications.
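The publish-subscribe idea described above can be sketched in a few lines. This is only an in-memory illustration, not Kafka client code: `TinyLog`, `publish`, `poll`, and the group names are hypothetical names invented for this example. It assumes a single partition and keeps the log in a Python list where Kafka would keep it on disk.

```python
from collections import defaultdict


class TinyLog:
    """In-memory stand-in for a single-partition topic (hypothetical, not a Kafka API)."""

    def __init__(self):
        self._messages = []               # append-only log; Kafka persists this on disk
        self._offsets = defaultdict(int)  # next offset to read, tracked per consumer group

    def publish(self, message):
        """Append a message to the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group, max_messages=None):
        """Return unread messages for a group and advance that group's offset."""
        start = self._offsets[group]
        end = len(self._messages)
        if max_messages is not None:
            end = min(start + max_messages, end)
        batch = self._messages[start:end]
        self._offsets[group] = end
        return batch


log = TinyLog()
for reading in ("t=20.1", "t=20.4", "t=20.9"):
    log.publish(reading)

# A "real-time" subscriber takes messages one at a time.
realtime_first = log.poll("dashboard", max_messages=1)

# A batch subscriber reads later; reading does not remove messages from the log.
batch_all = log.poll("nightly-report")
```

Because each group keeps its own offset, the two subscribers read the same log independently, which is what allows one stored stream to feed both batch and real-time consumers.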
Course Details
Duration
2 days
Prerequisites
A basic understanding of messaging, cloud, development, architecture, and virtualization is beneficial.
Target Audience
- Developers
- Architects
- System Integrators
- Security Administrators
- Network Administrators
- Software Engineers
- Technical Support Individuals
- Technology Leaders and Managers
- Consultants responsible for the messaging aspects of data collection, transformation, and integration in their organization, in support of Application Modernization, Cloud-Native Development, and the Digital Data Supply Chain (Big Data/IoT/AI/Machine Learning/Advanced Analytics/Business Intelligence)
Skills Gained
- Understand the use of Kafka for high performance messaging
- Identify uses for Kafka in microservices
- Explain the benefits of Kafka patterns
- Differentiate between messaging and message brokers
- Describe Kafka messaging environments
- Develop producers and consumers for Kafka
- Recognize how Kafka enables Cloud-native applications
- Summarize the characteristics and architecture of Kafka
- Demonstrate how to process messages with Kafka
- Design distributed high throughput systems based on Kafka
- Describe the built-in partitioning, replication and inherent fault-tolerance of Kafka
Course Outline
- Introduction to Kafka
- Messaging Architectures – What is Messaging?
- Messaging Architectures – Steps to Messaging
- Messaging Architectures – Messaging Models
- What is Kafka?
- Kafka Overview
- Need for Kafka
- When to Use Kafka?
- Kafka Architecture
- Core Concepts in Kafka
- Kafka Topic
- Kafka Partitions
- Kafka Producer
- Kafka Consumer
- Kafka Broker
- Kafka Cluster
- Why Kafka Cluster?
- Sample Multi-Broker Cluster
- Overview of ZooKeeper
- Kafka Cluster & ZooKeeper
- Who Uses Kafka?
- Using Apache Kafka
- Installing Apache Kafka
- Configuration Files
- Starting Kafka
- Using Kafka Command Line Client Tools
- Setting up a Multi-Broker Cluster
- Using Multi-Broker Cluster
- Kafka Cluster Planning
- Kafka Cluster Planning – Producer/Consumer Throughput
- Kafka Cluster Planning – Number of Brokers (and ZooKeepers)
- Kafka Cluster Planning – Sizing for Topics and Partitions
- Kafka Cluster Planning – Sizing for Storage
- Kafka Connect
- Kafka Connect – Configuration Files
- Using Kafka Connect to Import/Export Data
- Creating a Spring Boot Producer
- Adding Kafka dependency to pom.xml
- Defining a Spring Boot Service to Send Message(s)
- Defining a Spring Boot Controller
- Testing the Spring Boot Producer
- Creating a Node.js Consumer
- Building Data Pipelines
- Building Data Pipelines
- What to Consider When Building Data Pipelines
- Timeliness
- Reliability
- High and Varying Throughput
- Data Formats
- Transformations
- Transformations – ELT
- Security
- Failure Handling
- Agility and Coupling
- Ad-hoc Pipelines
- Metadata Loss
- Extreme Processing
- Kafka Connect vs. Producer and Consumer
- Integrating Kafka with Other Systems
- Introduction to Kafka Integration
- Kafka Connect
- Running Kafka Connect Operating Modes
- Key Configurations for Connect Workers
- Kafka Connect API
- Kafka Connect Example – File Source
- Kafka Connect Example – File Sink
- Kafka Connector Example – MySQL to Elasticsearch
- Write the data to Elasticsearch
- Building Custom Connectors
- Kafka Connect – Connectors
- Kafka Connect – Tasks
- Kafka Connect – Workers
- Kafka Connect – Offset Management
- Alternatives to Kafka Connect
- Introduction to Storm
- Integrating Storm with Kafka
- Integrating Hadoop with Kafka
- Hadoop Consumers
- Hadoop Consumers – Produce Topic
- Hadoop Consumers – Fetch Generated Topic
- Kafka at Uber
- Kafka at LinkedIn
- Kafka at LinkedIn – Core Kafka Services
- Kafka at LinkedIn – Libraries
- Kafka at LinkedIn – Monitoring and Stream Processing
- Kafka and Schema Management
- Evolving Schema
- Protobuf (Protocol Buffers) Overview
- Avro Overview
- Managing Data Evolution Using Schemas
- Confluent Platform
- Confluent Schema Registry
- Schema Change and Backward Compatibility
- Collaborating over Schema Change
- Handling Unreadable Messages
- Deleting Data
- Segregating Public and Private Topics
- Kafka Streams and KSQL
- What Can Kafka Be Used For?
- What Exactly is Kafka?
- The APIs for Stream Processing
- Kafka: A Streaming Platform
- What is KSQL?
- What is KSQL? (Contd.)
- Starting KSQL
- Using the KSQL CLI
- KSQL Data Types
- Review the Structure of an Existing STREAM
- Query the STREAM
- KSQL Functions
- Writing to a Topic
- KSQL Table vs. Stream
- KSQL JOIN
- Windows in KSQL Queries
- Miscellaneous KSQL Commands
- KSQL UDF and Deployment
- KSQL Custom Functions
- KSQL UDF/UDAF
- Implement a Custom Function
- Creating UDF and UDAF
- UDFs and Null Handling
- Sample UDF Class
- Build Engine
- UDAF
- UDAF Sample Class
- Supported Types
- Deploying Custom Functions
- Using Custom Functions
- Lab Exercises
- Lab 1. Kafka Basics
- Lab 2. Kafka Multiple Brokers and Import/Export Messages
- Lab 3. Apache Kafka with Java
- Lab 4. Apache Kafka with Node.js
- Lab 5. Kafka Integration With Spark
- Lab 6. KSQL Basics
- Lab 7. KSQL Create and Deploy UDF