Introduction
In modern applications, real-time information is continuously generated by some applications (publishers/producers) and routed to other applications (subscribers/consumers).
Kafka offers high throughput and is built to scale out across multiple servers in a distributed model. It persists messages on disk, which makes it suitable for batched consumption as well as real-time applications. Kafka is also used to decouple frontends from backends, helping create scalable applications and support long-running processes.
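To make the producer/consumer terminology concrete, here is a minimal sketch of a publish-and-consume round trip with the Confluent.Kafka client library used throughout the labs; the broker address, topic name, and consumer group id are placeholder values rather than part of the course materials.

```csharp
using System;
using System.Threading.Tasks;
using Confluent.Kafka;

class PubSubSketch
{
    static async Task Main()
    {
        // Producer: publish one message to a topic (broker address and topic name are placeholders).
        var producerConfig = new ProducerConfig { BootstrapServers = "localhost:9092" };
        using (var producer = new ProducerBuilder<Null, string>(producerConfig).Build())
        {
            var delivery = await producer.ProduceAsync(
                "demo-topic", new Message<Null, string> { Value = "hello, kafka" });
            Console.WriteLine($"Delivered to {delivery.TopicPartitionOffset}");
        }

        // Consumer: subscribe to the same topic and read the message back.
        var consumerConfig = new ConsumerConfig
        {
            BootstrapServers = "localhost:9092",
            GroupId = "demo-group",
            AutoOffsetReset = AutoOffsetReset.Earliest
        };
        using var consumer = new ConsumerBuilder<Ignore, string>(consumerConfig).Build();
        consumer.Subscribe("demo-topic");
        var record = consumer.Consume(TimeSpan.FromSeconds(10));
        Console.WriteLine($"Consumed: {record?.Message.Value}");
        consumer.Close();
    }
}
```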
This four-day class introduces students to Kafka’s capabilities using .NET Core through lectures, labs, and a hands-on project on the final day.
Objectives
- Understand the use of Kafka for high-performance messaging
- Identify use cases for Kafka in microservices
- Explain the benefits of Kafka patterns
- Differentiate between messaging and message brokers
- Describe Kafka messaging environments
- Develop producers and consumers for Kafka
- Recognize how Kafka enables cloud-native applications
- Summarize the characteristics and architecture of Kafka
- Demonstrate how to process and consume messages from Kafka using .NET Core Web API, MVC, and Worker (BackgroundService)
- Demonstrate Kafka’s role in an end-to-end project involving a .NET MVC frontend, a .NET Web API backend, a .NET Worker, an Azure SQL database, and a Redis cache
- Design distributed, high-throughput systems based on Kafka
- Describe Kafka’s built-in partitioning, replication, and inherent fault tolerance
Topics
- Introduction to Kafka
- Introduction to Confluent Cloud
- Using Apache Kafka on Confluent Cloud
- Building Data Pipelines
- Integrating Kafka with Other Systems
- Kafka and Schema Management
- Developing an end-to-end application involving Kafka, a database server (e.g., Azure SQL), a caching solution (e.g., Redis), a .NET Web API, a .NET Worker, and a .NET MVC frontend
Audience
This is an introductory course for .NET/C# developers, architects, system integrators, security administrators, network administrators, software engineers, technical support staff, technology leaders and managers, and consultants who are responsible for elements of messaging for data collection, transformation, and integration in their organization, in support of application modernization, cloud-native development, and the digital data supply chain (Big Data/IoT/AI/machine learning/advanced analytics/business intelligence).
Prerequisites
A basic understanding of messaging, cloud, development, architecture, and virtualization is beneficial. Experience with .NET and C# is highly recommended, as the majority of the labs and the course project use .NET Core.
Duration
4 days (3 days of lectures and labs, plus 1 day for the project)
Outline for Introduction to Kafka for C# Developers Training
Chapter 1 - Introduction to Kafka
- Messaging Architectures – What is Messaging?
- Messaging Architectures – Steps to Messaging
- Messaging Architectures – Messaging Models
- What is Kafka?
- What is Kafka? (Contd.)
- Kafka Overview
- Kafka Overview (Contd.)
- Need for Kafka
- When to Use Kafka?
- Kafka Architecture
- Core concepts in Kafka
- Kafka Topic
- Kafka Partitions
- Kafka Producer
- Kafka Consumer
- Kafka Broker
- Kafka Cluster
- Why Kafka Cluster?
- Sample Multi-Broker Cluster
- Overview of ZooKeeper
- Kafka Cluster & ZooKeeper
- Schema Registry
- Schema Registry (Contd.)
- Who Uses Kafka?
- Summary
Chapter 2 - The Inner Workings of Apache Kafka
- A Kafka Cluster High-Level Interaction Diagram
- Topics & Partitions
- The Terms Event/Message/Record
- Message Offset
- Message Retention Settings
- Deleting Messages
- The Flush Policies
- Writing to Partitions
- Batches
- Batch Compression
- Partitions as a Unit of Parallelism
- Message Ordering
- Kafka Default Partitioner
- The Load Balancing Aspect
- Kafka Message Production Schematics
- ZooKeeper
- Reading from a Topic
- Consumer Lag
- Consumer Group
- Consumer Group Diagram
- The Broker
- Broker Hardware Consideration
- OS and File System
- The Leader and Followers Pattern
- Partition Replication Diagram
- Controlled Shutdown
- Controlling Message Durability with Minimum In-Sync Replicas
- Log Compaction
- Frequent Operational Problems
- Some Kafka Design FAQs
- Summary
Chapter 3 - Using Apache Kafka
- What is Confluent?
- Confluent Cloud
- Confluent Cloud Resource Hierarchy
- Setting up Confluent Cloud on Azure
- Setting up Confluent Cloud using Confluent.io
- Select the Confluent Cloud Cluster Type
- Choose the Cloud Provider
- Setting up Confluent Cloud using Azure Marketplace
- Select Confluent Cloud in Azure Marketplace
- Purchase Confluent Cloud
- The Cluster View
- Exploring the Confluent Cloud Console
- Topics
- Topics Advanced Settings
- Searching for Messages in a Topic
- The Confluent CLI
- The confluent CLI Command Examples
- Kafka Cluster Planning
- Kafka Cluster Planning – Producer/Consumer Throughput
- Kafka Cluster Planning – Sizing for Topics and Partitions
- Kafka Cluster Planning – Sizing for Topics and Partitions (Contd.)
- Managing Topics in Confluent Cloud Console
- Editing an Existing Topic
- Editing an Existing Topic (Contd.)
- Delete a Topic
- Kafka and .NET
- .NET Kafka Architectures
- Packages
- Installing the Packages
- Navigating .NET Client Documentation
- Important Classes and Interfaces
- appsettings.json Kafka Configuration
- Loading the Configuration from appsettings.json
- Produce and ProduceAsync Methods (a sketch follows this chapter’s outline)
- Produce vs ProduceAsync
- Error Handling
- Consuming Messages
- Creating and Deleting Topics
- Copying Data Between Environments
- Mocking Datasets using Datagen Connector
- Monitoring Confluent Cloud
- Monitoring Confluent Cloud (Contd.)
- Monitoring Confluent Cloud using cURL
- Monitoring Confluent Cloud using Third-Party Tools
- Summary
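As a rough illustration of several Chapter 3 topics (a Kafka section in appsettings.json, binding it to a typed options class, and producing with ProduceAsync), here is a minimal sketch; the "Kafka" section name, option names, and topic are illustrative assumptions rather than the lab code itself.

```csharp
// Assumed appsettings.json fragment (names are illustrative only):
// "Kafka": { "BootstrapServers": "localhost:9092", "Topic": "demo-topic" }
using System.Threading.Tasks;
using Confluent.Kafka;
using Microsoft.Extensions.Configuration;

public class KafkaOptions
{
    public string BootstrapServers { get; set; } = "";
    public string Topic { get; set; } = "";
}

public static class ProducerSketch
{
    public static async Task ProduceOneAsync(IConfiguration configuration, string payload)
    {
        // Bind the assumed "Kafka" section to a strongly typed options object.
        var options = configuration.GetSection("Kafka").Get<KafkaOptions>() ?? new KafkaOptions();

        var config = new ProducerConfig { BootstrapServers = options.BootstrapServers };
        using var producer = new ProducerBuilder<Null, string>(config).Build();

        // ProduceAsync awaits the broker acknowledgement; Produce with a delivery handler
        // is the fire-and-forget alternative contrasted in this chapter.
        var delivery = await producer.ProduceAsync(
            options.Topic, new Message<Null, string> { Value = payload });

        System.Console.WriteLine($"Delivered to {delivery.TopicPartitionOffset}");
    }
}
```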
Chapter 4 - Building Data Pipelines
- Building Data Pipelines
- What to Consider When Building Data Pipelines
- Timeliness
- Reliability
- High and Varying Throughput
- High and Varying Throughput (Contd.)
- Evolving Schema
- Data Formats
- Data Formats (Contd.)
- Protobuf (Protocol Buffers) Overview
- Avro Overview
- Avro Schema Example
- JSON Schema Example
- Managing Data Evolution Using Schemas
- Confluent Schema Registry
- Confluent Schema Registry in a Nutshell
- Schema Management on Confluent Cloud
- Create a Schema
- Create a Schema using Confluent CLI
- Create a Schema from the Web UI
- Schema Change and Backward Compatibility
- Collaborating over Schema Change
- Handling Unreadable Messages
- Deleting Data
- Segregating Public and Private Topics
- Transformations
- Transformations - ELT
- Security
- Failure Handling
- Agility and Coupling
- Ad-hoc Pipelines
- Metadata Loss
- Extreme Processing
- Kafka Connect vs. Producer and Consumer
- Kafka Connect vs. Producer and Consumer (Contd.)
- Summary
Chapter 5 - Integrating Kafka with Other Systems
- Introduction to Kafka Integration
- Kafka Connect
- Kafka Connect (Contd.)
- Running Kafka Connect – Operating Modes
- Key Configurations for Connect Workers
- Kafka Connect API
- Kafka Connect Example – File Source
- Kafka Connect Example – File Sink
- Summary
Chapter 6 - Kafka Security
- Kafka Security
- Encryption and Authentication using SSL
- Encryption and Authentication using SSL (Contd.)
- Configuring Kafka Brokers
- Configuring Kafka Brokers – Optional Settings
- Authenticating Using SASL (a client-side sketch follows this chapter’s outline)
- Authenticating Using SASL – Configuring Kafka Brokers
- Authenticating Using SASL – Configuring Kafka Brokers (Contd.)
- Authorization and ACLs
- Authorization and ACLs (Contd.)
- Securing a Running Cluster
- Securing a Running Cluster (Contd.)
- ZooKeeper Authentication
- ZooKeeper Authentication (Contd.)
- Summary
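Because the labs target Confluent Cloud, the client-side counterpart to this chapter’s broker SSL/SASL configuration is worth a brief sketch; the endpoint and the API key/secret below are placeholders issued per cluster, and the broker-side keystore and ACL settings covered above remain server configuration rather than client code.

```csharp
using Confluent.Kafka;

// Client-side security settings for a Confluent Cloud cluster (placeholder values only).
var config = new ProducerConfig
{
    BootstrapServers = "<your-cluster>.confluent.cloud:9092",
    SecurityProtocol = SecurityProtocol.SaslSsl, // TLS encryption plus SASL authentication
    SaslMechanism = SaslMechanism.Plain,         // Confluent Cloud API key/secret use SASL/PLAIN
    SaslUsername = "<api-key>",
    SaslPassword = "<api-secret>"
};

using var producer = new ProducerBuilder<Null, string>(config).Build();
// The producer can now be used exactly as in the earlier sketches.
```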
Chapter 7 - Monitoring Kafka
- Introduction
- Metrics Basics
- JVM Monitoring
- Garbage Collection
- Garbage Collection (Contd.)
- Java OS monitoring
- OS Monitoring
- OS Monitoring (Contd.)
- Kafka Broker Metrics
- Under-Replicated Partitions
- Active controller count
- Request handler idle ratio
- Intelligent Thread Usage
- All topics bytes in
- All topics bytes out
- All topics messages in
- Partition count
- Leader count
- Offline partitions
- Request metrics
- Request Metrics (Contd.)
- Logging
- Logging (Contd.)
- Client Monitoring
- Producer Metrics
- Overall producer metrics
- Overall producer metrics (Contd.)
- Per-broker and per-topic metrics
- Consumer Metrics
- Fetch Manager Metrics
- Per-broker and per-topic metrics
- Consumer coordinator metrics
- Quotas
- Quotas (Contd.)
- Lag Monitoring
- Lag Monitoring (Contd.)
- End-to-End Monitoring
- Summary
Chapter 8 - Apache Kafka Best Practices
- Best Practices for Working with Partitions
- Best Practices for Working with Partitions (Contd.)
- Best Practices for Working with Consumers
- Best Practices for Working with Consumers (Contd.)
- Best Practices for Working with Producers
- Best Practices for Working with Producers (Contd.)
- Best Practices for Working with Brokers
- Best Practices for Working with Brokers (Contd.)
- Summary
LABS
Lab 1 - Signing Up for the Free Trial of Confluent Cloud
Lab 2 - Understanding Confluent Cloud Clusters
Lab 3 - Understanding Confluent Cloud CLI
Lab 4 - Understanding Kafka Topics
Lab 5 - Using the Confluent CLI to Consume Messages
Lab 6 - Creating an ASP.NET Core MVC Kafka Client
Lab 7 - Creating an ASP.NET Core Web API Kafka Client
Lab 8 - Creating a .NET Core 3.1 Worker Kafka Client
Lab 9 - Integrating Azure SQL and Kafka
Lab 10 - Integrating Redis Cache with ASP.NET Core 3.1 MVC
The Final Project