Introduction

In modern applications, real-time information is continuously generated by some applications (publishers/producers) and routed to others (subscribers/consumers).

Apache Kafka is a high-throughput messaging platform built to scale out across multiple servers in a distributed model. It persists messages on disk and supports both batched consumption and real-time applications. It is also used to decouple the frontend from the backend, which helps create scalable applications and support long-running processes.

This four-day class introduces students to Kafka’s capabilities using .NET Core through lectures, labs, and a hands-on project on the final day.

Objectives

  • Understand the use of Kafka for high performance messaging
  • Identify the uses of Kafka in microservices
  • Explain the benefits of Kafka patterns
  • Differentiate between messaging and message brokers
  • Describe Kafka messaging environments
  • Develop producers and consumers for Kafka
  • Recognize how Kafka enables Cloud-native applications
  • Summarize characteristics and architecture for Kafka
  • Demonstrate how to process and consume messages from Kafka using .NET Core Web API, MVC, and Worker (BackgroundService)
  • Demonstrate Kafka’s role in the end-to-end project involving a .NET MVC frontend, .NET Web API backend, .NET Worker, Azure SQL database, and Redis cache
  • Design distributed high throughput systems based on Kafka
  • Describe the built-in partitioning, replication, and inherent fault tolerance of Kafka

Topics

  • Introduction to Kafka
  • Introduction to Confluent Cloud
  • Using Apache Kafka on Confluent Cloud
  • Building Data Pipelines
  • Integrating Kafka with Other Systems
  • Kafka and Schema Management
  • Developing an end-to-end application involving Kafka, a database server (e.g. Azure SQL), a caching solution (e.g. Redis cache), .NET Web API, .NET Worker, and .NET MVC

Audience

This is an introductory course for .NET/C# developers, architects, system integrators, security administrators, network administrators, software engineers, technical support staff, technology leaders and managers, and consultants who are responsible for elements of messaging for data collection, transformation, and integration in their organization. The course supports Application Modernization, Cloud-Native Development, and the Digital Data Supply Chain (Big Data/IoT/AI/Machine Learning/Advanced Analytics/Business Intelligence).

Prerequisites

A basic understanding of messaging, cloud, development, architecture, and virtualization is beneficial. Experience in .NET with C# is highly recommended, as the majority of the labs and the course project use .NET Core.

Duration

4 days (3 days of lectures + labs and 1 day of project)

Outline for Introduction to Kafka for C# Developers Training

Chapter 1 - Introduction to Kafka

  • Messaging Architectures – What is Messaging?
  • Messaging Architectures – Steps to Messaging
  • Messaging Architectures – Messaging Models
  • What is Kafka?
  • What is Kafka? (Contd.)
  • Kafka Overview
  • Kafka Overview (Contd.)
  • Need for Kafka
  • When to Use Kafka?
  • Kafka Architecture
  • Core concepts in Kafka
  • Kafka Topic
  • Kafka Partitions
  • Kafka Producer
  • Kafka Consumer
  • Kafka Broker
  • Kafka Cluster
  • Why Kafka Cluster?
  • Sample Multi-Broker Cluster
  • Overview of ZooKeeper
  • Kafka Cluster & ZooKeeper
  • Schema Registry
  • Schema Registry (Contd.)
  • Who Uses Kafka?
  • Summary

Chapter 2 - The Inner Workings of Apache Kafka

  • A Kafka Cluster High-Level Interaction Diagram
  • Topics & Partitions
  • The Terms Event/Message/Record
  • Message Offset
  • Message Retention Settings
  • Deleting Messages
  • The Flush Policies
  • Writing to Partitions
  • Batches
  • Batch Compression
  • Partitions as a Unit of Parallelism
  • Message Ordering
  • Kafka Default Partitioner
  • The Load Balancing Aspect
  • Kafka Message Production Schematics
  • ZooKeeper
  • Reading from a Topic
  • Consumer Lag
  • Consumer Group
  • Consumer Group Diagram
  • The Broker
  • Broker Hardware Consideration
  • OS and File System
  • The Leader and Followers Pattern
  • Partition Replication Diagram
  • Controlled Shutdown
  • Controlling Message Durability with Minimum In-Sync Replicas
  • Log Compaction
  • Frequent Operational Problems
  • Some Kafka Design FAQs
  • Summary

Chapter 3 - Using Apache Kafka

  • What is Confluent?
  • Confluent Cloud
  • Confluent Cloud Resource Hierarchy
  • Setting up Confluent Cloud on Azure
  • Setting up Confluent Cloud using Confluent.io
  • Select the Confluent Cloud Cluster Type
  • Choose the Cloud Provider
  • Setting up Confluent Cloud using Azure Marketplace
  • Select Confluent Cloud in Azure Marketplace
  • Purchase Confluent Cloud
  • The Cluster View
  • Exploring the Confluent Cloud Console
  • Topics
  • Topics Advanced Settings
  • Searching for Messages in a Topic
  • The Confluent CLI
  • The confluent CLI Command Examples
  • Kafka Cluster Planning
  • Kafka Cluster Planning – Producer/Consumer Throughput
  • Kafka Cluster Planning – Sizing for Topics and Partitions
  • Kafka Cluster Planning – Sizing for Topics and Partitions (Contd.)
  • Managing Topics in Confluent Cloud Console
  • Editing an Existing Topic
  • Editing an Existing Topic (Contd.)
  • Delete a Topic
  • Kafka and .NET
  • .NET Kafka Architectures
  • Packages
  • Installing the Packages
  • Navigating .NET Client Documentation
  • Important Classes and Interfaces
  • appsettings.json Kafka Configuration
  • Loading the Configuration from appsettings.json
  • Produce and ProduceAsync Methods
  • Produce vs ProduceAsync
  • Error Handling
  • Consuming Messages
  • Creating and Deleting Topics
  • Copying Data Between Environments
  • Mocking Datasets using Datagen Connector
  • Monitoring Confluent Cloud
  • Monitoring Confluent Cloud (Contd.)
  • Monitoring Confluent Cloud using cURL
  • Monitoring Confluent Cloud using Third-Party Tools
  • Summary

Chapter 4 - Building Data Pipelines

  • Building Data Pipelines
  • What to Consider When Building Data Pipelines
  • Timeliness
  • Reliability
  • High and Varying Throughput
  • High and Varying Throughput (Contd.)
  • Evolving Schema
  • Data Formats
  • Data Formats (Contd.)
  • Protobuf (Protocol Buffers) Overview
  • Avro Overview
  • Avro Schema Example
  • JSON Schema Example
  • Managing Data Evolution Using Schemas
  • Confluent Schema Registry
  • Confluent Schema Registry in a Nutshell
  • Schema Management on Confluent Cloud
  • Create a Schema
  • Create a Schema using Confluent CLI
  • Create a Schema from the Web UI
  • Schema Change and Backward Compatibility
  • Collaborating over Schema Change
  • Handling Unreadable Messages
  • Deleting Data
  • Segregating Public and Private Topics
  • Transformations
  • Transformations - ELT
  • Security
  • Failure Handling
  • Agility and Coupling
  • Ad-hoc Pipelines
  • Metadata Loss
  • Extreme Processing
  • Kafka Connect vs. Producer and Consumer
  • Kafka Connect vs. Producer and Consumer (Contd.)
  • Summary

Chapter 5 - Integrating Kafka with Other Systems

  • Introduction to Kafka Integration
  • Kafka Connect
  • Kafka Connect (Contd.)
  • Running Kafka Connect Operating Modes
  • Key Configurations for Connect Workers
  • Kafka Connect API
  • Kafka Connect Example – File Source
  • Kafka Connect Example – File Sink
  • Summary

Chapter 6 - Kafka Security

  • Kafka Security
  • Encryption and Authentication using SSL
  • Encryption and Authentication using SSL (Contd.)
  • Configuring Kafka Brokers
  • Configuring Kafka Brokers – Optional Settings
  • Authenticating Using SASL
  • Authenticating Using SASL – Configuring Kafka Brokers
  • Authenticating Using SASL – Configuring Kafka Brokers (Contd.)
  • Authorization and ACLs
  • Authorization and ACLs (Contd.)
  • Securing a Running Cluster
  • Securing a Running Cluster (Contd.)
  • ZooKeeper Authentication
  • ZooKeeper Authentication (Contd.)
  • Summary

Chapter 7 - Monitoring Kafka

  • Introduction
  • Metrics Basics
  • JVM Monitoring
  • Garbage Collection
  • Garbage Collection (Contd.)
  • Java OS monitoring
  • OS Monitoring
  • OS Monitoring (Contd.)
  • Kafka Broker Metrics
  • Under-Replicated Partitions
  • Active controller count
  • Request handler idle ratio
  • Intelligent Thread Usage
  • All topics bytes in
  • All topics bytes out
  • All topics messages in
  • Partition count
  • Leader count
  • Offline partitions
  • Request Metrics
  • Request Metrics (Contd.)
  • Logging
  • Logging (Contd.)
  • Client Monitoring
  • Producer Metrics
  • Overall producer metrics
  • Overall producer metrics (Contd.)
  • Per-broker and per-topic metrics
  • Consumer Metrics
  • Fetch Manager Metrics
  • Per-broker and per-topic metrics
  • Consumer coordinator metrics
  • Quotas
  • Quotas (Contd.)
  • Lag Monitoring
  • Lag Monitoring (Contd.)
  • End-to-End Monitoring
  • Summary

Chapter 8 - Apache Kafka Best Practices

  • Best Practices for Working with Partitions
  • Best Practices for Working with Partitions (Contd.)
  • Best Practices for Working with Consumers
  • Best Practices for Working with Consumers (Contd.)
  • Best Practices for Working with Producers
  • Best Practices for Working with Producers (Contd.)
  • Best Practices for Working with Brokers
  • Best Practices for Working with Brokers (Contd.)
  • Summary

LABS

Lab 1 - Signing Up for the Free Trial of Confluent Cloud

Lab 2 - Understanding Confluent Cloud Clusters

Lab 3 - Understanding Confluent Cloud CLI

Lab 4 - Understanding Kafka Topics

Lab 5 - Using the Confluent CLI to Consume Messages

Lab 6 - Creating an ASP.NET Core MVC Kafka Client

Lab 7 - Creating an ASP.NET Core Web API Kafka Client

Lab 8 - Creating a .NET Core 3.1 Worker Kafka Client

Lab 9 - Integrating Azure SQL and Kafka

Lab 10 - Integrating Redis cache with .NET Core 3.1 ASP.NET MVC Lab

The Final Project