Kafka Basics

In this lab, you will explore Apache Kafka basics. You will use the out-of-the-box command-line tools provided by Kafka to start up ZooKeeper and Kafka. Then you will create a Kafka topic and use the console producer and consumer to write and read messages. You will also explore single-broker and multi-broker clusters. In the end, you will explore how to import messages into Kafka and export messages from Kafka.

Start ZooKeeper

Kafka uses ZooKeeper, so you need to start a ZooKeeper server first. In this part, you will start a single-node ZooKeeper instance.

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter ZooKeeper in the Title field and click OK

Note: You will end up using a lot of Terminal windows. Setting the Title of each Terminal window will make it easier to locate the right window.

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Start ZooKeeper

bin/zookeeper-server-start.sh config/zookeeper.properties

Keep the Terminal window open

Start Kafka Server

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Kafka Server in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Start Kafka server

bin/kafka-server-start.sh config/server.properties

Keep the Terminal window open

Create a Kafka Topic

Kafka maintains feeds of messages in categories called topics. In this part, you will create a topic named “webage” with a single partition and only one replica.

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Producer in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Create a Kafka topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic webage

Get a list of Kafka topics

bin/kafka-topics.sh --list --zookeeper localhost:2181

Notice it shows webage

Send some messages

Kafka comes with a command-line client that takes input from a file or from standard input and sends it out as messages to the Kafka cluster.

In the Producer Terminal window, which you opened in the previous part of this lab, execute the following command to send some messages to the Kafka cluster

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage

Notice it shows a > prompt where you can start entering messages, which will be sent to the Kafka cluster

Enter the following messages

hello world
another message

Don’t exit the > prompt. You can exit the prompt by pressing Ctrl+C, but don’t do it until instructed.

Keep the Terminal window open

Read messages from the Kafka cluster

Kafka also has a command line consumer that will dump out messages to standard output.

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Consumer in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

In the Terminal window, enter the following command to read the messages

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic webage --from-beginning

Notice it shows the following messages

hello world
another message

Switch back to the Producer Terminal window and enter more messages.

Notice the newly entered messages automatically show up in the Consumer Terminal window

Switch to the Producer Terminal window and press Ctrl+C to stop entering more messages. Keep the Terminal window open.

Switch to the Consumer Terminal window and press Ctrl+C to stop reading messages. Keep the Terminal window open.

Setting up a multi-broker cluster

So far we have been running against a single broker. A single broker is just a cluster of size one. In this part, you will expand the cluster to three nodes by creating two additional nodes.

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Kafka Server 1 in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Open server.properties in text editor

gedit config/server.properties

Note: You can use any text editor of your choice, e.g. vi, nano

server.properties is the configuration file used by the Kafka server. Notice there’s a property broker.id=0. 0 is the ID of the first Kafka server, which you started earlier in this lab.

Press Ctrl+Q to exit to the Terminal window

Execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0, which means there's one broker

Create two copies of server.properties.

cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties

Each copy will be used by a separate Kafka server/node

Edit server-1.properties

gedit config/server-1.properties

Locate broker.id=0 and change it to broker.id=1

Each node must have a unique id

Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9093

All nodes will be running on the same machine, therefore a unique port has to be used.

Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-1

All nodes will be running on the same machine, therefore a unique log directory has to be used

Press Ctrl+S to save the file

Press Ctrl+Q to exit to the Terminal window
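If you prefer not to edit the file by hand, the same three edits can be applied non-interactively with sed. The sketch below runs against a minimal stand-in file in /tmp so it is safe to try anywhere; in the lab you would read config/server.properties and write config/server-1.properties instead.

```shell
# Minimal stand-in for server.properties (only the three lines this part edits)
cat > /tmp/server.properties <<'EOF'
broker.id=0
#listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
EOF

# Apply the broker-1 edits: unique id, unique port, unique log directory
sed -e 's/^broker\.id=0/broker.id=1/' \
    -e 's|^#listeners=PLAINTEXT://:9092|listeners=PLAINTEXT://:9093|' \
    -e 's|^log\.dirs=/tmp/kafka-logs$|log.dirs=/tmp/kafka-logs-1|' \
    /tmp/server.properties > /tmp/server-1.properties

cat /tmp/server-1.properties
```

The | delimiter is used in the second and third expressions because the replacement text itself contains / characters.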

Edit server-2.properties

gedit config/server-2.properties

Locate broker.id=0 and change it to broker.id=2

Each node must have a unique id

Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9094

All nodes will be running on the same machine, therefore a unique port has to be used.

Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-2

All nodes will be running on the same machine, therefore a unique log directory has to be used

Press Ctrl+S to save the file

Press Ctrl+Q to exit to the Terminal window

Start up a multi-broker cluster and verify the cluster

In this part, you will start up the nodes you configured in the previous part and verify that the cluster has 3 nodes.

In the Kafka Server 1 Terminal window, run the following command to start up the second node

bin/kafka-server-start.sh config/server-1.properties

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Kafka Server 2 in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0 and 1, which means there are 2 nodes

In the Kafka Server 2 Terminal window, run the following command to start up the third node

bin/kafka-server-start.sh config/server-2.properties

Switch to the Producer Terminal window and execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0, 1, and 2, which means there are 3 nodes

Create a new topic on the multi-broker cluster

In this part, you will create a new topic with a replication factor of 3, which means it will utilize the 3 nodes you configured and started in the previous parts of this lab.

In the Producer Terminal window, execute the following command to create a new topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic webage-replicated-topic

Verify the new topic is created

bin/kafka-topics.sh --list --zookeeper localhost:2181

Notice webage-replicated-topic shows up in addition to the webage topic

Get webage topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage

Notice it shows ReplicationFactor 1 (number of nodes) and Replicas 0 (node ID)

Get webage-replicated-topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage-replicated-topic

Notice it shows ReplicationFactor 3 (number of nodes) and Replicas 0, 1, 2 (node IDs). Leader is 0, which means Kafka server with broker.id=0 is acting as the “master” or lead node.

“leader” is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.

“replicas” is the list of nodes that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.

“isr” is the set of “in-sync” replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.
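For reference, the --describe output for the replicated topic should look roughly like this (exact column spacing varies by Kafka version; the values match this lab's setup, with all three brokers healthy):

```
Topic:webage-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: webage-replicated-topic  Partition: 0  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2
```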

Send messages to the new replicated topic

In this part, you will send more messages to the new topic which is utilizing the multi-broker cluster setup

In the Producer Terminal window, enter the following command to send messages

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage-replicated-topic

At the Producer prompt, enter the following messages:

message 1

message 2

Remain at the Producer prompt (>) and keep the Terminal window open.

Read messages from the new replicated topic

In this part, you will read messages from the new replicated topic

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter Consumer in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Execute the following command to read messages from the replicated topic which you created in the previous parts of the lab

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic webage-replicated-topic

Notice it shows message 1 and message 2

Test Cluster Fault Tolerance

In this part, you will test the multi-broker cluster's fault tolerance by terminating one of the nodes/Kafka server instances

Open Terminal window by clicking Application > Terminal

In menu bar, click Terminal > Set Title…

Enter FaultTolerance in the Title field and click OK

Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

Find the process ID of the Kafka Server 1 instance (broker.id=1) by running the following command in the Terminal window

ps aux | grep server-1.properties

Scroll up and locate the process ID: the number in the second column of the line containing server-1.properties. Your process ID will be different from the one used in the next step.
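In ps aux output the process ID is the second whitespace-separated column, so you can extract it with awk. The line below is a hypothetical, truncated ps entry used only to illustrate the extraction:

```shell
# Hypothetical ps aux line (truncated); the PID is field 2
line="student  53749  2.5  4.1  java ... kafka.Kafka config/server-1.properties"
echo "$line" | awk '{print $2}'   # prints 53749
```

In the lab you could combine this into one step with kill $(ps aux | grep '[s]erver-1.properties' | awk '{print $2}') (the [s] bracket trick stops grep from matching its own process), though checking the PID by eye first is safer.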

Kill the process

kill 53749

Note: use your process id instead of 53749

Verify there’s one less node in the cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0 and 2. Kafka Server 1 process has been terminated.

You can also verify this by switching to the Kafka Server 1 Terminal window. You will notice it has exited to the prompt.

Get webage topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage

Notice it still shows Replicas: 0,1,2 (originally, there were 3 nodes), but Isr (in-sync replicas) shows 0, 2

Switch to the Producer Terminal window and enter another message

still working

Switch to the Consumer Terminal window and verify it shows “still working” message.

Use Kafka Connect to import/export data

Writing data from the console and writing it back to the console is a convenient place to start, but you’ll probably want to use data from other sources or export data from Kafka to other systems. For many systems, instead of writing custom integration code you can use Kafka Connect to import or export data.

Kafka Connect is a tool included with Kafka that imports and exports data to Kafka. It is an extensible tool that runs connectors, which implement the custom logic for interacting with an external system. In this part, you’ll see how to run Kafka Connect with simple connectors that import data from a file to a Kafka topic and export data from a Kafka topic to a file.

Switch to the Producer Terminal window and press Ctrl+C to exit to the terminal.

Create a text file with some seed data

echo -e "Hello\nworld!" > webage.txt

Verify you have the text file with data

more webage.txt

Notice it shows the following data

Hello
world!
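Note that echo -e relies on shell-specific behavior (some shells print the -e literally); printf is a portable way to produce the same two-line file. A quick sketch against a temp file rather than the lab's webage.txt:

```shell
# Portable equivalent of: echo -e "Hello\nworld!" > webage.txt
printf 'Hello\nworld!\n' > /tmp/webage-demo.txt
cat /tmp/webage-demo.txt
```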

Edit Kafka Connect source configuration

gedit config/connect-file-source.properties

The source file configuration is used to read the messages from some existing text file into a Kafka topic.

Notice the default file configured is named test.txt.

Change test.txt to webage.txt

Change topic from connect-test to webage-replicated-topic

Press Ctrl+S to save the file

Press Ctrl+Q to exit to the Terminal window
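After these edits, config/connect-file-source.properties should look roughly like the fragment below. The name, connector.class, and tasks.max lines are the stock defaults shipped with Kafka; only the file and topic values were changed:

```properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=webage.txt
topic=webage-replicated-topic
```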

Edit Kafka Connect file sink configuration

The sink file configuration is used to read data from a source Kafka topic and write data into a text file on the filesystem.

gedit config/connect-file-sink.properties

Notice the default output file is test.sink.txt and the default topic is connect-test

Change topics from connect-test to webage-replicated-topic

Press Ctrl+S to save the file

Press Ctrl+Q to exit to the Terminal window

Start Kafka Connect

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties

Switch to the Consumer Terminal window and notice two new rows are available as shown below:

{"schema":{"type":"string","optional":false},"payload":"Hello"}
{"schema":{"type":"string","optional":false},"payload":"world!"}

Cleanup

In this part you will close the Terminal windows which aren’t required.

Close the following Terminal windows. Press Ctrl+C in Terminal windows where any process is running

FaultTolerance
Producer

Note: Leave ZooKeeper, Kafka servers, and Consumer running. You will utilize the consumer in the next lab.

Review

In this lab, you explored Apache Kafka basics. You used the out-of-the-box command-line tools provided by Kafka to start up ZooKeeper and Kafka. You also created a Kafka topic and used the console producer and consumer to write and read messages. You also explored single-broker and multi-broker clusters. In the end, you explored how to import messages into Kafka and export messages from the Kafka cluster.