In this lab, you will explore Apache Kafka basics. You will use the out-of-the-box command-line tools provided by Kafka to start ZooKeeper and a Kafka server. You will then create a Kafka topic and use the console producer and consumer to write and read messages. You will also explore single-broker and multi-broker clusters. Finally, you will explore how to import messages into a topic and export messages from a topic using Kafka Connect.

Start ZooKeeper

Kafka uses ZooKeeper, so you first need to start a ZooKeeper server. In this part, you will start a single-node ZooKeeper instance.

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter ZooKeeper in the Title field and click OK

Note: You will end up using a lot of Terminal windows. Setting the Title of each Terminal window will make it easier to locate the right window.

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Start ZooKeeper

bin/zookeeper-server-start.sh config/zookeeper.properties

6. Keep the Terminal window open

Start Kafka Server

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter Kafka Server in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Start Kafka server

bin/kafka-server-start.sh config/server.properties

6. Keep the Terminal window open

Create a Kafka Topic

Kafka maintains feeds of messages in categories called topics. In this part, you will create a topic named “webage” with a single partition and only one replica.

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter Producer in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Create a Kafka topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic webage

6. Get a list of Kafka topics

bin/kafka-topics.sh --list --zookeeper localhost:2181

Notice it shows webage

Send some messages

Kafka comes with a command-line client that takes input from a file or from standard input and sends it out as messages to the Kafka cluster.

1. In the Producer Terminal window, which you opened in the previous part of this lab, execute the following command to send some messages to the Kafka cluster

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage

Notice it shows you a prompt > where you can start entering messages which will be sent to the Kafka cluster

2. Enter the following messages

hello world
another message

Don’t exit the > prompt. You can exit the prompt by pressing Ctrl+C, but don’t do it until instructed.

3. Keep the Terminal window open
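As a side note, the console producer also accepts piped input, so the two messages above could be sent without the interactive prompt. A minimal sketch, assuming the broker from the previous part is listening on localhost:9092 and you run it from the Kafka directory:

```shell
# Send the same two messages non-interactively by piping stdin
# into the console producer (no > prompt appears).
printf 'hello world\nanother message\n' | \
  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage
```

This is convenient for scripting, but for this lab keep using the interactive prompt as instructed.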

Read messages from the Kafka cluster

Kafka also has a command line consumer that will dump out messages to standard output.

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter Consumer in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. In the Terminal window, enter the following command to read the messages

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic webage --from-beginning

6. Notice it shows the following messages

hello world
another message

7. Switch back to the Producer Terminal window and enter more messages.

Notice the newly entered messages automatically show up in the Consumer Terminal window

8. Switch to the Producer Terminal window and press Ctrl+C to stop entering more messages. Keep the Terminal window open.

9. Switch to the Consumer Terminal window and press Ctrl+C to stop reading messages. Keep the Terminal window open.

Setting up a multi-broker cluster

So far we have been running against a single broker. A single broker is just a cluster of size one. In this part, you will expand the cluster to three nodes by configuring two additional brokers.

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter Kafka Server 1 in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Open server.properties in text editor

gedit config/server.properties

Note: You can use any text editor of your choice, e.g. vi, nano

server.properties is the configuration file used by the Kafka server. Notice there’s a property broker.id=0. 0 is the ID of the first Kafka server, which you started earlier in this lab.

6. Press Ctrl+Q to exit to the Terminal window

7. Execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0, which means there's one broker

8. Create two copies of server.properties.

cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties

Each copy will be used by a separate Kafka server/node

9. Edit server-1.properties

gedit config/server-1.properties

10. Locate broker.id=0 and change it to broker.id=1

Each node must have a unique id

11. Locate #listeners=PLAINTEXT://:9092, remove the comment symbol (#), and change the port number to 9093

All nodes will be running on the same machine, therefore a unique port has to be used.

12. Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-1

All nodes will be running on the same machine, therefore a unique log directory has to be used.

13. Press Ctrl+S to save the file

14. Press Ctrl+Q to exit to the Terminal window

15. Edit server-2.properties

gedit config/server-2.properties

16. Locate broker.id=0 and change it to broker.id=2

Each node must have a unique id

17. Locate #listeners=PLAINTEXT://:9092, remove the comment symbol (#), and change the port number to 9094

All nodes will be running on the same machine, therefore a unique port has to be used.

18. Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-2

All nodes will be running on the same machine, therefore a unique log directory has to be used.

19. Press Ctrl+S to save the file

20. Press Ctrl+Q to exit to the Terminal window
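Taken together, the edits in steps 9–20 amount to the following property overrides; everything else in the copied files stays at its default (a sketch of the changed lines only, not the full files):

```
# config/server-1.properties (changed lines only)
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1

# config/server-2.properties (changed lines only)
broker.id=2
listeners=PLAINTEXT://:9094
log.dirs=/tmp/kafka-logs-2
```

Each broker needs a unique ID, listener port, and log directory because all three run on the same machine.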

Start up a multi-broker cluster and verify the cluster

In this part, you will start up the nodes you configured in the previous part and verify the cluster has 3 nodes

1. In the Kafka Server 1 Terminal window, run the following command to start up the second node

bin/kafka-server-start.sh config/server-1.properties

2. Open a Terminal window by clicking Application > Terminal

3. In the menu bar, click Terminal > Set Title…

4. Enter Kafka Server 2 in the Title field and click OK

5. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

6. Execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0 and 1, which means there are 2 nodes

7. In the Kafka Server 2 Terminal window, run the following command to start up the third node

bin/kafka-server-start.sh config/server-2.properties

8. Switch to the Producer Terminal window and execute the following command to find the number of nodes/brokers in the Kafka cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0, 1, and 2, which means there are 3 nodes

Create a new topic on the multi-broker cluster

In this part, you will create a new topic with a replication factor of 3, which means it will be replicated across the three nodes you configured and started in the previous parts of this lab.

1. In the Producer Terminal window, execute the following command to create a new topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic webage-replicated-topic

2. Verify the new topic is created

bin/kafka-topics.sh --list --zookeeper localhost:2181

Notice webage-replicated-topic shows up in addition to the webage topic

3. Get webage topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage

Notice it shows ReplicationFactor 1 (number of replicas) and Replicas 0 (node ID)

4. Get webage-replicated-topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage-replicated-topic

Notice it shows ReplicationFactor 3 (number of replicas) and Replicas 0, 1, 2 (node IDs). Leader is 0, which means the Kafka server with broker.id=0 is acting as the “master” or lead node.

“leader” is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.

“replicas” is the list of nodes that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.

“isr” is the set of “in-sync” replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.
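For reference, the describe output for the replicated topic typically looks roughly like the following (the leader and isr values will reflect your cluster's current state):

```
Topic:webage-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: webage-replicated-topic  Partition: 0  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2
```

With all three brokers healthy, the isr list matches the replicas list; if a broker falls behind or dies, it drops out of isr while remaining in replicas.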

Send messages to the new replicated topic

In this part, you will send more messages to the new topic which is utilizing the multi-broker cluster setup

1. In the Producer Terminal window, enter the following command to send messages

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage-replicated-topic

2. At the Producer prompt, enter the following messages:

message 1

message 2

3. Remain at the Producer prompt (>) and keep the Terminal window open.

Read messages from the new replicated topic

In this part, you will read messages from the new replicated topic

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter Consumer in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Execute the following command to read messages from the replicated topic which you created in the previous parts of the lab

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic webage-replicated-topic

Notice it shows message 1 and message 2

Test Cluster Fault Tolerance

In this part, you will test the multi-broker cluster's fault tolerance by terminating one of the nodes/Kafka server instances

1. Open a Terminal window by clicking Application > Terminal

2. In the menu bar, click Terminal > Set Title…

3. Enter FaultTolerance in the Title field and click OK

4. Switch to the Kafka directory

cd /software/kafka_2.11-1.1.0

5. Find process ID of Kafka Server 1 instance (broker.id = 1) by running the following command in the Terminal window

ps aux | grep server-1.properties

6. Scroll up and locate the process ID in the second column of the first line immediately below the command you executed in the previous step (your process ID will be different)

7. Kill the process

kill 53749

Note: use your process id instead of 53749

8. Verify there’s one less node in the cluster

bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"

Notice it shows 0 and 2. Kafka Server 1 process has been terminated.

You can also verify this by switching to the Kafka Server 1 Terminal window. You will notice it has exited to the prompt.

9. Get webage topic details to see the node(s) handling the topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage

Notice it still shows Replicas: 0,1,2 (originally, there were 3 nodes), but Isr (in-sync replicas) shows 0, 2

10. Switch to the Producer Terminal window and enter another message

still working

11. Switch to the Consumer Terminal window and verify it shows “still working” message.
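The find-and-kill sequence in steps 5–7 can also be collapsed into a single sketch. The bracketed grep pattern is a common trick that keeps the grep process itself out of the match, so no manual scrolling is needed (the variable name PID is just illustrative):

```shell
# Find the PID of the broker started with server-1.properties and kill it.
PID=$(ps aux | grep '[s]erver-1.properties' | awk '{print $2}' | head -n 1)
if [ -n "$PID" ]; then
  kill "$PID"    # same effect as steps 5-7 above
fi
```

In this lab, stick to the manual steps so you can see the full ps output before terminating the broker.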

Use Kafka Connect to import/export data

Writing data from the console and writing it back to the console is a convenient place to start, but you’ll probably want to use data from other sources or export data from Kafka to other systems. For many systems, instead of writing custom integration code you can use Kafka Connect to import or export data.

Kafka Connect is a tool included with Kafka that imports and exports data to Kafka. It is an extensible tool that runs connectors, which implement the custom logic for interacting with an external system. In this part, you’ll see how to run Kafka Connect with simple connectors that import data from a file to a Kafka topic and export data from a Kafka topic to a file.

1. Switch to the Producer Terminal window and press Ctrl+C to exit to the terminal.

2. Create a text file with some seed data

echo -e "Hello\nworld!" > webage.txt

3. Verify you have the text file with data

more webage.txt

4. Notice it shows the following data

Hello
world!

5. Edit Kafka Connect source configuration

gedit config/connect-file-source.properties

The source file configuration is used to read the messages from some existing text file into a Kafka topic.

Notice the default file configured is named test.txt.

6. Change test.txt to webage.txt

7. Change topic from connect-test to webage-replicated-topic

8. Press Ctrl+S to save the file

9. Press Ctrl+Q to exit to the Terminal window

10. Edit Kafka Connect file sink configuration

gedit config/connect-file-sink.properties

The sink file configuration is used to read data from a source Kafka topic and write the data into a text file on the filesystem. Notice the default topic configured is connect-test. Change it to webage-replicated-topic, then press Ctrl+S to save the file and Ctrl+Q to exit to the Terminal window

11. Start Kafka Connect

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties

12. Switch to the Consumer Terminal window and notice two new rows are available as shown below:

{"schema":{"type":"string","optional":false},"payload":"Hello"}
{"schema":{"type":"string","optional":false},"payload":"world!"}
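After the source and sink edits, the two connector files look roughly like the following sketch (the property names are Kafka's shipped defaults; only the file and topic/topics values were changed for this lab):

```
# config/connect-file-source.properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=webage.txt
topic=webage-replicated-topic

# config/connect-file-sink.properties
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=webage-replicated-topic
```

The source connector tails webage.txt into the topic, and the sink connector drains the same topic into its output file.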

Cleanup

In this part you will close the Terminal windows which aren’t required.

Close the following Terminal windows. Press Ctrl+C in Terminal windows where any process is running

FaultTolerance
Producer

Note: Leave ZooKeeper, Kafka servers, and Consumer running. You will utilize the consumer in the next lab.

Review

In this lab, you explored Apache Kafka basics. You used the out-of-the-box command-line tools provided by Kafka to start ZooKeeper and Kafka servers. You also created Kafka topics and used the console producer and consumer to write and read messages. You also explored single-broker and multi-broker clusters. Finally, you used Kafka Connect to import messages into a topic and export messages from the Kafka cluster.
