In this lab, you will explore Apache Kafka basics. You will use the out-of-the-box (OOB) command-line tools provided by Kafka to start up ZooKeeper and Kafka. Then you will create a Kafka topic and use the console producer and consumer to write messages to the queue and read them back. You will also explore single-broker and multi-broker clusters. Finally, you will explore how to import messages into the queue and export messages from the queue.
Start ZooKeeper
Kafka uses ZooKeeper, so you need to start a ZooKeeper server first. In this part, you will start a single-node ZooKeeper instance.
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter ZooKeeper in the Title field and click OK
Note: You will end up using a lot of Terminal windows. Setting the Title of each Terminal window will make it easier to locate the right window.
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. Start ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties
6. Keep the Terminal window open
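Note: For reference, the config/zookeeper.properties file you just used is minimal; in Kafka 1.1.0 it typically contains only these settings (comments omitted):
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
dataDir is where ZooKeeper stores its snapshots, clientPort (2181) is the port Kafka and the command-line tools connect to, and maxClientCnxns=0 disables the per-client connection limit.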
Start Kafka Server
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter Kafka Server in the Title field and click OK
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. Start Kafka server
bin/kafka-server-start.sh config/server.properties
6. Keep the Terminal window open
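Note: For reference, the key settings in the shipped config/server.properties typically look like this (comments omitted); you will revisit three of them when you build a multi-broker cluster later in this lab:
broker.id=0
#listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
broker.id uniquely identifies the broker, the commented-out listeners line means the broker uses the default port 9092, log.dirs is where message data is stored, and zookeeper.connect points at the ZooKeeper instance you started in the previous part.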
Create a Kafka Topic
Kafka maintains feeds of messages in categories called topics. In this part, you will create a topic named “webage” with a single partition and only one replica.
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter Producer in the Title field and click OK
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. Create a Kafka topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic webage
6. Get a list of Kafka topics
bin/kafka-topics.sh --list --zookeeper localhost:2181
Notice it shows webage
Send some messages
Kafka comes with a command-line client that will take input from a file or from standard input and send it out as messages to the Kafka cluster.
1. In the Producer Terminal window, which you opened earlier in this lab, execute the following command to send some messages to the Kafka cluster
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage
Notice it shows a prompt (>) where you can start entering messages, which will be sent to the Kafka cluster
2. Enter the following messages
hello world
another message
Don't exit the > prompt. You can exit the prompt by pressing Ctrl+C, but don't do it until instructed.
3. Keep the Terminal window open
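Note: As mentioned above, the console producer can also take its input from a file instead of standard input. For example, assuming a file named messages.txt (a hypothetical file) exists in the current directory, the following command would send each of its lines as a separate message to the webage topic and then exit:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage < messages.txt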
Read messages from the Kafka cluster
Kafka also has a command line consumer that will dump out messages to standard output.
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter Consumer in the Title field and click OK
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. In the Terminal window, enter the following command to read the messages
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic webage --from-beginning
6. Notice it shows the following messages
hello world
another message
7. Switch back to the Producer Terminal window and enter more messages.
Notice the newly entered messages automatically show up in the Consumer Terminal window
8. Switch to the Producer Terminal window and press Ctrl+C to stop entering more messages. Keep the Terminal window open.
9. Switch to the Consumer Terminal window and press Ctrl+C to stop reading messages. Keep the Terminal window open.
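Note: Because the console consumer simply dumps messages to standard output, you can redirect its output like any other command. As a sketch, the following would read the first two messages from the webage topic into a file named webage-dump.txt (a hypothetical file name) and then exit; --max-messages tells the consumer to stop after that many messages:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic webage --from-beginning --max-messages 2 > webage-dump.txt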
Set up a multi-broker cluster
So far we have been running against a single broker. A single broker is just a cluster of size one. In this part, you will expand the cluster to three nodes by creating two additional nodes.
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter Kafka Server 1 in the Title field and click OK
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. Open server.properties in text editor
gedit config/server.properties
Note: You can use any text editor of your choice, e.g. vi, nano
server.properties is the configuration file used by the Kafka server. Notice there's a property broker.id=0. 0 is the ID of the first Kafka server, which you started earlier in this lab.
6. Press Ctrl+Q to exit to the Terminal window
7. Execute the following command to find the number of nodes/brokers in the Kafka cluster
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0, which means there's one broker
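Note: The <<< operator is a Bash here-string: it feeds the quoted text ("ls /brokers/ids") to zookeeper-shell.sh as its standard input, so the shell runs that single command and exits. An equivalent form using a pipe is:
echo "ls /brokers/ids" | bin/zookeeper-shell.sh localhost:2181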
8. Create two copies of server.properties
cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties
Each copy will be used by a separate Kafka server/node
9. Edit server-1.properties
gedit config/server-1.properties
10. Locate broker.id=0 and change it to broker.id=1
Each node must have a unique id
11. Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9093
All nodes will be running on the same machine, therefore a unique port has to be used.
12. Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-1
All nodes will be running on the same machine, therefore a unique log directory has to be used.
13. Press Ctrl+S to save the file
14. Press Ctrl+Q to exit to the Terminal window
15. Edit server-2.properties
gedit config/server-2.properties
16. Locate broker.id=0 and change it to broker.id=2
Each node must have a unique id
17. Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9094
All nodes will be running on the same machine, therefore a unique port has to be used.
18. Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-2
All nodes will be running on the same machine, therefore a unique log directory has to be used.
19. Press Ctrl+S to save the file
20. Press Ctrl+Q to exit to the Terminal window
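Note: If you prefer to script these edits instead of using a text editor, the same three changes can be made with sed. This is a sketch; it assumes both copies still contain the unmodified defaults, so verify the resulting files (for example with more) before starting the brokers:
# Broker 1: unique ID, port 9093, separate log directory
sed -i -e 's/^broker.id=0/broker.id=1/' -e 's|^#listeners=PLAINTEXT://:9092|listeners=PLAINTEXT://:9093|' -e 's|^log.dirs=/tmp/kafka-logs|log.dirs=/tmp/kafka-logs-1|' config/server-1.properties
# Broker 2: unique ID, port 9094, separate log directory
sed -i -e 's/^broker.id=0/broker.id=2/' -e 's|^#listeners=PLAINTEXT://:9092|listeners=PLAINTEXT://:9094|' -e 's|^log.dirs=/tmp/kafka-logs|log.dirs=/tmp/kafka-logs-2|' config/server-2.properties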
Start up a multi-broker cluster and verify the cluster
In this part, you will start up the nodes you configured in the previous part and verify the cluster has 3 nodes
1. In the Kafka Server 1 Terminal window, run the following command to start up the second node
bin/kafka-server-start.sh config/server-1.properties
2. Open Terminal window by clicking Application > Terminal
3. In menu bar, click Terminal > Set Title…
4. Enter Kafka Server 2 in the Title field and click OK
5. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
6. Execute the following command to find the number of nodes/brokers in the Kafka cluster
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0 and 1, which means there are two nodes
7. In the Kafka Server 2 Terminal window, run the following command to start up the third node
bin/kafka-server-start.sh config/server-2.properties
8. Switch to the Producer Terminal window and execute the following command to find the number of nodes/brokers in the Kafka cluster
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0, 1, and 2, which means there are three nodes
Create a new topic on the multi-broker cluster
In this part, you will create a new topic with a replication factor of 3, which means it will utilize the three nodes you configured and started in the previous parts of this lab.
1. In the Producer Terminal window, execute the following command to create a new topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic webage-replicated-topic
2. Verify the new topic is created
bin/kafka-topics.sh --list --zookeeper localhost:2181
Notice webage-replicated-topic shows up in addition to the webage topic
3. Get webage topic details to see the node(s) handling the topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage
Notice it shows ReplicationFactor 1 (number of nodes) and Replicas 0 (node ID)
4. Get webage-replicated-topic details to see the node(s) handling the topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic
webage-replicated-topic
Notice it shows ReplicationFactor 3 (number of nodes) and Replicas 0, 1, 2 (node IDs). Leader is 0, which means the Kafka server with broker.id=0 is acting as the “master” or lead node.
“leader” is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
“replicas” is the list of nodes that replicate the log for this partition regardless of whether they are the leader or even if they are currently alive.
“isr” is the set of “in-sync” replicas. This is the subset of the replicas list that is currently alive and caught-up to the leader.
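For example, the --describe output for the replicated topic should look similar to this (field order and spacing may vary slightly by Kafka version):
Topic:webage-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
  Topic: webage-replicated-topic  Partition: 0  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2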
Send messages to the new replicated topic
In this part, you will send more messages to the new topic which is utilizing the multi-broker cluster setup
1. In the Producer Terminal window, enter the following command to send messages
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage-replicated-topic
2. At the Producer prompt, enter the following messages:
message 1
message 2
3. Remain at the Producer prompt (>) and keep the Terminal window open.
Read messages from the new replicated topic
In this part, you will read messages from the new replicated topic
1. Switch to the Consumer Terminal window, which you opened earlier in this lab
2. Execute the following command to read messages from the replicated topic which you created in the previous parts of the lab
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic webage-replicated-topic
Notice it shows message 1 and message 2
Test Cluster Fault Tolerance
In this part, you will test the multi-broker cluster's fault tolerance by terminating one of the nodes/Kafka server instances.
1. Open Terminal window by clicking Application > Terminal
2. In menu bar, click Terminal > Set Title…
3. Enter FaultTolerance in the Title field and click OK
4. Switch to the Kafka directory
cd /software/kafka_2.11-1.1.0
5. Find process ID of Kafka Server 1 instance (broker.id = 1) by running the following command in the Terminal window
ps aux | grep server-1.properties
6. Scroll up and locate the process ID listed in the second column of the first line of output, immediately below the command you executed in the previous step (your process ID will be different)
7. Kill the process
kill 53749
Note: use your process ID instead of 53749
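Note: Alternatively, steps 5 through 7 can be combined into a single command. This sketch uses pgrep -f, which matches against the full command line; it assumes exactly one running process matches server-1.properties:
kill $(pgrep -f server-1.properties)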
8. Verify there’s one less node in the cluster
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0 and 2. Kafka Server 1 process has been terminated.
You can also verify this by switching to the Kafka Server 1 Terminal window. You will notice it has exited to the prompt.
9. Get webage-replicated-topic details to see the node(s) handling the topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage-replicated-topic
Notice it still shows Replicas: 0,1,2 (originally, there were three nodes), but Isr (in-sync replicas) shows only 0 and 2
10. Switch to the Producer Terminal window and enter another message
still working
11. Switch to the Consumer Terminal window and verify it shows the “still working” message.
Use Kafka Connect to import/export data
Writing data from the console and writing it back to the console is a convenient place to start, but you’ll probably want to use data from other sources or export data from Kafka to other systems. For many systems, instead of writing custom integration code you can use Kafka Connect to import or export data.
Kafka Connect is a tool included with Kafka that imports and exports data to Kafka. It is an extensible tool that runs connectors, which implement the custom logic for interacting with an external system. In this part, you’ll see how to run Kafka Connect with simple connectors that import data from a file to a Kafka topic and export data from a Kafka topic to a file.
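For reference, the shipped config/connect-file-source.properties typically contains the following settings, which you will adjust in the next steps:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
connector.class selects the built-in FileStreamSource connector, file is the input file it reads, and topic is the Kafka topic it writes each line to.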
1. Switch to the Producer Terminal window and press Ctrl+C to exit to the terminal.
2. Create a text file with some seed data
echo -e "Hello\nworld!" > webage.txt
3. Verify you have the text file with data
more webage.txt
4. Notice it shows the following data
Hello
world!
5. Edit Kafka Connect source configuration
gedit config/connect-file-source.properties
The source file configuration is used to read messages from an existing text file into a Kafka topic.
Notice the default file configured is named test.txt.
6. Change test.txt to webage.txt
7. Change topic from connect-test to webage-replicated-topic
8. Press Ctrl+S to save the file
9. Press Ctrl+Q to exit to the Terminal window
10. Edit Kafka Connect file sink configuration
gedit config/connect-file-sink.properties
The sink file configuration is used to read data from a source Kafka topic and write it into a text file on the filesystem. Notice the default output file is test.sink.txt and the default topic is connect-test (see the reference excerpt below). Change topics from connect-test to webage-replicated-topic, press Ctrl+S to save the file, and press Ctrl+Q to exit to the Terminal window.
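For reference, the shipped config/connect-file-sink.properties typically contains the following settings:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test
Note the sink property is topics (plural), because a sink connector can consume from multiple topics, while the source property is topic (singular). file here is the output file the sink writes to.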
11. Start Kafka Connect
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
12. Switch to the Consumer Terminal window and notice two new rows are available as shown below:
{"schema":{"type":"string","optional":false},"payload":"hello
"}
{"schema":{"type":"string","optional":false},"payload":"world!
"}
Cleanup
In this part you will close the Terminal windows which aren’t required.
1. Close the following Terminal windows. Press Ctrl+C in any Terminal window where a process is still running
FaultTolerance
Producer
Note: Leave ZooKeeper, Kafka servers, and Consumer running. You will utilize the consumer in the next lab.
Review
In this lab, you explored Apache Kafka basics. You used the out-of-the-box (OOB) command-line tools provided by Kafka to start up ZooKeeper and Kafka. You also created a Kafka topic and used the console producer and consumer to write messages to the queue and read them back. You also explored single-broker and multi-broker clusters. Finally, you explored how to import messages into the queue and export messages from the Kafka cluster.