Kafka Basics

In this lab, you will explore the basics of Apache Kafka. You will use the out-of-the-box (OOB) command-line tools provided by Kafka to start ZooKeeper and a Kafka server. You will then create a Kafka topic and use the console producer and consumer to write messages to the topic and read them back. You will also explore single-broker and multi-broker clusters. Finally, you will explore how to import messages into a topic and export messages from a topic.

Part 1 – Start ZooKeeper

Kafka uses ZooKeeper, so you first need to start a ZooKeeper server. In this part, you will start a single-node ZooKeeper instance.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter ZooKeeper in the Title field and click OK
Note: You will end up using a lot of Terminal windows. Setting the title of each Terminal window will make it easier to locate the right window.
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Start ZooKeeper
  bin/zookeeper-server-start.sh config/zookeeper.properties
Keep the Terminal window open.

Part 2 – Start Kafka Server

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Kafka Server in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Start the Kafka server
  bin/kafka-server-start.sh config/server.properties
Keep the Terminal window open.

Part 3 – Create a Kafka Topic

Kafka maintains feeds of messages in categories called topics. In this part, you will create a topic named "webage" with a single partition and only one replica.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Producer in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Create a Kafka topic
  bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic webage
Get a list of Kafka topics
  bin/kafka-topics.sh --list --zookeeper localhost:2181
Notice it shows webage.

Part 4 – Send some messages

Kafka comes with a command-line client that takes input from a file or from standard input and sends it out as messages to the Kafka cluster.

In the Producer Terminal window, which you opened in Part 3 of this lab, execute the following command to send some messages to the Kafka cluster
  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage
Notice it shows a > prompt where you can start entering messages, which will be sent to the Kafka cluster.
Enter the following messages
  hello world
  another message
Don't exit the > prompt. You can exit the prompt by pressing Ctrl+C, but don't do it until instructed.
Keep the Terminal window open.

Part 5 – Read messages from the Kafka cluster

Kafka also has a command-line consumer that dumps messages to standard output.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Consumer in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
In the Terminal window, enter the following command to read the messages from the webage topic
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic webage --from-beginning
Notice it shows the following messages
  hello world
  another message
Switch back to the Producer Terminal window and enter more messages. Notice the newly entered messages automatically show up in the Consumer Terminal window.
Switch to the Producer Terminal window and press Ctrl+C to stop entering messages. Keep the Terminal window open.
Switch to the Consumer Terminal window and press Ctrl+C to stop reading messages. Keep the Terminal window open.
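Tip: because the console producer reads from standard input, the same messages can also be sent non-interactively, which is handy for scripting. A minimal sketch, using the same broker and topic as above and run from the Kafka directory:

  # Pipe messages into the console producer instead of typing them at the > prompt
  printf 'hello world\nanother message\n' | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage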
Part 6 – Setting up a multi-broker cluster

So far we have been running against a single broker. A single broker is just a cluster of size one. In this part, you will expand the cluster to three nodes by creating two additional nodes.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Kafka Server 1 in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Open server.properties in a text editor
  gedit config/server.properties
Note: You can use any text editor of your choice, e.g. vi or nano.
server.properties is the configuration file used by the Kafka server. Notice there is a property broker.id=0. 0 is the ID of the first Kafka server, which you started earlier in this lab.
Press Ctrl+Q to exit to the Terminal window
Execute the following command to find the number of nodes/brokers in the Kafka cluster
  bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0, which means there is one broker.
Create two copies of server.properties. Each copy will be used by a separate Kafka server/node.
  cp config/server.properties config/server-1.properties
  cp config/server.properties config/server-2.properties
Edit server-1.properties
  gedit config/server-1.properties
Locate broker.id=0 and change it to broker.id=1. Each node must have a unique ID.
Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9093. All nodes will be running on the same machine, so each must listen on a unique port.
Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-1. All nodes will be running on the same machine, so each must use a unique log directory.
Press Ctrl+S to save the file
Press Ctrl+Q to exit to the Terminal window
Edit server-2.properties
  gedit config/server-2.properties
Locate broker.id=0 and change it to broker.id=2
Locate #listeners=PLAINTEXT://:9092, remove the comment (#) symbol, and change the port number to 9094
Locate log.dirs=/tmp/kafka-logs and change it to log.dirs=/tmp/kafka-logs-2
Press Ctrl+S to save the file
Press Ctrl+Q to exit to the Terminal window
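If you prefer to script the edits above rather than make them in gedit, the following sketch produces the same two files. It assumes GNU sed and the default property values shown above; run it from the Kafka directory:

  # Create server-1.properties and server-2.properties and patch the three properties
  for i in 1 2; do
    cp config/server.properties config/server-$i.properties
    sed -i "s/^broker.id=0/broker.id=$i/" config/server-$i.properties
    sed -i "s|^#listeners=PLAINTEXT://:9092|listeners=PLAINTEXT://:909$((2 + i))|" config/server-$i.properties
    sed -i "s|^log.dirs=/tmp/kafka-logs|log.dirs=/tmp/kafka-logs-$i|" config/server-$i.properties
  done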
Part 7 – Start up a multi-broker cluster and verify the cluster

In this part, you will start the nodes you configured in the previous part and verify the cluster has three nodes.

In the Kafka Server 1 Terminal window, run the following command to start the second node
  bin/kafka-server-start.sh config/server-1.properties
Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Kafka Server 2 in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Execute the following command to find the number of nodes/brokers in the Kafka cluster
  bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0 and 1, which means there are two nodes.
In the Kafka Server 2 Terminal window, run the following command to start the third node
  bin/kafka-server-start.sh config/server-2.properties
Switch to the Producer Terminal window and execute the following command to find the number of nodes/brokers in the Kafka cluster
  bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0, 1, and 2, which means there are three nodes.

Part 8 – Create a new topic on the multi-broker cluster

In this part, you will create a new topic with a replication factor of 3, which means it will utilize the three nodes you configured and started in the previous parts of this lab.

In the Producer Terminal window, execute the following command to create a new topic
  bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic webage-replicated-topic
Verify the new topic is created
  bin/kafka-topics.sh --list --zookeeper localhost:2181
Notice webage-replicated-topic shows up in addition to the webage topic.
Get the webage topic details to see the node(s) handling the topic
  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage
Notice it shows ReplicationFactor: 1 (number of nodes) and Replicas: 0 (node ID).
Get the webage-replicated-topic details to see the node(s) handling the topic (sample output is shown after this part)
  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage-replicated-topic
Notice it shows ReplicationFactor: 3 (number of nodes) and Replicas: 0,1,2 (node IDs). Here the Leader is 0, which means the Kafka server with broker.id=0 is acting as the "master" or lead node for this partition.
"leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
"replicas" is the list of nodes that replicate the log for this partition, regardless of whether they are the leader or even whether they are currently alive.
"isr" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught up to the leader.
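For reference, the describe command above typically prints output similar to the following. Since the cluster assigns leadership, your Leader value and the ordering of Replicas may differ:

  Topic:webage-replicated-topic   PartitionCount:1        ReplicationFactor:3     Configs:
          Topic: webage-replicated-topic  Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 0,1,2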
Part 9 – Send messages to the new replicated topic

In this part, you will send more messages to the new topic, which utilizes the multi-broker cluster you set up.

In the Producer Terminal window, enter the following command to send messages
  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic webage-replicated-topic
At the producer prompt, enter the following messages:
  message 1
  message 2
Remain at the producer prompt (>) and keep the Terminal window open.

Part 10 – Read messages from the new replicated topic

In this part, you will read messages from the new replicated topic.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter Consumer in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Execute the following command to read messages from the replicated topic which you created in the previous parts of the lab
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic webage-replicated-topic
Notice it shows message 1 and message 2.

Part 11 – Test Cluster Fault Tolerance

In this part, you will test the multi-broker cluster's fault tolerance by terminating one of the nodes/Kafka server instances.

Open a Terminal window by clicking Application > Terminal
In the menu bar, click Terminal > Set Title…
Enter FaultTolerance in the Title field and click OK
Switch to the Kafka directory
  cd /software/kafka_2.11-1.1.0
Find the process ID of the Kafka Server 1 instance (broker.id=1) by running the following command in the Terminal window
  ps aux | grep server-1.properties
Locate the process ID in the second column of the first line of output, immediately below the command you executed in the previous step. Your process ID will be different from the one used below.
Kill the process
  kill 53749
Note: Use your process ID instead of 53749.
Verify there is one less node in the cluster
  bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
Notice it shows 0 and 2. The Kafka Server 1 process has been terminated. You can also verify this by switching to the Kafka Server 1 Terminal window; you will notice it has exited to the prompt.
Get the webage-replicated-topic details to see the node(s) handling the topic
  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic webage-replicated-topic
Notice it still shows Replicas: 0,1,2 (originally, there were three nodes), but Isr (in-sync replicas) shows 0,2.
Switch to the Producer Terminal window and enter another message
  still working
Switch to the Consumer Terminal window and verify it shows the "still working" message.

Part 12 – Use Kafka Connect to import/export data

Writing data from the console and writing it back out to the console is a convenient place to start, but you will probably want to use data from other sources or export data from Kafka to other systems. For many systems, instead of writing custom integration code, you can use Kafka Connect to import or export data. Kafka Connect is a tool included with Kafka that imports and exports data to and from Kafka. It is an extensible tool that runs connectors, which implement the custom logic for interacting with an external system. In this part, you will run Kafka Connect with simple connectors that import data from a file to a Kafka topic and export data from a Kafka topic to a file.

Switch to the Producer Terminal window and press Ctrl+C to exit to the terminal.
Create a text file with some seed data
  echo -e "Hello\nworld!" > webage.txt
Verify you have the text file with data
  more webage.txt
Notice it shows the following data
  Hello
  world!
Edit the Kafka Connect source configuration
  gedit config/connect-file-source.properties
The source file configuration is used to read messages from an existing text file into a Kafka topic. Notice the default file configured is named test.txt.
Change test.txt to webage.txt
Change the topic from connect-test to webage-replicated-topic
Press Ctrl+S to save the file
Press Ctrl+Q to exit to the Terminal window
Edit the Kafka Connect file sink configuration
  gedit config/connect-file-sink.properties
The sink file configuration is used to read data from a source Kafka topic and write it to a text file on the filesystem. Change the default output file test.sink.txt to webage.sink.txt, and change topics from connect-test to webage-replicated-topic so the sink reads from the topic the source connector writes to.
Press Ctrl+S to save the file
Press Ctrl+Q to exit to the Terminal window
After these edits, the two connector files should look similar to the sketch below.
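For reference, here is roughly what the two edited connector files should contain. The property names and their remaining default values come from the connect-file-source.properties and connect-file-sink.properties files shipped with Kafka; webage.sink.txt is simply the output file name chosen above:

  # config/connect-file-source.properties after the edits
  name=local-file-source
  connector.class=FileStreamSource
  tasks.max=1
  file=webage.txt
  topic=webage-replicated-topic

  # config/connect-file-sink.properties after the edits
  name=local-file-sink
  connector.class=FileStreamSink
  tasks.max=1
  file=webage.sink.txt
  topics=webage-replicated-topic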
Start Kafka Connect
  bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Switch to the Consumer Terminal window and notice two new rows are available, as shown below:
  {"schema":{"type":"string","optional":false},"payload":"Hello"}
  {"schema":{"type":"string","optional":false},"payload":"world!"}

Part 13 – Cleanup

In this part, you will close the Terminal windows which are no longer required.

Close the following Terminal windows. Press Ctrl+C in any Terminal window where a process is still running.
  FaultTolerance
  Producer
Note: Leave ZooKeeper, the Kafka servers, and the Consumer running. You will utilize the consumer in the next lab.

Review

In this lab, you explored the basics of Apache Kafka. You used the out-of-the-box command-line tools provided by Kafka to start ZooKeeper and Kafka. You created a Kafka topic and used the console producer and consumer to write messages to the topic and read them back. You also explored single-broker and multi-broker clusters, including how a replicated topic tolerates the loss of a broker. Finally, you explored how to import messages into a topic from a file and export messages from the Kafka cluster using Kafka Connect.