Nemanja Tomic

Provisioning a virtual machine with three running Kafka brokers was quite easy last time. Now let’s kick it up a notch. Instead of provisioning only one VM, let’s provision three and turn all three into Kafka controllers. This results in a highly available Kafka cluster, where each controller runs on its own VM, independently of the others.

We will use the same tools as last time: Ansible and Vagrant. However, the approach for provisioning the VMs will be different this time, as each VM needs a unique set of variables for the configuration of Kafka. The VMs also need to connect to each other via a subnet. This private network is necessary for the communication between the Kafka nodes, but it adds complexity.

Additionally, the provisioning should be as clear-cut as possible. Simplicity is key in every part of software engineering. While it is possible to run Ansible from within Vagrant, it comes at the cost of flexibility. Vagrant is not meant for running Ansible, and the inventory file in Ansible offers a tremendous amount of flexibility when provisioning the VMs.

Separating Ansible and Vagrant

The best way to utilize the benefits of both tools is to keep things separate.

Ansible and Vagrant are both powerful tools, but using all features of Ansible inside Vagrant is not possible. Using Ansible through the Vagrantfile limits your flexibility. That flexibility is necessary for readability and quality of code, especially as projects get larger. The more hosts you add to the environment, the more important the inventory file in Ansible becomes. Doing this in Vagrant gets messy quickly.

Instead, we will create an Ansible inventory file where we specify the IP addresses, the SSH keys, and the path to the configuration file for Kafka. Then we can use the ansible-playbook and ansible-navigator CLI tools as we see fit.

Actually, there is another reason why this is the only way I would provision multiple VMs. When you use the Ansible CLI commands, the provisioning takes place on all VMs at the same time. With Vagrant, however, the provisioning occurs sequentially, one VM after another, resulting in significant delays in a multi-VM setup. Surely there is some flag you can set to fix this, but I haven’t found it. Besides, using Ansible in the CLI is one of my favorite things to do right now.

Anyway, Let’s Get Going with Our VMs

The Vagrantfile is a bit more complex than before, but it’s still manageable. Only nine lines of code are necessary to provision the VMs! Since the Ansible CLI will take care of the provisioning, there is no Ansible playbook mentioned this time in the Vagrantfile.

Vagrant.configure("2") do |config|
  # Define the base box for the VM
  config.vm.box = "debian-13"

  # Loop over the creation of three VMs
  (1..3).each do |i|
    # Use interpolation for the name of the VMs
    config.vm.define("node-#{i}") do |node|
      # Configure a private IP address for each VM
      ip_addr = "192.168.56.#{10 + i}"
      node.vm.network("private_network", ip: ip_addr)
    end
  end
end

And… that’s it. Yeah, it is that easy to create VMs nowadays. Ansible does have a few more lines of code, but in essence, it is the same structure over and over again. The real challenge lies in knowing how to configure the Kafka cluster. I won’t include the Kafka configuration files and commands; however, you can find everything in the GitHub repository if you would like to take a look.
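To give a rough idea of what the inventory for these three VMs could look like, here is a minimal sketch in Ruby that generates one from the same naming and IP scheme as the Vagrantfile. It is not the inventory from the repository. The key path assumes the VirtualBox provider, and the variables kafka_config_path and kafka_node_id, as well as the config/node-N.properties paths, are hypothetical names I made up for illustration.

```ruby
require "yaml"

# Number of VMs; must match the loop in the Vagrantfile.
NODE_COUNT = 3

# Build an Ansible inventory as a plain Ruby hash.
def build_inventory(count)
  hosts = (1..count).map do |i|
    name = "node-#{i}"
    vars = {
      # Same address scheme as the Vagrantfile: 192.168.56.11, .12, .13
      "ansible_host" => "192.168.56.#{10 + i}",
      "ansible_user" => "vagrant",
      # Assumption: VirtualBox provider; Vagrant keeps one key per machine here.
      "ansible_ssh_private_key_file" =>
        ".vagrant/machines/#{name}/virtualbox/private_key",
      # Hypothetical variables: per-node Kafka settings for the playbook.
      "kafka_config_path" => "config/#{name}.properties",
      "kafka_node_id" => i
    }
    [name, vars]
  end.to_h

  { "all" => { "hosts" => hosts } }
end

# Print the inventory as YAML so it can be redirected into a file.
puts build_inventory(NODE_COUNT).to_yaml
```

If you save this as, say, make_inventory.rb, you could run ruby make_inventory.rb > inventory.yml and then pass the result to Ansible with ansible-playbook -i inventory.yml playbook.yml, where playbook.yml stands in for whatever playbook you use. Generating the file is optional; writing the same YAML by hand works just as well for three hosts.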
