Deploying the ELK Stack the Right Way

Chapter 2

Prerequisites & Architecture

In this chapter
  • The Big Picture
  • Component Overview
  • Hardware Sizing
  • Network Requirements
  • Software Prerequisites
  • Pre-Flight Checklist
  • What to Have Ready
  • Running the Deployment

What you’ll accomplish: Understand the full deployment architecture, size your VMs correctly, and confirm all prerequisites are in place before touching a terminal.

The Big Picture

Here’s what we’re building. The ELK stack spans three hosts — all three run Elasticsearch for clustering, with Kibana and Logstash co-located on two of them:

                                    Elasticsearch Cluster
                               (transport TLS on port 9300)
                      ┌──────────────────┬──────────────────┐
                      │                  │                  │
               ┌──────┴──────┐   ┌──────┴──────┐   ┌──────┴──────┐
               │    es01     │   │    es02     │   │    es03     │
               │             │   │             │   │             │
               │Elasticsearch│   │Elasticsearch│   │Elasticsearch│
               │ (9200/9300) │   │ (9200/9300) │   │ (9200/9300) │
               │             │   │             │   │             │
               │ Kibana      │   │ Logstash    │   │             │
               │ (5601)      │   │ (5044/5514) │   │             │
               │             │   │ (9600 API)  │   │             │
               │ Apache httpd│   │             │   │             │
               │ (443 HTTPS) │   │             │   │             │
               └──────┬──────┘   └──────┬──────┘   └─────────────┘
                      │                  │
                HTTPS (443)         Beats (5044)
                      │            Syslog (5514)
                      │                  │
               ┌──────┴──────┐   ┌──────┴──────┐
               │  Browser    │   │ Log Sources │
               │  (you)      │   │ (VMs, apps, │
               │             │   │  network    │
               │             │   │  devices)   │
               └─────────────┘   └──────┬──────┘
                                 ┌──────┴──────┐
                                 │  Filebeat   │
                                 │  (fb01+)    │
                                 │             │
                                 │ Ships logs  │
                                 │ → Logstash  │
                                 │ (port 5044) │
                                 │             │
                                 │ Stats API   │
                                 │ (5066 local)│
                                 └─────────────┘

The key design decisions:

  • Three Elasticsearch nodes for quorum. Elasticsearch uses a majority-based consensus protocol for master election. With 3 nodes, you can lose 1 and still have quorum. With 2 nodes, losing 1 means no quorum and a read-only cluster. Three is the minimum for a real cluster.
  • Kibana co-locates with es01. Kibana is a lightweight Node.js application. It doesn’t need its own host. Apache sits in front on the same box, terminates SSL, and proxies to Kibana on localhost:5601.
  • Logstash co-locates with es02. Logstash is the most memory-hungry component (JVM-based, pipeline buffering). Putting it on its own ES node gives it dedicated resources without needing a fourth VM.
  • Apache terminates SSL for Kibana. Kibana doesn’t handle HTTPS directly in this setup. Apache on es01 listens on port 443, terminates TLS, and proxies to Kibana’s HTTP port. This avoids Java keystore complexity and makes certificate management straightforward.
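A minimal sketch of what that Apache virtual host could look like (the ServerName and certificate paths are placeholders; the playbooks generate the real config in /etc/httpd/conf.d/):

```apache
<VirtualHost *:443>
    ServerName kibana.example.com

    SSLEngine on
    SSLCertificateFile    /etc/pki/tls/certs/kibana.crt
    SSLCertificateKeyFile /etc/pki/tls/private/kibana.key

    # Terminate TLS here, forward plain HTTP to Kibana on loopback
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:5601/
    ProxyPassReverse / http://127.0.0.1:5601/
</VirtualHost>
```

Because Kibana listens only on localhost:5601, Apache is the single externally reachable entry point, and certificate rotation means touching Apache's config alone.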

Component Overview

| Component | What It Does | Config Location | Data Location |
|---|---|---|---|
| Elasticsearch | Distributed search engine — stores and indexes logs | /etc/elasticsearch/ | /opt/lib/elasticsearch/ |
| Kibana | Web UI for searching and visualizing logs | /etc/kibana/ | /opt/kibana/lib/ |
| Apache httpd | Reverse proxy with SSL termination for Kibana | /etc/httpd/conf.d/ | |
| Logstash | Log ingestion pipeline — receives, transforms, ships to ES | /etc/logstash/ | /opt/lib/logstash/ |
| Filebeat | Lightweight log shipper — collects host logs and ships to Logstash | /etc/filebeat/ | /var/log/filebeat/ |
| Java 17 | Runtime for Elasticsearch and Logstash (bundled with the packages) | | |

Hardware Sizing

Here’s what actually works for a home lab, based on real usage — not Elastic’s marketing recommendations:

| Resource | Minimum (per node) | Recommended | Notes |
|---|---|---|---|
| CPU | 2 cores | 4 cores | ES indexing is CPU-intensive. 2 cores work for light ingestion (<1 GB/day). |
| RAM | 4 GB | 8 GB | ES uses ~50% for JVM heap, the rest for filesystem cache. 4 GB is tight but workable. The Logstash node benefits most from 8 GB. |
| Disk | 20 GB | 50 GB | Depends entirely on retention. 20 GB handles ~30 days of a small home lab. ILM policies prevent runaway growth. |
| Swappiness | vm.swappiness=0 (minimized) | vm.swappiness=0 | The kernel avoids swapping unless memory is critically low. The swap partition still exists as a safety net — see Ch3 for details. |
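The swappiness setting can be applied up front like this (the sysctl.d filename is an arbitrary choice; Chapter 3 covers this in context):

```shell
# Apply immediately, without a reboot
sudo sysctl -w vm.swappiness=0

# Persist across reboots
echo 'vm.swappiness=0' | sudo tee /etc/sysctl.d/90-elk-swappiness.conf
```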

Pro tip: If you’re running on Proxmox, start with 2 cores and 4 GB per VM. You can hot-add CPU and memory later without downtime. Don’t over-provision upfront — you’ll waste resources on a cluster that’s mostly idle.

Network Requirements

Every Elasticsearch node talks to every other node. Kibana and Logstash need to reach Elasticsearch. Your browser needs to reach Kibana’s Apache proxy.

| Port | Protocol | Direction | Service | Who Needs Access |
|---|---|---|---|---|
| 9200 | TCP | ES nodes ↔ clients | Elasticsearch REST API | Kibana, Logstash, monitoring tools |
| 9300 | TCP | ES nodes ↔ ES nodes | Elasticsearch cluster transport | All ES nodes (master election, data replication) |
| 443 | TCP | Inbound | Apache HTTPS (Kibana proxy) | Your browser |
| 5601 | TCP | Loopback only | Kibana HTTP | Apache proxies to this — never expose it directly |
| 5044 | TCP | Inbound | Logstash Beats input | Filebeat/Metricbeat agents on your hosts |
| 5514 | TCP | Inbound | Logstash syslog input | Network devices, rsyslog forwarding |
| 9600 | TCP | All interfaces | Logstash monitoring API | Health checks, Prometheus exporters |
| 5066 | TCP | Loopback only | Filebeat stats API | Local health checks (curl http://127.0.0.1:5066/stats) |

Firewall strategy: The playbooks configure firewalld with source-restricted rich rules for Elasticsearch cluster traffic (ports 9200, 9300) — only the ES, Kibana, and Logstash nodes can reach these ports. Logstash ingestion ports (5044, 5514) use simple port opens without source restrictions, since Beats agents and syslog sources may connect from any host on your network. Port 443 (Apache) is also open to all clients. Filebeat’s stats API (port 5066) listens on loopback only and doesn’t require a firewall rule.
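As a sketch, this is the kind of firewalld command the playbooks template out (the IP is one of this guide's example addresses; the playbooks repeat the rich rule for each peer and each restricted port):

```shell
# Restrict an ES cluster port to a known peer (repeat per peer, per port 9200/9300)
sudo firewall-cmd --permanent \
  --add-rich-rule='rule family="ipv4" source address="192.168.1.62" port port="9300" protocol="tcp" accept'

# Ingestion and HTTPS ports are opened to any client
sudo firewall-cmd --permanent --add-port=5044/tcp --add-port=5514/tcp --add-service=https

sudo firewall-cmd --reload
```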

Software Prerequisites

On the ELK hosts (all three Rocky Linux 9 VMs):

| Requirement | Version | Notes |
|---|---|---|
| Rocky Linux | 9.x (minimal install) | Fresh install preferred |
| SELinux | Enforcing (default) | Don't disable it. The playbooks configure the required booleans. |
| Internet access | | Required for package downloads (Elastic repo) |

On your workstation (where you run the deployment playbooks):

| Requirement | Version | Notes |
|---|---|---|
| Ansible | 2.14+ (ansible-core) | For running the deployment playbooks |
| ansible.posix collection | 1.5+ | ansible-galaxy collection install -r requirements.yml |
| SSH access | | To all 3 hosts, with a sudo-capable user |

Pre-Flight Checklist

Run these on each of your three ELK hosts before starting Chapter 3. Every check should pass.

Confirm the OS:

grep PRETTY_NAME /etc/os-release

Expected: PRETTY_NAME="Rocky Linux 9.x (Blue Onyx)"

Confirm SELinux is enforcing:

getenforce

Expected: Enforcing

If this says Permissive or Disabled, fix it before proceeding. Edit /etc/selinux/config, set SELINUX=enforcing, and reboot.
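The edit can also be scripted as a one-liner (review /etc/selinux/config afterwards before rebooting):

```shell
sudo sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config
sudo reboot
```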

Confirm available RAM:

free -h | grep Mem

You need at least 4 GB total per node. The Logstash node benefits from 8 GB.

Confirm available disk:

df -h /opt

You need at least 20 GB free on /opt (or wherever you’ll store data).

Confirm nodes can reach each other:

# From es01, test connectivity to es02 and es03
ping -c 1 192.168.1.62
ping -c 1 192.168.1.63

Replace IPs with your actual node addresses. All three nodes must be able to reach each other.
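If you'd rather not run each check by hand, the checklist above can be wrapped in a small script. This is a convenience sketch, not part of the playbook bundle; the peer IPs are this guide's examples and the RAM/disk thresholds are approximations of the 4 GB / 20 GB minimums:

```shell
#!/usr/bin/env bash
# preflight.sh -- run the Chapter 2 checks on one host; prints OK/FAIL per check
set -u
fail=0
report() { if [ "$1" -eq 0 ]; then echo "OK   $2"; else echo "FAIL $2"; fail=1; fi; }

grep -q 'Rocky Linux 9' /etc/os-release;            report $? "Rocky Linux 9"
[ "$(getenforce 2>/dev/null)" = "Enforcing" ];      report $? "SELinux enforcing"

mem_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
[ "$mem_kb" -ge 3900000 ];                          report $? "RAM >= 4 GB"

avail_kb=$(df -Pk /opt | awk 'NR==2 {print $4}')
[ "$avail_kb" -ge 20000000 ];                       report $? "20 GB free on /opt"

for peer in 192.168.1.62 192.168.1.63; do           # replace with your node IPs
  ping -c 1 -W 2 "$peer" >/dev/null 2>&1;           report $? "peer $peer reachable"
done

exit "$fail"
```

Run it on each of the three hosts; a zero exit status means every check passed.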

What to Have Ready

Before you start Chapter 3, gather these:

  1. The IP addresses or hostnames of your three ELK hosts. We’ll use 192.168.1.61, 192.168.1.62, and 192.168.1.63 as examples throughout this guide.

  2. An SSL certificate and key for the Kibana host. Self-signed is fine for a home lab. If you don’t have one yet, generate one before running the playbooks.

  3. Strong passwords for the Ansible vault — you’ll need three: vault_elk_elastic_password (the Elasticsearch superuser), vault_elk_kibana_system_password (Kibana’s service account), and vault_elk_logstash_system_password (Logstash’s service account). Generate them now and keep them somewhere safe. These are used during deployment to configure built-in user authentication.

  4. Your Ansible inventory configured with all three hosts in the elasticsearch group, and the appropriate hosts in the kibana, logstash, and filebeat groups (see inventory/hosts.yml.example).
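A hypothetical hosts.yml matching that layout (group names follow the text above; the fb01 address is invented for illustration — see inventory/hosts.yml.example in the bundle for the authoritative version):

```yaml
all:
  children:
    elasticsearch:
      hosts:
        es01: { ansible_host: 192.168.1.61 }
        es02: { ansible_host: 192.168.1.62 }
        es03: { ansible_host: 192.168.1.63 }
    kibana:
      hosts:
        es01:
    logstash:
      hosts:
        es02:
    filebeat:
      hosts:
        fb01: { ansible_host: 192.168.1.71 }   # example log-shipping host
```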

Running the Deployment

If you have the playbook bundle, here’s the single command that deploys the entire stack:

cd elk-deploy
ansible-playbook site.yml -i inventory/hosts.yml --vault-password-file .credentials/vault.txt

This runs four plays in sequence: Elasticsearch cluster setup across all three nodes (including transport TLS and built-in user authentication), Kibana with Apache reverse proxy on es01, Logstash on es02, and Filebeat on your log-shipping hosts. Expect it to take 15-25 minutes depending on internet speed (package downloads are the bottleneck).

When it finishes, you’ll have a working 3-node cluster with security enabled, Kibana accessible at https://kibana.example.com, Logstash accepting logs on ports 5044 and 5514, and Filebeat shipping system logs from your hosts.
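A few hedged spot checks once the playbook completes (adjust the scheme on port 9200 if your cluster serves its REST API over HTTPS; $ELASTIC_PASSWORD is the superuser password from your vault):

```shell
# Cluster health should report "green" (or "yellow" while replicas allocate)
curl -s -u "elastic:$ELASTIC_PASSWORD" http://192.168.1.61:9200/_cluster/health?pretty

# Kibana through the Apache proxy; -k accepts a self-signed certificate
curl -sk -u "elastic:$ELASTIC_PASSWORD" https://kibana.example.com/api/status

# Logstash monitoring API on es02
curl -s http://192.168.1.62:9600/_node/stats?pretty
```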

The rest of this guide walks through what that playbook does and why each decision was made.

Two paths through this guide: Each chapter shows the manual commands first (so you understand what’s happening), then summarizes what the playbook automates in a “What Automation Looks Like” section at the end. If you’re using the playbook bundle, you don’t need to run the manual commands — read them for understanding, then skip to the automation summary. If you’re following the manual path, every command you need is shown with copy-paste blocks.

Everything in place? Let’s deploy Elasticsearch.

Want the automation code? Get the production-ready Ansible playbooks that deploy this entire ELK stack in ~20 minutes.

Get Playbooks — $29