What you’ll accomplish: Understand the full deployment architecture, size your VMs correctly, and confirm all prerequisites are in place before touching a terminal.
The Big Picture
Here’s what we’re building. The ELK stack spans three hosts — all three run Elasticsearch for clustering, with Kibana and Logstash co-located on two of them:
                  Elasticsearch Cluster
              (transport TLS on port 9300)
        ┌──────────────────┬──────────────────┐
        │                  │                  │
 ┌──────┴──────┐    ┌──────┴──────┐    ┌──────┴──────┐
 │    es01     │    │    es02     │    │    es03     │
 │             │    │             │    │             │
 │Elasticsearch│    │Elasticsearch│    │Elasticsearch│
 │ (9200/9300) │    │ (9200/9300) │    │ (9200/9300) │
 │             │    │             │    │             │
 │   Kibana    │    │  Logstash   │    │             │
 │   (5601)    │    │ (5044/5514) │    │             │
 │             │    │ (9600 API)  │    │             │
 │ Apache httpd│    │             │    │             │
 │ (443 HTTPS) │    │             │    │             │
 └──────┬──────┘    └──────┬──────┘    └─────────────┘
        │                  │
   HTTPS (443)       Beats (5044)
        │            Syslog (5514)
        │                  │
 ┌──────┴──────┐    ┌──────┴──────┐
 │   Browser   │    │ Log Sources │
 │    (you)    │    │ (VMs, apps, │
 │             │    │   network   │
 │             │    │  devices)   │
 └─────────────┘    └──────┬──────┘
                           │
                    ┌──────┴──────┐
                    │  Filebeat   │
                    │   (fb01+)   │
                    │             │
                    │ Ships logs  │
                    │ → Logstash  │
                    │ (port 5044) │
                    │             │
                    │  Stats API  │
                    │ (5066 local)│
                    └─────────────┘
The key design decisions:
- Three Elasticsearch nodes for quorum. Elasticsearch uses a majority-based consensus protocol for master election. With 3 nodes, you can lose 1 and still have quorum. With 2 nodes, losing 1 means no quorum and a read-only cluster. Three is the minimum for a real cluster (a quick way to verify this is shown after this list).
- Kibana co-locates with es01. Kibana is a lightweight Node.js application. It doesn’t need its own host. Apache sits in front on the same box, terminates SSL, and proxies to Kibana on localhost:5601.
- Logstash co-locates with es02. Logstash is the most memory-hungry component (JVM-based, pipeline buffering). Putting it on its own ES node gives it dedicated resources without needing a fourth VM.
- Apache terminates SSL for Kibana. Kibana doesn’t handle HTTPS directly in this setup. Apache on es01 listens on port 443, terminates TLS, and proxies to Kibana’s HTTP port. This avoids Java keystore complexity and makes certificate management straightforward.
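Once the cluster is up (Chapter 3 onward), a quick health check confirms that quorum and master election behave as described. This is a sketch: it assumes the REST API speaks plain HTTP on port 9200 at the es01 example address and will prompt for the elastic superuser password; adjust the URL if you enable TLS on the HTTP layer.
# Cluster health: status "green" with 3 nodes means the cluster has quorum
curl -s -u elastic 'http://192.168.1.61:9200/_cluster/health?pretty'
# List the nodes; the elected master is marked with * in the master column
curl -s -u elastic 'http://192.168.1.61:9200/_cat/nodes?v'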
Component Overview
| Component | What It Does | Config Location | Data Location |
|---|---|---|---|
| Elasticsearch | Distributed search engine — stores and indexes logs | /etc/elasticsearch/ | /opt/lib/elasticsearch/ |
| Kibana | Web UI for searching and visualizing logs | /etc/kibana/ | /opt/kibana/lib/ |
| Apache httpd | Reverse proxy with SSL termination for Kibana | /etc/httpd/conf.d/ | — |
| Logstash | Log ingestion pipeline — receives, transforms, ships to ES | /etc/logstash/ | /opt/lib/logstash/ |
| Filebeat | Lightweight log shipper — collects host logs and ships to Logstash | /etc/filebeat/ | /var/log/filebeat/ |
| Java 17 | Runtime for Elasticsearch and Logstash (bundled with packages) | — | — |
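One practical consequence of the last row: there is no separate Java install step. Elasticsearch and Logstash each ship a bundled JDK inside their package directories, so once the packages are installed (Chapter 3) you can confirm the runtime like this. The paths below are the usual RPM defaults and are an assumption if you relocate the installs:
/usr/share/elasticsearch/jdk/bin/java -version
/usr/share/logstash/jdk/bin/java -version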
Hardware Sizing
Here’s what actually works for a home lab, based on real usage — not Elastic’s marketing recommendations:
| Resource | Minimum (per node) | Recommended | Notes |
|---|---|---|---|
| CPU | 2 cores | 4 cores | ES indexing is CPU-intensive. 2 cores work for light ingestion (<1 GB/day). |
| RAM | 4 GB | 8 GB | ES uses ~50% for JVM heap, the rest for filesystem cache. 4 GB is tight but workable. Logstash node benefits most from 8 GB. |
| Disk | 20 GB | 50 GB | Depends entirely on retention. 20 GB handles ~30 days of a small home lab. ILM policies prevent runaway growth. |
| Swappiness | vm.swappiness=0 (minimized) | vm.swappiness=0 | The kernel avoids swapping unless memory is critically low. The swap partition still exists as a safety net — see Ch3 for details. |
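The swappiness row boils down to a single sysctl. A minimal sketch of what Chapter 3 puts in place (the drop-in file name is just an example):
# Check the current value
sysctl vm.swappiness
# Persist vm.swappiness=0 and apply it without a reboot
echo 'vm.swappiness = 0' | sudo tee /etc/sysctl.d/90-elasticsearch.conf
sudo sysctl --system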
Pro tip: If you’re running on Proxmox, start with 2 cores and 4 GB per VM. You can hot-add CPU and memory later without downtime. Don’t over-provision upfront — you’ll waste resources on a cluster that’s mostly idle.
Network Requirements
Every Elasticsearch node talks to every other node. Kibana and Logstash need to reach Elasticsearch. Your browser needs to reach Kibana’s Apache proxy.
| Port | Protocol | Direction | Service | Who Needs Access |
|---|---|---|---|---|
| 9200 | TCP | ES nodes ↔ clients | Elasticsearch REST API | Kibana, Logstash, monitoring tools |
| 9300 | TCP | ES nodes ↔ ES nodes | Elasticsearch cluster transport | All ES nodes (master election, data replication) |
| 443 | TCP | Inbound | Apache HTTPS (Kibana proxy) | Your browser |
| 5601 | TCP | Loopback only | Kibana HTTP | Apache proxies to this — never expose directly |
| 5044 | TCP | Inbound | Logstash Beats input | Filebeat/Metricbeat agents on your hosts |
| 5514 | TCP | Inbound | Logstash syslog input | Network devices, rsyslog forwarding |
| 9600 | TCP | All interfaces | Logstash monitoring API | Health checks, Prometheus exporters |
| 5066 | TCP | Loopback only | Filebeat stats API | Local health checks (curl http://127.0.0.1:5066/stats) |
Firewall strategy: The playbooks configure firewalld with source-restricted rich rules for Elasticsearch cluster traffic (ports 9200, 9300) — only the ES, Kibana, and Logstash nodes can reach these ports. Logstash ingestion ports (5044, 5514) use simple port opens without source restrictions, since Beats agents and syslog sources may connect from any host on your network. Port 443 (Apache) is also open to all clients. Filebeat’s stats API (port 5066) listens on loopback only and doesn’t require a firewall rule.
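For reference, the firewalld commands behind that strategy look roughly like the following. The peer address is an example; the playbooks add one rich rule per trusted node:
# On every ES node: source-restricted rich rules so only cluster peers reach 9200/9300
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.62" port port="9200" protocol="tcp" accept'
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.62" port port="9300" protocol="tcp" accept'
# On es02 (Logstash): ingestion ports open to any host on the network
sudo firewall-cmd --permanent --add-port=5044/tcp --add-port=5514/tcp
# On es01 (Apache): HTTPS open to any client
sudo firewall-cmd --permanent --add-service=https
# Apply the permanent rules
sudo firewall-cmd --reload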
Software Prerequisites
On the ELK hosts (all three Rocky Linux 9 VMs):
| Requirement | Version | Notes |
|---|---|---|
| Rocky Linux | 9.x (minimal install) | Fresh install preferred |
| SELinux | Enforcing (default) | Don’t disable it. The playbooks configure the required booleans. |
| Internet access | — | Required for package downloads (Elastic repo) |
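Since the Elastic repository is the only external dependency, it is worth confirming each host can reach it before you start. The URL below points at the 8.x RPM repo and is an assumption; substitute the major version you plan to deploy:
curl -sI https://artifacts.elastic.co/packages/8.x/yum/repodata/repomd.xml | head -n 1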
On your workstation (where you run the deployment playbooks):
| Requirement | Version | Notes |
|---|---|---|
| Ansible | 2.14+ (ansible-core) | For running the deployment playbooks |
| ansible.posix collection | 1.5+ | ansible-galaxy collection install -r requirements.yml |
| SSH access | — | To all 3 hosts, with a sudo-capable user |
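A few quick commands verify the workstation side; the inventory path matches the example layout referenced later in this chapter:
# Confirm ansible-core 2.14+ and the ansible.posix collection
ansible --version | head -n 1
ansible-galaxy collection list | grep ansible.posix
# Confirm SSH and sudo work against every host in the inventory
ansible all -i inventory/hosts.yml -m ping
ansible all -i inventory/hosts.yml -b -m command -a whoami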
Pre-Flight Checklist
Run these on each of your three ELK hosts before starting Chapter 3. Every check should pass.
Confirm the OS:
cat /etc/os-release | grep PRETTY_NAME
Expected: PRETTY_NAME="Rocky Linux 9.x (Blue Onyx)"
Confirm SELinux is enforcing:
getenforce
Expected: Enforcing
If this says Permissive or Disabled, fix it before proceeding. Edit /etc/selinux/config, set SELINUX=enforcing, and reboot.
Confirm available RAM:
free -h | grep Mem
You need at least 4 GB total per node. The Logstash node benefits from 8 GB.
Confirm available disk:
df -h /opt
You need at least 20 GB free on /opt (or wherever you’ll store data).
Confirm nodes can reach each other:
# From es01, test connectivity to es02 and es03
ping -c 1 192.168.1.62
ping -c 1 192.168.1.63
Replace IPs with your actual node addresses. All three nodes must be able to reach each other.
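If you would rather not log into each node by hand, the same checks can be driven over SSH from your workstation. A convenience sketch, assuming the example addresses and a user that already has SSH access:
for host in 192.168.1.61 192.168.1.62 192.168.1.63; do
  echo "=== $host ==="
  ssh "$host" 'grep PRETTY_NAME /etc/os-release; getenforce; free -h | grep Mem; df -h /opt | tail -n 1'
done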
What to Have Ready
Before you start Chapter 3, gather these:
- The IP addresses or hostnames of your three ELK hosts. We’ll use 192.168.1.61, 192.168.1.62, and 192.168.1.63 as examples throughout this guide.
- An SSL certificate and key for the Kibana host. Self-signed is fine for a home lab. If you don’t have one yet, generate one before running the playbooks (an example appears after this list).
- Strong passwords for the Ansible vault — you’ll need three: vault_elk_elastic_password (the Elasticsearch superuser), vault_elk_kibana_system_password (Kibana’s service account), and vault_elk_logstash_system_password (Logstash’s service account). Generate them now and keep them somewhere safe. These are used during deployment to configure built-in user authentication.
- Your Ansible inventory configured with all three hosts in the elasticsearch group, and the appropriate hosts in the kibana, logstash, and filebeat groups (see inventory/hosts.yml.example).
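If you still need the certificate or the vault passwords from the list above, both take only a couple of commands. The hostname and file names are examples; match them to whatever your inventory and playbook variables expect:
# Self-signed certificate and key for the Kibana host (valid for one year)
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout kibana.example.com.key -out kibana.example.com.crt \
  -subj "/CN=kibana.example.com" \
  -addext "subjectAltName=DNS:kibana.example.com"
# Print three strong random values to paste into your ansible-vault file
for name in elastic kibana_system logstash_system; do
  echo "vault_elk_${name}_password: $(openssl rand -base64 24)"
done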
Running the Deployment
If you have the playbook bundle, here’s the single command that deploys the entire stack:
cd elk-deploy
ansible-playbook site.yml -i inventory/hosts.yml --vault-password-file .credentials/vault.txt
This runs four plays in sequence: Elasticsearch cluster setup across all three nodes (including transport TLS and built-in user authentication), Kibana with Apache reverse proxy on es01, Logstash on es02, and Filebeat on your log-shipping hosts. Expect it to take 15-25 minutes depending on internet speed (package downloads are the bottleneck).
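Before kicking off the real run, you can validate the playbook and preview what it will do. --syntax-check and --list-tasks are standard ansible-playbook options; whether a full --check dry run completes cleanly depends on how the roles are written, so treat that as optional:
ansible-playbook site.yml -i inventory/hosts.yml --syntax-check
ansible-playbook site.yml -i inventory/hosts.yml --vault-password-file .credentials/vault.txt --list-tasks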
When it finishes, you’ll have a working 3-node cluster with security enabled, Kibana accessible at https://kibana.example.com, Logstash accepting logs on ports 5044 and 5514, and Filebeat shipping system logs from your hosts.
The rest of this guide walks through what that playbook does and why each decision was made.
Two paths through this guide: Each chapter shows the manual commands first (so you understand what’s happening), then summarizes what the playbook automates in a “What Automation Looks Like” section at the end. If you’re using the playbook bundle, you don’t need to run the manual commands — read them for understanding, then skip to the automation summary. If you’re following the manual path, every command you need is shown with copy-paste blocks.
Everything in place? Let’s deploy Elasticsearch.