Deploying the ELK Stack the Right Way

Chapter 2

Prerequisites & Architecture

In this chapter
  • The Big Picture
  • Component Overview
  • Hardware Sizing
  • Network Requirements
  • Software Prerequisites
  • Pre-Flight Checklist
  • What to Have Ready
  • Running the Deployment

What you’ll accomplish: Understand the full deployment architecture, size your VMs correctly, and confirm all prerequisites are in place before touching a terminal.

The Big Picture

Here’s what we’re building. The ELK stack spans three hosts — all three run Elasticsearch for clustering, with Kibana and Logstash co-located on two of them:

                                    Elasticsearch Cluster
                               (transport TLS on port 9300)
                      ┌──────────────────┬──────────────────┐
                      │                  │                  │
               ┌──────┴──────┐   ┌──────┴──────┐   ┌──────┴──────┐
               │    es01     │   │    es02     │   │    es03     │
               │             │   │             │   │             │
               │Elasticsearch│   │Elasticsearch│   │Elasticsearch│
               │ (9200/9300) │   │ (9200/9300) │   │ (9200/9300) │
               │             │   │             │   │             │
               │ Kibana      │   │ Logstash    │   │             │
               │ (5601)      │   │ (5044/5514) │   │             │
               │             │   │ (9600 API)  │   │             │
               │ Apache httpd│   │             │   │             │
               │ (443 HTTPS) │   │             │   │             │
               └──────┬──────┘   └──────┬──────┘   └─────────────┘
                      │                  │
                HTTPS (443)         Beats (5044)
                      │            Syslog (5514)
                      │                  │
               ┌──────┴──────┐   ┌──────┴──────┐
               │  Browser    │   │ Log Sources │
               │  (you)      │   │ (VMs, apps, │
               │             │   │  network    │
               │             │   │  devices)   │
               └─────────────┘   └──────┬──────┘
                                 ┌──────┴──────┐
                                 │  Filebeat   │
                                 │  (fb01+)    │
                                 │             │
                                 │ Ships logs  │
                                 │ → Logstash  │
                                 │ (port 5044) │
                                 │             │
                                 │ Stats API   │
                                 │ (5066 local)│
                                 └─────────────┘

The key design decisions:

  • Three Elasticsearch nodes for quorum. Elasticsearch uses a majority-based consensus protocol for master election. With 3 nodes, you can lose 1 and still have quorum. With 2 nodes, losing 1 means no quorum and a read-only cluster. Three is the minimum for a real cluster.
  • Kibana co-locates with es01. Kibana is a lightweight Node.js application. It doesn’t need its own host. Apache sits in front on the same box, terminates SSL, and proxies to Kibana on localhost:5601.
  • Logstash co-locates with es02. Logstash is the most memory-hungry component (JVM-based, pipeline buffering). Putting it on its own ES node gives it dedicated resources without needing a fourth VM.
  • Apache terminates SSL for Kibana. Kibana doesn’t handle HTTPS directly in this setup. Apache on es01 listens on port 443, terminates TLS, and proxies to Kibana’s HTTP port. This avoids Java keystore complexity and makes certificate management straightforward.
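A minimal sketch of what that Apache virtual host could look like (the ServerName and certificate paths are placeholders; the playbooks generate the real config in /etc/httpd/conf.d/):

```apache
<VirtualHost *:443>
    ServerName kibana.example.com

    SSLEngine on
    SSLCertificateFile    /etc/pki/tls/certs/kibana.crt
    SSLCertificateKeyFile /etc/pki/tls/private/kibana.key

    # Terminate TLS here, forward plain HTTP to Kibana on loopback
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:5601/
    ProxyPassReverse / http://127.0.0.1:5601/
</VirtualHost>
```

Because Kibana listens only on localhost:5601, Apache is the single externally reachable entry point, and certificate rotation means touching Apache's config alone.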

Component Overview

| Component | What It Does | Config Location | Data Location |
|---|---|---|---|
| Elasticsearch | Distributed search engine — stores and indexes logs | /etc/elasticsearch/ | /opt/lib/elasticsearch/ |
| Kibana | Web UI for searching and visualizing logs | /etc/kibana/ | /opt/kibana/lib/ |
| Apache httpd | Reverse proxy with SSL termination for Kibana | /etc/httpd/conf.d/ | |
| Logstash | Log ingestion pipeline — receives, transforms, ships to ES | /etc/logstash/ | /opt/lib/logstash/ |
| Filebeat | Lightweight log shipper — collects host logs and ships to Logstash | /etc/filebeat/ | /var/log/filebeat/ |
| Java 17 | Runtime for Elasticsearch and Logstash (bundled with the packages) | | |

Hardware Sizing

Here’s what actually works for a home lab, based on real usage — not Elastic’s marketing recommendations:

| Resource | Minimum (per node) | Recommended | Notes |
|---|---|---|---|
| CPU | 2 cores | 4 cores | ES indexing is CPU-intensive. 2 cores work for light ingestion (<1 GB/day). |
| RAM | 4 GB | 8 GB | ES uses ~50% for JVM heap, the rest for filesystem cache. 4 GB is tight but workable. The Logstash node benefits most from 8 GB. |
| Disk | 20 GB | 50 GB | Depends entirely on retention. 20 GB handles ~30 days of a small home lab. ILM policies prevent runaway growth. |
| Swappiness | vm.swappiness=0 (minimized) | vm.swappiness=0 | The kernel avoids swapping unless memory is critically low. The swap partition still exists as a safety net — see Ch3 for details. |
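The swappiness setting can be applied up front like this (the sysctl.d filename is an arbitrary choice; Chapter 3 covers this in context):

```shell
# Apply immediately, without a reboot
sudo sysctl -w vm.swappiness=0

# Persist across reboots
echo 'vm.swappiness=0' | sudo tee /etc/sysctl.d/90-elk-swappiness.conf
```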

Pro tip: If you’re running on Proxmox, start with 2 cores and 4 GB per VM. You can hot-add CPU and memory later without downtime. Don’t over-provision upfront — you’ll waste resources on a cluster that’s mostly idle.

Network Requirements

Every Elasticsearch node talks to every other node. Kibana and Logstash need to reach Elasticsearch. Your browser needs to reach Kibana’s Apache proxy.

| Port | Protocol | Direction | Service | Who Needs Access |
|---|---|---|---|---|
| 9200 | TCP | ES nodes ↔ clients | Elasticsearch REST API | Kibana, Logstash, monitoring tools |
| 9300 | TCP | ES nodes ↔ ES nodes | Elasticsearch cluster transport | All ES nodes (master election, data replication) |
| 443 | TCP | Inbound | Apache HTTPS (Kibana proxy) | Your browser |
| 5601 | TCP | Loopback only | Kibana HTTP | Apache proxies to this — never expose it directly |
| 5044 | TCP | Inbound | Logstash Beats input | Filebeat/Metricbeat agents on your hosts |
| 5514 | TCP | Inbound | Logstash syslog input | Network devices, rsyslog forwarding |
| 9600 | TCP | All interfaces | Logstash monitoring API | Health checks, Prometheus exporters |
| 5066 | TCP | Loopback only | Filebeat stats API | Local health checks (curl http://127.0.0.1:5066/stats) |

Firewall strategy: The playbooks configure firewalld with source-restricted rich rules for Elasticsearch cluster traffic (ports 9200, 9300) — only the ES, Kibana, and Logstash nodes can reach these ports. Logstash ingestion ports (5044, 5514) use simple port opens without source restrictions, since Beats agents and syslog sources may connect from any host on your network. Port 443 (Apache) is also open to all clients. Filebeat’s stats API (port 5066) listens on loopback only and doesn’t require a firewall rule.
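As a sketch, this is the kind of firewalld command the playbooks template out (the IP is one of this guide's example addresses; the playbooks repeat the rich rule for each peer and each restricted port):

```shell
# Restrict an ES cluster port to a known peer (repeat per peer, per port 9200/9300)
sudo firewall-cmd --permanent \
  --add-rich-rule='rule family="ipv4" source address="192.168.1.62" port port="9300" protocol="tcp" accept'

# Ingestion and HTTPS ports are opened to any client
sudo firewall-cmd --permanent --add-port=5044/tcp --add-port=5514/tcp --add-service=https

sudo firewall-cmd --reload
```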

Software Prerequisites

On the ELK hosts (all three Rocky Linux 9 VMs):

| Requirement | Version | Notes |
|---|---|---|
| Rocky Linux | 9.x (minimal install) | Fresh install preferred |
| SELinux | Enforcing (default) | Don't disable it. The playbooks configure the required booleans. |
| Internet access | | Required for package downloads (Elastic repo) |

On your workstation (where you run the deployment playbooks):

| Requirement | Version | Notes |
|---|---|---|
| Ansible | 2.14+ (ansible-core) | For running the deployment playbooks |
| ansible.posix collection | 1.5+ | ansible-galaxy collection install -r requirements.yml |
| SSH access | | To all 3 hosts, with a sudo-capable user |

Pre-Flight Checklist

Run these on each of your three ELK hosts before starting Chapter 3. Every check should pass.

Confirm the OS:

grep PRETTY_NAME /etc/os-release

Expected: PRETTY_NAME="Rocky Linux 9.x (Blue Onyx)"

Confirm SELinux is enforcing:

getenforce

Expected: Enforcing

If this says Permissive or Disabled, fix it before proceeding. Edit /etc/selinux/config, set SELINUX=enforcing, and reboot.
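The edit can also be scripted as a one-liner (review /etc/selinux/config afterwards before rebooting):

```shell
sudo sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config
sudo reboot
```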

Confirm available RAM:

free -h | grep Mem

You need at least 4 GB total per node. The Logstash node benefits from 8 GB.

Confirm available disk:

df -h /opt

You need at least 20 GB free on /opt (or wherever you’ll store data).

Confirm nodes can reach each other:

# From es01, test connectivity to es02 and es03
ping -c 1 192.168.1.62
ping -c 1 192.168.1.63

Replace IPs with your actual node addresses. All three nodes must be able to reach each other.
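If you'd rather not run each check by hand, the checklist above can be wrapped in a small script. This is a convenience sketch, not part of the playbook bundle; the peer IPs are this guide's examples and the RAM/disk thresholds are approximations of the 4 GB / 20 GB minimums:

```shell
#!/usr/bin/env bash
# preflight.sh -- run the Chapter 2 checks on one host; prints OK/FAIL per check
set -u
fail=0
report() { if [ "$1" -eq 0 ]; then echo "OK   $2"; else echo "FAIL $2"; fail=1; fi; }

grep -q 'Rocky Linux 9' /etc/os-release;            report $? "Rocky Linux 9"
[ "$(getenforce 2>/dev/null)" = "Enforcing" ];      report $? "SELinux enforcing"

mem_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
[ "$mem_kb" -ge 3900000 ];                          report $? "RAM >= 4 GB"

avail_kb=$(df -Pk /opt | awk 'NR==2 {print $4}')
[ "$avail_kb" -ge 20000000 ];                       report $? "20 GB free on /opt"

for peer in 192.168.1.62 192.168.1.63; do           # replace with your node IPs
  ping -c 1 -W 2 "$peer" >/dev/null 2>&1;           report $? "peer $peer reachable"
done

exit "$fail"
```

Run it on each of the three hosts; a zero exit status means every check passed.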

What to Have Ready

Before you start Chapter 3, gather these:

  1. The IP addresses or hostnames of your three ELK hosts. We’ll use 192.168.1.61, 192.168.1.62, and 192.168.1.63 as examples throughout this guide.

  2. An SSL certificate and key for the Kibana host. Self-signed is fine for a home lab. If you don’t have one yet, generate one before running the playbooks.

  3. Strong passwords for the Ansible vault — you’ll need three: vault_elk_elastic_password (the Elasticsearch superuser), vault_elk_kibana_system_password (Kibana’s service account), and vault_elk_logstash_system_password (Logstash’s service account). Generate them now and keep them somewhere safe. These are used during deployment to configure built-in user authentication.

  4. Your Ansible inventory configured with all three hosts in the elasticsearch group, and the appropriate hosts in the kibana, logstash, and filebeat groups (see inventory/hosts.yml.example).
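A hypothetical hosts.yml matching that layout (group names follow the text above; the fb01 address is invented for illustration — see inventory/hosts.yml.example in the bundle for the authoritative version):

```yaml
all:
  children:
    elasticsearch:
      hosts:
        es01: { ansible_host: 192.168.1.61 }
        es02: { ansible_host: 192.168.1.62 }
        es03: { ansible_host: 192.168.1.63 }
    kibana:
      hosts:
        es01:
    logstash:
      hosts:
        es02:
    filebeat:
      hosts:
        fb01: { ansible_host: 192.168.1.71 }   # example log-shipping host
```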

Running the Deployment

If you have the playbook bundle, here’s the single command that deploys the entire stack:

cd elk-deploy
ansible-playbook site.yml -i inventory/hosts.yml --vault-password-file .credentials/vault.txt

This runs four plays in sequence: Elasticsearch cluster setup across all three nodes (including transport TLS and built-in user authentication), Kibana with Apache reverse proxy on es01, Logstash on es02, and Filebeat on your log-shipping hosts. Expect it to take 15-25 minutes depending on internet speed (package downloads are the bottleneck).

When it finishes, you’ll have a working 3-node cluster with security enabled, Kibana accessible at https://kibana.example.com, Logstash accepting logs on ports 5044 and 5514, and Filebeat shipping system logs from your hosts.
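A few hedged spot checks once the playbook completes (adjust the scheme on port 9200 if your cluster serves its REST API over HTTPS; $ELASTIC_PASSWORD is the superuser password from your vault):

```shell
# Cluster health should report "green" (or "yellow" while replicas allocate)
curl -s -u "elastic:$ELASTIC_PASSWORD" http://192.168.1.61:9200/_cluster/health?pretty

# Kibana through the Apache proxy; -k accepts a self-signed certificate
curl -sk -u "elastic:$ELASTIC_PASSWORD" https://kibana.example.com/api/status

# Logstash monitoring API on es02
curl -s http://192.168.1.62:9600/_node/stats?pretty
```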

The rest of this guide walks through what that playbook does and why each decision was made.

Two paths through this guide: Each chapter shows the manual commands first (so you understand what’s happening), then summarizes what the playbook automates in a “What Automation Looks Like” section at the end. If you’re using the playbook bundle, you don’t need to run the manual commands — read them for understanding, then skip to the automation summary. If you’re following the manual path, every command you need is shown with copy-paste blocks.

Everything in place? Let’s deploy Elasticsearch.

Want the automation code? Get the production-ready Ansible playbooks that deploy this entire ELK stack in ~20 minutes.

Get Playbooks — $29