What you’ll accomplish: Understand what the ELK stack does for your home lab, why Elastic’s documentation leaves you stranded, and exactly what you’ll have when you finish this guide.
The Problem With Scattered Logs
If you’re running a home lab, you have logs everywhere. Your Proxmox hypervisor has system journals. Nginx is writing access logs on your reverse proxy VM. Nextcloud has its own log file. Jellyfin, Pi-hole, Home Assistant, Docker containers — every service writes logs to its own location in its own format. SSH auth failures are in /var/log/secure on one box and /var/log/auth.log on another.
When something breaks at 11 PM, you SSH into each box, grep through logs, try to correlate timestamps across hosts, and hope the relevant log entry hasn’t been rotated away. This is not a sustainable workflow.
The ELK stack — Elasticsearch, Logstash, and Kibana — solves this. Logstash collects logs from your hosts and ships them into Elasticsearch, a distributed search engine that indexes everything. Kibana gives you a web UI to search, filter, and visualize those logs in one place. Instead of SSHing into five machines and grepping five different log formats, you open a browser and search.
It’s the same stack that powers log analytics at companies processing terabytes per day. For a home lab, you don’t need terabyte scale, but you absolutely benefit from centralized, searchable logs with automatic retention policies.
Why Elastic’s Docs Aren’t Enough
The Elastic documentation will get you to “installed.” It won’t get you to “production-ready on Rocky Linux with a real cluster, sensible retention, and a reverse proxy that doesn’t break.” Here’s what’s missing:
Cluster formation is hand-waved. The docs show you how to configure a single node. For a cluster, they say “set discovery.seed_hosts and cluster.initial_master_nodes” and move on. They don’t explain what happens when you get the node list wrong, how master election works, or why your 3-node cluster sometimes thinks it’s two separate 1-node clusters (split brain). The cluster formation section of this guide exists because I spent hours debugging discovery failures that the docs don’t mention.
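As a preview, here is roughly what the discovery settings look like on each node. This is a minimal sketch using the es01-es03 example hostnames from the prerequisites; the cluster name and node names are placeholders to adjust:

```yaml
# /etc/elasticsearch/elasticsearch.yml (excerpt) -- example values, adjust to your hosts
cluster.name: homelab-logs
node.name: es01
# every node lists all three hosts, so discovery works regardless of boot order
discovery.seed_hosts: ["es01.example.com", "es02.example.com", "es03.example.com"]
# consulted only on the very first cluster boot; identical on all three nodes
cluster.initial_master_nodes: ["es01", "es02", "es03"]
```

The key point is that all three nodes share the same seed list. A node whose list points only at itself will happily bootstrap its own one-node cluster, and that is exactly how split brain sneaks in.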
Data path relocation is ignored. Elasticsearch stores its data in /var/lib/elasticsearch by default — the same partition as your OS. On a home lab VM with 20-40 GB of root disk, a few weeks of log ingestion fills that partition and your node crashes. You need to relocate the data path to a dedicated mount point or at minimum a larger directory on /opt. The docs don’t cover this at all.
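The relocation itself is short. A minimal sketch, assuming the node is stopped and /opt has enough room:

```bash
# stop the node before touching its data directory
sudo systemctl stop elasticsearch
sudo mkdir -p /opt/elasticsearch/data
# copy preserving ownership and permissions; verify before removing the old copy
sudo rsync -a /var/lib/elasticsearch/ /opt/elasticsearch/data/
sudo chown -R elasticsearch:elasticsearch /opt/elasticsearch/data
# then point Elasticsearch at the new location in /etc/elasticsearch/elasticsearch.yml:
#   path.data: /opt/elasticsearch/data
sudo systemctl start elasticsearch
```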
JVM tuning is oversimplified. The official guidance is “set heap to half your RAM.” For Logstash, the formula is more nuanced — 62.5% of total RAM gives you better performance than 50% because Logstash needs the extra heap for pipeline processing, and it doesn’t rely on filesystem cache the way Elasticsearch does. The docs don’t make this distinction.
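On an 8 GB Logstash host, that 62.5% rule works out to roughly 5 GB of heap. A sketch of the relevant lines in Logstash's jvm.options; scale the numbers to your RAM:

```
# /etc/logstash/jvm.options (excerpt) -- 8 GB host: 62.5% of RAM is 5 GB
# set min and max equal so the heap never resizes at runtime
-Xms5g
-Xmx5g
```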
ILM is treated as “advanced.” Index Lifecycle Management — the system that automatically deletes old indices — is presented as an optional feature for large deployments. In a home lab, it’s essential from day one. Without ILM, your indices grow unbounded, your disk fills up, and your cluster goes red. This guide treats retention policies as a required part of the deployment, not an afterthought.
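Creating a delete-only policy is a single API call. A minimal sketch using the 120-day general-log retention from this guide; the policy name, host, and scheme are assumptions to adjust, and the index template that applies the policy comes later:

```bash
# create an ILM policy that deletes indices once they are 120 days old
curl -u elastic -X PUT "http://es01.example.com:9200/_ilm/policy/homelab-logs" \
  -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "120d",
        "actions": { "delete": {} }
      }
    }
  }
}'
```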
Rocky Linux specifics don’t exist. The docs target generic Linux. If you’re running Rocky Linux 9, you’ll hit SELinux denials that block Apache from proxying to Kibana, firewalld rules that need rich rules instead of simple port opens, and systemd service management that behaves differently from the docs’ examples. This guide was built and tested on Rocky 9.
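Two examples of what "Rocky specifics" means in practice. This is a sketch, not the full hardening; the source subnet is an assumption to replace with your own:

```bash
# SELinux boolean that lets Apache proxy to the Kibana backend
sudo setsebool -P httpd_can_network_connect on

# firewalld rich rule: allow Beats traffic to Logstash only from the lab subnet
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="5044" protocol="tcp" accept'
sudo firewall-cmd --reload
```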
What You’ll Have at the End
By the time you finish this guide, you’ll have:
- A 3-node Elasticsearch cluster on Rocky Linux 9, with transport TLS, built-in user authentication, cluster discovery configured, data paths relocated to /opt, and swappiness tuned for search workloads
- Kibana accessible via HTTPS through an Apache reverse proxy, with SELinux configured correctly (not disabled)
- Logstash accepting log input from Beats agents (port 5044) and syslog sources (port 5514), with JVM heap tuned for your available RAM (a minimal input sketch follows this list)
- Filebeat shipping system logs (syslog + auth) from your hosts to Logstash
- ILM policies that automatically clean up old indices — 120 days for general logs, 26 days for metrics, 3 days for monitoring data
- Firewall rules that allow only the required inter-node and client traffic

Everything is documented. Everything uses sanitized, vault-encrypted credentials. At the end of each chapter, there’s an automation summary showing how to reproduce what you just learned using Ansible — useful whether you write your own playbooks or use the companion bundle.
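To make the Logstash item on that list concrete, here is the shape of the input side. A minimal sketch: the file name is an assumption, and the filters and outputs come later in the guide:

```
# /etc/logstash/conf.d/00-inputs.conf -- minimal input sketch
input {
  # Filebeat and other Beats agents ship here
  beats { port => 5044 }
  # network gear and hosts without an agent can send syslog here
  syslog { port => 5514 }
}
```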
Who This Guide Is For
You’re an intermediate sysadmin or home lab hobbyist. You’re comfortable with SSH, package managers, and editing config files. You’ve used Ansible at least a little — you know what a playbook is and have run ansible-playbook before. You don’t need hand-holding on Linux basics, but you do need someone to tell you “relocate your data path” and “set up ILM before you forget” and “this SELinux boolean will save you two hours of debugging.”
Who This Guide Is Not For
If you’re running petabyte-scale Elasticsearch clusters with dedicated master nodes, hot-warm-cold architectures, and cross-datacenter replication — that’s a different problem. We’re building something that runs on three VMs with 4-8 GB of RAM each and handles a home lab’s worth of logs comfortably.
Prerequisites
Before you start Chapter 2, make sure you have:
- Three Rocky Linux 9 hosts (VMs or bare metal) with at least 4 GB RAM, 2 CPU cores, and 20 GB disk each. Fresh minimal installs are ideal.
- SSH access to all three hosts with a user that has sudo privileges.
- Ansible installed on your workstation (the machine you’ll run the deployment playbooks from — not the ELK hosts themselves).
- DNS records or IP addresses for all three hosts. If you have internal DNS, use names like es01.example.com, es02.example.com, and es03.example.com. If not, IP addresses work fine.
- An SSL certificate for the Kibana hostname, or willingness to use a self-signed certificate for your home lab.
This guide also has a companion Ansible playbook bundle that automates every step covered here. If you find yourself wanting to automate things after working through the guide, the bundle is available at RavenForge Press or directly on Payhip.
That’s it. Let’s look at the architecture.