Docker Health Checks Explained: A Beginner's Guide to Keeping Your Containers Running
Learn how Docker health checks work and how to implement them in your homelab. Step-by-step guide with real examples for web apps, databases, and more.
Author
David Okonkwo

Key Takeaways
- Docker health checks let you detect when a container is "running" but not actually working - far more useful than just knowing it started
- Health checks use the HEALTHCHECK instruction in Dockerfiles or the healthcheck config in Docker Compose
- The start_period option prevents false failures during slow application startup - use it for every service that takes more than 2 seconds to initialize
- You can chain depends_on with condition: service_healthy in Docker Compose to control service start order
- Different service types need different health check approaches - a web app check looks nothing like a database check
If you've ever had a Docker container showing "Up" in docker ps while the application inside was completely unresponsive, you know the frustration I'm talking about. Your container started successfully, so Docker thinks everything is fine. But your web app is returning 502 errors, your database isn't accepting connections, or your background worker is silently failing.
The good news is there's a straightforward solution, and Docker built it right into the platform. It's called a health check.
In this guide, I'll walk you through what Docker health checks are, why they matter for your homelab, and exactly how to implement them step by step. You don't need to be a Docker expert to follow along - if you've used docker run a few times and maybe written a Docker Compose file, you have everything you need.
What Is a Docker Health Check? (And Why You Need One)
Let's start with an analogy, because this concept is much easier to understand when you think about it in familiar terms.
Imagine you hire someone to stand guard at your front door. You check that they showed up on time - they're physically present. That's like Docker checking that the container process started. But are they actually doing their job? Are they checking IDs? Answering the doorbell? Staying awake?
A Docker health check is like a supervisor who periodically calls the guard and says "tap your foot twice if everything's okay." If the guard doesn't respond, or responds wrong, the supervisor knows something is wrong - even if the guard is technically still standing at the door.
In technical terms, a Docker health check is a command that Docker runs inside your container at regular intervals. If the command succeeds (exits with code 0), the container is marked "healthy." If it fails repeatedly, the container is marked "unhealthy."
This matters for your homelab because:
Why Health Checks Matter for Your Homelab
- You can't trust just "running" status - A container process can start successfully but its application may crash moments later, or get stuck in a broken state
- Service dependencies work properly - If your web app depends on a database, you need to know when the database is actually ready to accept connections, not just when it started
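- Automatic recovery - Health status gives an external tool (or an orchestrator such as Docker Swarm) the signal it needs to restart a container that's been unhealthy for too long; restart policies on their own only react when the main process exits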
- Monitoring integration - Tools like Uptime Kuma, Grafana, and Portainer can use health check status to alert you when something goes wrong
- Peace of mind - When you're running 10, 20, or 30 services in your homelab (as many of us do), you can't manually check each one
Docker HEALTHCHECK: The Two Ways to Add Health Checks
Docker gives you two places to define health checks, and which one you use depends on your situation:
Option 1: HEALTHCHECK in a Dockerfile
This is the approach to use when you're building your own Docker images. You add the HEALTHCHECK instruction directly to your Dockerfile, and every container created from that image automatically gets the health check.
Here's the basic syntax:
# Dockerfile
FROM nginx:alpine
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
Let me break down what each option does:
- --interval=30s - How often Docker runs the health check command (default: 30s)
- --timeout=5s - How long Docker waits for the command to complete before considering it failed (default: 30s)
- --start-period=10s - Grace period after the container starts during which failed health checks won't count toward retries (default: 0s - this is important, I'll explain more below)
- --retries=3 - How many consecutive failures before Docker marks the container as "unhealthy" (default: 3)
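The CMD keyword specifies the actual command Docker runs inside the container. Written as a plain string (shell form, as above), it runs through the container's default shell; written as a JSON array (exec form), it runs directly without a shell. In Docker Compose the same two forms are spelled CMD-SHELL and CMD. If the command exits with code 0, the check passes. A non-zero exit code means the check failed; Docker's documentation defines 1 as the failure code and reserves 2, which is why the examples end with || exit 1.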
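To make the exit-code rule concrete, here is a minimal sketch of what appending || exit 1 achieves. The helper name health_probe is made up for illustration and is not part of Docker; the failing command simply imitates a tool such as curl -f, which exits with code 22 on an HTTP error.

```shell
#!/bin/sh
# Sketch of what `<command> || exit 1` does in a health check.
# `health_probe` is a hypothetical helper name, not a Docker feature.
health_probe() {
    # Run the probe command and collapse every failure code to 1.
    "$@" >/dev/null 2>&1 || return 1
    return 0
}

# Imitate a tool that fails with its own code (curl -f uses 22 for HTTP errors).
rc=0
health_probe sh -c 'exit 22' || rc=$?
echo "failing probe -> $rc"    # prints: failing probe -> 1

rc=0
health_probe true || rc=$?
echo "passing probe -> $rc"    # prints: passing probe -> 0
```

Whatever code the underlying tool returns, Docker only ever sees 0 or 1.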
Option 2: healthcheck in Docker Compose
If you're using someone else's image (which most of us do for things like PostgreSQL, Nginx, Redis), you don't control the Dockerfile. That's fine - Docker Compose lets you add or override health checks right in the service definition:
# docker-compose.yml
services:
  web:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
The options are the same - just formatted as YAML instead of command-line flags.
What's Actually Happening Under the Hood
Here's something that confused me when I first started using health checks, and I want to make sure it's clear for you.
Docker runs your health check command inside the container. This means you need the tools available inside the container to run the check. If you're using curl in your health check, curl needs to be installed in the container image. If it's not, the health check will fail because the command itself can't be found - not because your application is broken.
Important Note
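Many minimal Docker images don't include curl. Alpine-based images usually ship BusyBox wget but not curl, and distroless or scratch-based images may have no HTTP client or even no shell. You have two options: either install the tool in your Dockerfile, or use a health check command that relies only on what the image already contains.
For example, instead of curl -f http://localhost/, you can use a shell-form check such as CMD-SHELL pgrep nginx || exit 1 to verify that the Nginx process is running. This needs only a shell and pgrep inside the container. Alpine-based images provide both through BusyBox; on Debian-based images pgrep comes from the procps package, which may need installing. Keep in mind that a process check only proves the process exists, not that it is answering requests.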
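If you maintain one check script for several images, one possible approach is to probe with whichever HTTP client happens to be present. This is only a sketch: pick_http_client and http_probe are hypothetical helper names, and the URL in the example comment is a placeholder.

```shell
#!/bin/sh
# Sketch: probe a URL with whichever HTTP client exists in the image.
# `pick_http_client` and `http_probe` are hypothetical helper names.
pick_http_client() {
    if command -v curl >/dev/null 2>&1; then
        echo curl
    elif command -v wget >/dev/null 2>&1; then
        echo wget
    else
        echo none
    fi
}

http_probe() {
    case "$(pick_http_client)" in
        curl) curl -fsS -o /dev/null "$1" ;;
        wget) wget -q -O /dev/null "$1" ;;
        *)    echo "no HTTP client found in this image" >&2; return 1 ;;
    esac
}

# Example (inside a container): http_probe http://localhost/ || exit 1
```

The script itself still has to be copied into the image, so this helps most when you build your own images.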
When Docker runs a health check, it sets the container's status to one of three values:
- starting - The initial state when the container starts, or during the start_period
- healthy - A health check has succeeded, and fewer than retries consecutive checks have failed since then
- unhealthy - The number of consecutive failures has reached retries
You can see these statuses when you run docker ps:
$ docker ps
CONTAINER ID IMAGE STATUS NAMES
a1b2c3d4e5f6 nginx:alpine Up 2 minutes (healthy) my-web-app
f6e5d4c3b2a1 postgres:16 Up 5 minutes (unhealthy) my-database
Real-World Examples: Health Checks for Common Services
Let me show you health checks for the services you're most likely running in your homelab. I've tested these myself, and they work as shown.
Web Application
For web servers and web applications, the most reliable health check is to make an actual HTTP request to the application's endpoint:
services:
  app:
    image: your-web-app:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
I recommend creating a dedicated /health endpoint in your application that returns HTTP 200. This endpoint can check that your database connection is alive, that your cache is responding, and that your application is fully initialized. If any of these dependencies fail, the health endpoint returns non-200, and Docker marks the container as unhealthy.
PostgreSQL Database
For Postgres, you can check that the database is running and accepting connections using the pg_isready utility that comes with PostgreSQL:
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: secret
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
The pg_isready command checks if PostgreSQL is ready to accept connections. It exits with code 0 when everything's fine. This is much more reliable than just checking if the Postgres process is running.
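When a Postgres check fails, the exit status of pg_isready tells you why. The sketch below maps the statuses listed in the pg_isready manual to readable messages; describe_pg_isready is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Sketch: translate a pg_isready exit status into a readable message.
# `describe_pg_isready` is a hypothetical helper; the codes are from the pg_isready manual.
describe_pg_isready() {
    case "$1" in
        0) echo "accepting connections" ;;
        1) echo "rejecting connections (for example, still starting up)" ;;
        2) echo "no response" ;;
        3) echo "no attempt made (for example, invalid parameters)" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Example (inside the db container):
#   pg_isready -U postgres; describe_pg_isready $?
describe_pg_isready 1    # prints: rejecting connections (for example, still starting up)
```

Status 1 during the first seconds after startup is normal and is exactly what start_period is meant to absorb.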
Redis Cache
Redis has a built-in health check command:
services:
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
Redis responds with PONG when it's working correctly, and redis-cli ping exits with code 0. Simple and effective.
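If you would rather not rely on the exit code alone, a stricter variant compares the reply text with PONG. This is a sketch of that comparison, not a required pattern; is_pong is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: treat only a literal PONG reply as healthy.
# `is_pong` is a hypothetical helper name.
is_pong() {
    [ "$1" = "PONG" ]
}

# Example (inside the redis container):
#   is_pong "$(redis-cli ping)" || exit 1
is_pong "PONG" && echo "healthy"              # prints: healthy
is_pong "not a pong" || echo "not ready"      # prints: not ready
```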
General Linux Process Check
If you just need to verify a process is running inside a minimal container:
healthcheck:
  test: ["CMD-SHELL", "pgrep my-process || exit 1"]
  interval: 30s
  timeout: 3s
  retries: 3
This doesn't need curl or any other HTTP client - just the shell and pgrep, both of which must exist in the image.
Understanding start_period: The Most Overlooked Option
When I first tried Docker health checks, I had a frustrating experience. My PostgreSQL container kept showing "unhealthy" even though it was clearly working fine after it finished initializing. The issue was that Postgres takes about 15-20 seconds to start up, create the data directory, and start accepting connections. During that time, my health check was already running and failing.
The start_period option exists specifically for this. It gives your container a grace period after startup during which health check failures are recorded but not counted. Docker still runs the health check, and you can still see the failures in the health log, but they won't count toward retries. The container stays in "starting" status during this time. If a check succeeds during the start period, the container is marked "healthy" right away and the grace period ends, so later failures count normally. Once the start period is over, retries consecutive failures mark the container "unhealthy."
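The counting rules are easier to follow as a small simulation. The sketch below is a simplified model of Docker's documented behaviour, not Docker's actual code: simulate_health is a hypothetical helper, and the grace period is expressed as a number of probes rather than seconds.

```shell
#!/bin/sh
# Sketch of how probe results turn into a status (simplified model, not Docker's code).
# Usage: simulate_health RETRIES GRACE_PROBES RESULT...
#   RESULT is 0 for a passing probe, 1 for a failing one.
#   GRACE_PROBES is how many probes fall inside start_period.
simulate_health() {
    retries=$1; grace=$2; shift 2
    status=starting; failures=0; started=0; n=0
    for result in "$@"; do
        n=$((n + 1))
        if [ "$result" -eq 0 ]; then
            # Any success marks the container healthy and ends the grace period.
            status=healthy; failures=0; started=1
        elif [ "$started" -eq 0 ] && [ "$n" -le "$grace" ]; then
            # Failure inside start_period: recorded, but not counted.
            :
        else
            failures=$((failures + 1))
            if [ "$failures" -ge "$retries" ]; then
                status=unhealthy
            fi
        fi
    done
    echo "$status"
}

simulate_health 3 0 1 1 1      # no grace period: prints unhealthy
simulate_health 3 3 1 1 1 0    # same failures inside the grace period: prints healthy
```

The two calls use the same failing probes; only the grace period changes the outcome.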
When to Adjust start_period
- 2-5 seconds - A simple Go or Rust web server that starts almost instantly
- 15-30 seconds - PostgreSQL, MySQL, or MariaDB that needs to initialize data directories
- 30-60 seconds - Java applications (even with modern frameworks, the JVM needs time to warm up)
- 60-120 seconds - Applications that run database migrations on startup, or generate initial data
When in doubt, set start_period longer than your application's actual startup time. A few extra seconds of grace period is much better than false alarms that flag a working container as unhealthy.
Controlling Service Startup Order with depends_on
One of the most practical uses of health checks in a homelab is controlling the order in which services start. If you've ever had your web application crash because it tried to connect to the database before the database was ready, you know exactly what I'm talking about.
Docker Compose's depends_on option can be combined with health checks to solve this:
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: secret
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 20s
  app:
    image: my-web-app:latest
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
With condition: service_healthy, Docker Compose waits until the database's health check passes before starting the app service. This eliminates the "race condition" where your web app starts but can't connect to the database because the database isn't ready yet.
Here's what the startup sequence looks like:
- Docker starts the db container
- Docker starts running health checks against db
- The db container goes through its start_period (20 seconds of grace)
- Once db passes a health check, it's marked "healthy"
- Docker starts the app container (it now knows the database is ready)
- The app container goes through its own start_period
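If you start containers with plain docker run instead of Compose, you can approximate the same waiting step by polling the health status yourself. This is a rough sketch, not how Compose is implemented internally: wait_for_healthy is a hypothetical helper, and the attempt count and delay defaults are arbitrary.

```shell
#!/bin/sh
# Sketch of the waiting that `condition: service_healthy` does for you.
# `wait_for_healthy` is a hypothetical helper; Compose's real logic is internal.
# Usage: wait_for_healthy CONTAINER [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthy() {
    name=$1; attempts=${2:-30}; delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Ask Docker for the current health status of the container.
        status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) || status=unknown
        case "$status" in
            healthy)   return 0 ;;
            unhealthy) echo "$name is unhealthy" >&2; return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $name" >&2
    return 1
}

# Example: wait_for_healthy my-database && docker start my-web-app
```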
Health Check Patterns for Different Architectures
Simple Service (Single Container)
For a basic service with no dependencies, keep your health check simple:
services:
  nginx:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/"]
      interval: 30s
      timeout: 5s
      retries: 2
Multi-Service Stack (Web + DB + Cache)
For a typical web stack with dependencies, use the dependency chain pattern:
services:
  postgres:
    image: postgres:16-alpine
    environment:
      # The official image will not start without a password
      POSTGRES_USER: myapp
      POSTGRES_PASSWORD: secret
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
  app:
    image: myapp:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:3000/api/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
Background Worker (Queue Processor)
For workers that don't expose HTTP endpoints, check process health instead:
services:
  worker:
    image: my-worker:latest
    healthcheck:
      # [q] stops pgrep -f from matching the health check's own "sh -c" command line
      test: ["CMD-SHELL", "pgrep -f '[q]ueue-worker' || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
Common Mistakes (And How to Avoid Them)
After implementing health checks across my homelab and several production environments, here are the mistakes I've made so you don't have to:
Mistake 1: Not Setting start_period
This is by far the most common issue. You add a health check to PostgreSQL, restart the stack, and immediately see "unhealthy." You panic, thinking something's broken. But Postgres just needs 20 seconds to initialize. Always set a reasonable start_period for services that take time to start.
Mistake 2: Using a Health Check That's Too Expensive
I once used a health check that ran a complex SQL query against the database. It worked fine - until the database was under load. The health check itself became a performance bottleneck, running every 30 seconds and making things worse. Keep health checks lightweight: SELECT 1 for databases and HEAD / for web services, not SELECT * FROM giant_table.
Mistake 3: Not Installing Health Check Tools in Custom Images
If you're building a custom Docker image and adding a HEALTHCHECK in your Dockerfile, make sure the tools you're using (curl, wget, etc.) are installed in the image:
# Don't forget to install curl!
FROM node:20-alpine
# Install curl for health checks
RUN apk add --no-cache curl
COPY app.js .
CMD ["node", "app.js"]
HEALTHCHECK --interval=30s --timeout=5s \
  CMD curl -f http://localhost:3000/ || exit 1
Mistake 4: Health Check That Always Passes
A surprising number of health checks are written in a way that never actually fails. For example, CMD-SHELL curl http://localhost || true will always exit with code 0 because of the || true. Test your health check by deliberately breaking your application and making sure Docker marks the container as unhealthy.
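You can see the difference without Docker at all. In this small sketch, false stands in for a probe command that fails; masked_probe and honest_probe are hypothetical names for the two ways of ending the command.

```shell
#!/bin/sh
# Sketch: `false` stands in for a probe command that fails.
masked_probe() { sh -c 'false || true'; }     # failure is swallowed
honest_probe() { sh -c 'false || exit 1'; }   # failure is reported

rc=0
masked_probe || rc=$?
echo "with || true:   exit code $rc"    # prints 0, so the check can never fail

rc=0
honest_probe || rc=$?
echo "with || exit 1: exit code $rc"    # prints 1, so Docker sees the failure
```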
Mistake 5: Duplicate Health Checks
If you define a health check in both the Dockerfile AND in Docker Compose for the same service, Docker Compose overrides the Dockerfile version. This is useful when you want to customize the health check for a specific deployment, but it can be confusing if you're not aware of it.
Troubleshooting Health Checks
When your health check isn't working as expected, here's how to debug it:
Check Container Health Status
# Check health status of all containers
docker ps
# Get detailed info about a specific container
docker inspect --format='{{json .State.Health}}' container_name
# View health check logs
docker inspect container_name | jq '.[].State.Health.Log'
Test Your Health Check Manually
Run the health check command directly inside the container to verify it works:
# Run the command inside the container
docker exec container_name curl -f http://localhost/
# If it returns a non-zero exit code, the health check will fail
echo $?
Common Health Check Failures
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| Container shows "unhealthy" immediately | start_period too short or missing | Increase start_period |
| Health check never runs | Wrong interval or Docker bug | Verify docker inspect shows the health check config |
| curl: not found error | Tool not installed in image | Install curl or use a different check method |
| Health check "passes" but app is down | Health check doesn't test what matters | Use an actual endpoint or service-level check |
| depends_on doesn't wait | Not using condition: service_healthy | Add the condition to depends_on |
Next Steps: What to Learn After This
If you've followed along and implemented health checks for your services, here's what I'd recommend looking into next:
1. Container restart policies - Learn how restart: unless-stopped and restart: always behave. They restart a container when its main process exits, not when it turns unhealthy, so pair health checks with an external tool that watches for unhealthy containers and restarts them.
2. Monitoring and alerting - Set up a monitoring tool like Uptime Kuma or Grafana with a health check dashboard that alerts you when containers stay unhealthy. This is especially useful when you're running services you don't check daily.
3. Docker Compose best practices - If you haven't already, check out our guide on Docker Compose Best Practices to learn more about structuring your Compose files properly.
4. Docker for homelabs - If you're new to Docker entirely, our Docker for Homelabs guide covers the fundamentals you need to get started.
5. Docker volumes and data persistence - Health checks are just one piece of the puzzle. Learn about Docker Volumes vs Bind Mounts to make sure your data survives container restarts.
Frequently Asked Questions
What's the difference between Docker health checks and restart policies?
Restart policies (like restart: always) control when Docker restarts a container based on its process state - if the main process exits, Docker restarts it. Health checks tell you (and other systems) whether the container is actually working, regardless of whether the process is still running. They serve different purposes and work best together.
Can I have multiple health checks for one container?
No, Docker only supports one HEALTHCHECK instruction per container. If you need to check multiple conditions, create a script that checks everything and exits with code 0 only if all conditions pass. For example, a script that checks both the database connection and the cache connection.
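A combined check script might look like the sketch below. It is one possible shape under a few assumptions: check_all is a hypothetical helper, and the hostnames db and redis in the example comment are placeholders for your own service names.

```shell
#!/bin/sh
# Sketch of a combined health check: succeed only if every probe succeeds.
# `check_all` is a hypothetical helper; each argument is one probe command line.
check_all() {
    for probe in "$@"; do
        if ! sh -c "$probe" >/dev/null 2>&1; then
            echo "failed: $probe" >&2
            return 1
        fi
    done
    return 0
}

# Example for an app container (hostnames `db` and `redis` are placeholders):
#   check_all "pg_isready -h db -U postgres" "redis-cli -h redis ping" || exit 1
check_all "true" "true" && echo "all probes passed"    # prints: all probes passed
```

Stopping at the first failure keeps the script fast, which matters because it runs on every interval.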
Do health checks add significant overhead to my server?
Not if you set them up correctly. A lightweight health check (like a simple HTTP HEAD request or pg_isready) runs in milliseconds and uses negligible resources. The problem is when you make the health check itself expensive - like running complex database queries. Keep them simple and you won't notice the overhead.
Will Docker restart my container when it's unhealthy?
No, not by default. Docker only records the health status - it doesn't automatically restart containers based on health check results. To get automatic restarts, you need an external tool like systemd, a monitoring service, or a custom script that watches for unhealthy containers and restarts them.
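A custom watchdog can be quite small. The sketch below is one way to write it, assuming you run it periodically from cron or a systemd timer; restart_unhealthy is a hypothetical helper name, and it uses only docker ps with the health filter and docker restart.

```shell
#!/bin/sh
# Sketch of an external watchdog: restart every container Docker reports as unhealthy.
# `restart_unhealthy` is a hypothetical helper; schedule it with cron or a systemd timer.
restart_unhealthy() {
    # List the names of containers whose health status is "unhealthy".
    docker ps --filter health=unhealthy --format '{{.Names}}' |
    while read -r name; do
        [ -n "$name" ] || continue
        echo "restarting $name"
        docker restart "$name" >/dev/null
    done
}

# Example: restart_unhealthy
```

A restart hides the symptom rather than the cause, so keep an eye on how often the watchdog fires.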
How do I check health status from outside the container?
You can use docker inspect to get detailed health info, or tools like Portainer, Uptime Kuma, and Grafana that can query Docker's API. The docker ps command shows health status with the "(healthy)" or "(unhealthy)" label next to each container.
Wrapping Up
Docker health checks are one of those features that seem optional until you've experienced the pain of a silently broken container. They're not complicated to set up, but they make a massive difference in how reliably your services run.
The key takeaways I hope you'll remember:
- Always set a start_period for services that take time to initialize
- Use depends_on with condition: service_healthy to control startup order
- Keep your health check commands lightweight
- Test your health checks by deliberately breaking your application
- Combine health checks with external monitoring for the best results
If you're running a homelab with multiple Docker services, adding health checks is one of the highest-ROI things you can do. It takes five minutes to add them to your Compose files, and they'll save you hours of debugging down the line.
*This article is part of our Docker Fundamentals series. For more on securing your Docker setup, see our Homelab Security Best Practices guide.*
