From the Keyboard of Zachary Wagner

Just a few things on my mind.

If you've been using Home Assistant long enough, you've been woken up at 2am by a notification that shouldn't have fired. A sensor came back online, a service restarted, HA rebooted — and suddenly your phone is buzzing about a “power outage” that never happened.

The root cause is almost always the same: automations that don't account for the full lifecycle of entity states.

This post covers a hardening system I've developed to eliminate those false triggers — including from_state/to_state guards, the availability template pattern, and how to bake these protections into blueprints so every automation gets them for free.


The Problem: Entities Don't Just Toggle

Most automations are written assuming an entity transitions cleanly between on and off. In reality, entities go through a much messier lifecycle:

unavailable → on → off → unavailable → unknown → on

This happens constantly:

  • HA restarts and entities briefly report unavailable
  • A network blip drops a Zigbee device
  • An MQTT service restarts and retained values are republished
  • A template sensor recalculates when its source entity comes back online

Each of these transitions can look like a real state change to an automation. Without guards, every one of them can fire your notification, trigger your lights, or run your scripts.


The Two-Guard Pattern

The most important hardening technique is guarding both the from_state and to_state of every trigger:

conditions:
  - condition: template
    value_template: >
      {{ trigger.from_state.state not in ['', 'unavailable', 'unknown'] }}
  - condition: template
    value_template: >
      {{ trigger.to_state.state not in ['', 'unavailable', 'unknown'] }}

Most people only guard to_state — checking that the new state isn't unavailable. But the from_state guard is equally important. Without it, the transition unavailable → on passes right through, which is exactly what happens when a sensor comes back online after a restart.

Both states must be real values for the automation to proceed.


The Availability Template Pattern

For template sensors, the old pattern for handling unavailable sources looked like this:

# Old pattern — prone to issues
state: >
  {% set s = states('sensor.some_entity') %}
  {% if s in ['unavailable', 'unknown', 'none'] %}
    {{ None }}
  {% else %}
    {{ s == 'on' }}
  {% endif %}

The problem is that returning None from a state template just sets the state to the string "None" — not actually unavailable. So you still get state transitions that can trigger automations.

The correct approach is to use the availability template, which is specifically designed for this:

state: >
  {{ states('sensor.some_entity') == 'on' }}
availability: >
  {{ states('sensor.some_entity') not in ['unavailable', 'unknown'] }}

When availability returns false, the sensor itself becomes unavailable in HA. This means the transition on restart becomes unavailable → on instead of off → on — and the from_state guard catches it.

For sensors with multiple dependencies, use the list pattern:

availability: >
  {% set entities = [
      'binary_sensor.some_sensor',
      'input_boolean.some_boolean',
      'switch.some_switch',
  ] %}
  {{ entities | select('is_state', 'unavailable') | list | count == 0
     and entities | select('is_state', 'unknown') | list | count == 0 }}

Baking Guards Into Blueprints

Writing these conditions into every automation manually is error-prone and inconsistent. The better approach is to encode them into blueprints so every automation instance gets them automatically.

Here's a hardened binary sensor blueprint:

blueprint:
  name: When Binary Sensor is Toggled
  description: |
    Triggers user-specified actions when a binary sensor changes to on and off.
    Ignores unavailable/unknown states and attribute-only changes.
  domain: automation
  input:
    binary_sensor:
      name: Binary Sensor
      selector:
        entity:
          domain: binary_sensor
    on_action:
      name: On Action
      default: []
      selector:
        action: {}
    off_action:
      name: Off Action
      default: []
      selector:
        action: {}

trigger:
  - platform: state
    entity_id: !input binary_sensor

condition:
  - condition: template
    value_template: >
      {{ trigger.from_state.state not in ['', 'unavailable', 'unknown'] }}
  - condition: template
    value_template: >
      {{ trigger.to_state.state not in ['', 'unavailable', 'unknown'] }}
  # ignore attribute-only changes, where the state string itself is unchanged
  - condition: template
    value_template: >
      {{ trigger.from_state.state != trigger.to_state.state }}

action:
  - choose:
      - conditions:
          - condition: state
            entity_id: !input binary_sensor
            state: "on"
        sequence: !input on_action
      - conditions:
          - condition: state
            entity_id: !input binary_sensor
            state: "off"
        sequence: !input off_action
    default: []
mode: single

Every automation using this blueprint is hardened by default. You can create similar blueprints for input booleans, media players, outlets — anything you trigger off regularly.

For even simpler cases, a generic entity state change blueprint covers everything:

blueprint:
  name: When Entity State Changes
  description: |
    Triggers user-specified actions when any entity's state changes.
    Ignores unavailable/unknown states and attribute-only changes.
  domain: automation
  input:
    entity:
      name: Entity
      selector:
        entity: {}
    actions:
      name: Actions
      default: []
      selector:
        action: {}

trigger:
  - platform: state
    entity_id: !input entity

condition:
  - condition: template
    value_template: >
      {{ trigger.from_state.state not in ['', 'unavailable', 'unknown'] }}
  - condition: template
    value_template: >
      {{ trigger.to_state.state not in ['', 'unavailable', 'unknown'] }}
  # ignore attribute-only changes, where the state string itself is unchanged
  - condition: template
    value_template: >
      {{ trigger.from_state.state != trigger.to_state.state }}

action: !input actions
mode: single

Automations using this blueprint become pure configuration:

alias: When There is a New Channels DVR Recording
use_blueprint:
  path: zackwag/entity_state_change.yaml
  input:
    entity: sensor.channels_dvr_latest_recording
    actions:
      - action: script.channels_dvr_new_recording_handler
        metadata: {}

When NOT to Use These Guards

Not every automation should use these guards. The pattern is deliberately designed to ignore unavailable — but sometimes you want to know when something goes unavailable.

A backup monitoring automation is a good example:

triggers:
  - trigger: state
    entity_id: sensor.container_backup_status
    to: failed
  - trigger: state
    entity_id: sensor.container_backup_status
    to: unavailable
  - trigger: state
    entity_id: sensor.container_backup_status
    to: unknown

If your backup service goes dark, that's exactly the alert you want. Adding the unavailability guards here would defeat the purpose. Use your judgment — the guards are the right default, but they're not universal.


The Full Defense Stack

For critical alerts like UPS power monitoring, you can combine all of these techniques into a robust defense:

  1. availability on template sensors — prevents unavailable from masquerading as off
  2. from_state guard — blocks transitions out of unavailable
  3. to_state guard — blocks transitions into unavailable
  4. Connectivity condition — suppresses alerts when the monitoring service itself is down
  5. Persistent last values — prevents MQTT services from republishing retained values on restart

Each layer handles a different failure mode. The result is an alerting system you can actually trust — when your phone buzzes, something real happened.
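
Here's a rough sketch of how layers 2 through 4 look together in the conditions of a UPS alert automation. The connectivity sensor name is hypothetical, purely for illustration:

conditions:
  # both sides of the transition must be real values
  - condition: template
    value_template: >
      {{ trigger.from_state.state not in ['', 'unavailable', 'unknown'] }}
  - condition: template
    value_template: >
      {{ trigger.to_state.state not in ['', 'unavailable', 'unknown'] }}
  # suppress alerts when the monitoring service itself is down
  # (binary_sensor.ups_monitor_connectivity is a made-up entity name)
  - condition: state
    entity_id: binary_sensor.ups_monitor_connectivity
    state: "on"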


Summary

The core principles:

  • Always guard both from_state and to_state — not just to_state
  • Use availability templates instead of returning None from state templates
  • Encode guards into blueprints so they're applied consistently
  • Know when to break the rules — some automations should fire on unavailable

Once you've applied this system across your automations, the 2am false alerts stop. And when a real alert fires, you'll actually pay attention to it.

We've all been there. You're wiring up a new Docker stack, things are finally working, and you commit and push before realizing your password is sitting right there in plain text in your compose.yaml. In my case it was MLB credentials for mlbserver. Oops.

Here's how I cleaned it up and what I'm doing going forward.

What Happened

I committed a compose.yaml with credentials hardcoded directly in the environment block:

environment:
  - account_username=zackwag@gmail.com
  - account_password=hunter2

Pushed it to a public GitHub repo. Caught it quickly, reset the password, but the damage was done — the secret was in the git history even after I deleted the file.

Removing It From History

My first instinct was BFG Repo Cleaner, but BFG matches on filename only — not path. Since I have multiple compose.yaml files across my stacks, that was a non-starter.

git filter-repo supports path filtering, which is exactly what I needed:

cd /tmp
git clone https://github.com/zackwag/docker.git
cd docker
git filter-repo --path opt/stacks/channels-addons/compose.yaml --invert-paths
git remote add origin https://github.com/zackwag/docker.git
git push --force origin main

Worth noting: git filter-repo refuses to run on a non-fresh clone by default. Clone fresh, run it there, force push. Don't fight it.

The Right Pattern Going Forward

The fix is straightforward — .env files. Keep secrets out of the compose file entirely and reference them as variables.

compose.yaml

environment:
  - account_username=${MLB_USERNAME}
  - account_password=${MLB_PASSWORD}

.env (never committed)

MLB_USERNAME=zackwag@gmail.com
MLB_PASSWORD=your_password_here

.env.example (committed as a template)

MLB_USERNAME=
MLB_PASSWORD=

.gitignore

.env

Docker Compose picks up .env automatically from the same directory as your compose.yaml. No extra configuration needed.

Not Everything Needs to Be a Secret

Worth calling out — not everything in your compose file needs to move to .env. In my Caddy stack I have things like DOMAIN, EMAIL, upstream IPs, and internal TLDs. None of that is sensitive. The rule of thumb:

  • Secrets → .env (passwords, tokens, API keys)
  • Config → fine in compose.yaml (domains, IPs, emails, paths); see the sketch below
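
Concretely, the split might look something like this in a compose file. All names and values here are illustrative, including the token variable:

environment:
  # plain config, fine to commit
  - DOMAIN=example.com
  - EMAIL=admin@example.com
  - UPSTREAM_IP=192.168.1.20
  # secrets still come from .env
  - API_TOKEN=${CADDY_API_TOKEN}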

Bonus: Nuking Your Git History Entirely

Since I'd already made a mess of the history, I took the opportunity to squash everything down to a single clean commit:

git checkout --orphan fresh
git add -A
git commit -m "Initial commit"
git branch -D main
git branch -m main
git push --force origin main

Clean slate. Felt good.

Takeaways

  • Add .gitignore and .env.example before you write your first compose.yaml
  • If you do commit a secret, reset it immediately — history cleanup is hygiene, not the fix
  • git filter-repo is the right tool for surgical history rewrites
  • Public repo bots are fast. Assume any exposed secret was seen.

Back in February, I wrote about how I finally gave my home lab a real backup strategy using a containerized Flask server, rclone, and OneDrive. The solution worked well — but it only worked for a single host. If you run containers across multiple machines, you were on your own.

That changes with v2.0.

What Was Missing

The original setup was straightforward: one container, one host, one containers.json, and a cron job to kick things off. It solved the problem I had at the time.

But home labs grow. As I added more hosts, I found myself duplicating the setup and having no single place to check on the health of all my backups. That itch needed scratching.

Introducing the Hub/Spoke Architecture

v2.0 introduces a hub/spoke model for multi-host backup orchestration. Every instance of flask-container-backup is now a spoke by default — it behaves exactly as it did in v1.0. Nothing breaks.

The new piece is the hub. Set MODE=hub in your environment, point it at a spokes.json config file listing your remote spoke agents, and you now have a central orchestrator that can coordinate backups and aggregate status across your entire fleet.

compose.yaml

environment:
  - MODE=hub

spokes.json

[
  { "name": "host-a", "url": "http://192.168.1.10:2128" },
  { "name": "host-b", "url": "http://192.168.1.11:2128" }
]

One guardrail worth noting: if you set MODE=hub but spokes.json is missing or empty, the container will exit at startup with a fatal error. No silent failures.
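
For completeness, a hub's service definition might look something like this. It's a sketch, not a drop-in config; the container name and config path are placeholders:

services:
  backup-hub:
    image: zackwag/flask-container-backup:latest
    container_name: backup-hub
    restart: unless-stopped
    ports:
      - 2128:2128
    volumes:
      # spokes.json (and the rest of the config) lives here
      - /docker/backup-hub/config:/app/config
    environment:
      - MODE=hub
      - TZ=America/New_York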

New: The /status Endpoint

Every spoke now exposes a GET /status endpoint that returns the result of the last backup run — including a timestamp, which containers were backed up, and any errors encountered. Results are also written to backup_result.json after each run, so they survive a container restart.

{
  "timestamp": "2026-04-08T13:00:00",
  "containers_backed_up": ["caddy", "freshrss", "homeassistant"],
  "errors": []
}

The hub aggregates this across all configured spokes when you hit its own /status endpoint, giving you a unified view of backup health across every host in one call.

Config Consolidation

All config files — containers.json, spokes.json, and the new backup_result.json — now live in /app/config, which maps to a single mounted volume. Cleaner, and easier to manage.

Upgrading

If you're already running v1.0, upgrading is non-breaking. Pull the new image, update your compose file to mount /app/config, and you're done. The MODE environment variable defaults to spoke, so existing single-host setups continue to work exactly as before.

services:
  container-backup:
    image: zackwag/flask-container-backup:latest
    container_name: container-backup
    restart: unless-stopped
    ports:
      - 2128:2128
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /docker/container-backup/config:/app/config
      - /docker:/source
      - onedrive-backup:/destination
    environment:
      - PYTHONUNBUFFERED=1
      - TZ=America/New_York

The image is available on Docker Hub at zackwag/flask-container-backup, and the source is on GitHub.

I run several containers in my home lab, and the missing piece of the puzzle has been backups. That is to say, I had no backup strategy at all. In my spare time I've been working out how to back up the container storage, and I'm pretty satisfied with my current solution.

Prerequisites

I keep all of my containers' persistent data on the host machine in folders of the form /docker/{service-name} (/docker/caddy, for instance). That's the only tree that needs to be backed up, since it's the hardest thing to recreate.

Some of my containers use SQLite, and some spin up a separate database container (MySQL). Copying the persistent data while a container is running could cause corruption, so each container needs to be stopped before the copy and started again afterward.

Finally, I want maximum flexibility, so the backup should be nothing more than a plain copy of the data on the filesystem.

Solution

To back up the persistent data in a simple way, I needed to mount the backup destination into the filesystem (rclone), run a small server that can take requests and perform the business logic (Flask), keep the whole thing ephemeral (Docker), and be able to trigger a backup at any time (REST).

rclone

Rclone is an application that allows you to mount cloud storage as a logical drive and perform I/O operations against it. Since I am a Microsoft 365 subscriber, I chose to use OneDrive.

Flask Server

I wanted a container that could perform actions in response to either RESTful commands or cron jobs, so I created flask-cron-server, which spins up a Flask server on port 2128.

From there I was able to create server.py, the file that runs the Flask server and is executed on container startup. Here's how it works:

It reads in a JSON file called /app/config/containers.json that defines all the containers along with the folder that should be archived and where that archive should be stored.

The call to backup is executed in a separate thread, and a 202 Accepted response is sent to the caller immediately to let them know the command was received, since it's unknown how long the backup will take.

Finally, the whole thing is driven by a simple JSON file:

[
    {
        "container_name": "caddy",
        "source_folder": "/source/caddy",
        "destination_folder": "/destination/caddy",
        "retention_days": 7
    },
    {
        "container_name": "freshrss",
        "source_folder": "/source/freshrss",
        "destination_folder": "/destination/freshrss",
        "retention_days": 7
    },
    {
        "container_name": "guacamole",
        "source_folder": "/source/guacamole",
        "destination_folder": "/destination/guacamole",
        "retention_days": 7
    },
    {
        "container_name": "mosquitto",
        "source_folder": "/source/mosquitto",
        "destination_folder": "/destination/mosquitto",
        "retention_days": 7
    },
    {
        "container_name": "ps5-mqtt",
        "source_folder": "/source/ps5-mqtt",
        "destination_folder": "/destination/ps5-mqtt",
        "retention_days": 7
    },
    {
        "container_name": "slash",
        "source_folder": "/source/slash",
        "destination_folder": "/destination/slash",
        "retention_days": 7
    },
    {
        "container_name": "write-freely",
        "source_folder": "/source/writefreely",
        "destination_folder": "/destination/writefreely",
        "retention_days": 7
    },
    {
        "container_name": "homeassistant",
        "source_folder": "/source/ha",
        "destination_folder": "/destination/ha",
        "retention_days": 14
    }
]

Docker Container

The final step was to create the Docker container and Compose stack:

services:
  container-backup:
    image: zackwag/flask-container-backup:latest
    container_name: container-backup
    restart: unless-stopped
    ports:
      - 2128:2128
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /docker/container-backup/config:/app/config
      - /docker:/source
      - onedrive-backup:/destination
    environment:
      - PYTHONUNBUFFERED=1
      - TZ=America/New_York
volumes:
  onedrive-backup:
    driver: rclone
    driver_opts:
      remote: onedrive:backup
      allow_other: "true"
      vfs-cache-mode: writes
networks: {}

I'm in the New York time zone, so you'll want to change TZ to match where you live.

Also, I made sure to include /var/run/docker.sock:/var/run/docker.sock:ro so that I could start and stop containers.

Finally, I followed the directions for the rclone Docker Volume Plugin. This allowed me to create the volume onedrive-backup, which points to the /backup folder in OneDrive.

RESTfully Performing Backups

Now that the container is running, I can simply call

curl -X POST {IP ADDRESS}:2128/backup

to back up all of the containers specified in containers.json, or

curl -X POST {IP ADDRESS}:2128/backup/{container name}

to back up a specific container.

I have set up a daily automation in Home Assistant that calls the main /backup endpoint to kick off backups.
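
For what it's worth, one way to wire that up is a rest_command plus a time-triggered automation. The IP and time below are placeholders:

# configuration.yaml
rest_command:
  backup_containers:
    url: "http://{IP ADDRESS}:2128/backup"
    method: post

# automations.yaml
- alias: Nightly Container Backup
  trigger:
    - platform: time
      at: "03:00:00"
  action:
    - service: rest_command.backup_containers
  mode: single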