GitOps for Network Engineers - Deploying Nautobot
Deploying Our First Network Automation App - Nautobot!

Previous Articles in the Series
Bridging the Gap: GitOps for Network Engineers - Part 1 (Deploying ArgoCD)
Bridging the Gap: GitOps for Network Engineers - Part 2 (Deploying Critical Infrastructure with ArgoCD)
Intro
Here we go! Time to deploy something network automation engineers actually use: Nautobot. For those who are unfamiliar, Nautobot is an open-source Network Source of Truth and automation platform. It gives you a clean API, GraphQL, plugins, and jobs for modeling your network and driving intent-based automation. In a GitOps workflow, Nautobot becomes the living database of network intent and inventory, while Argo CD ensures the platform itself is deployed and maintained declaratively. It’s one of my favorite tools because you can’t have a solid network automation foundation without a solid source of truth (okay, “source of intent” if you prefer). Either way, Nautobot is among the best; kudos to the Network to Code team for a great product. Before we dive in, let’s quickly recap previous GitOps for Network Engineers posts. If you haven’t read those yet, I’d recommend starting there first. The links are posted above.
Part 1 established the groundwork: why GitOps matters for network engineers (intent-as-code, reviews, rollbacks), installing Argo CD, connecting it to Git, and proving the reconcile loop with a simple, Git-managed deployment.
Part 2 leveled that foundation into a production-ready platform. We declaratively integrated:
MetalLB for external service IPs
Traefik for ingress routing and TLS
Rook-Ceph for durable, cluster-native storage
A secrets stack using External Secrets backed by HashiCorp Vault, all continuously managed by ArgoCD.
As a result, the platform can now:
Expose apps securely via external IPs and ingress rules
Persist data with Ceph-backed volumes
Manage secrets without committing them to Git
Treat infrastructure the same as applications: defined in code, reconciled by Argo CD
Instead of stamping this post as “Part 3,” I’m branching it off from the foundation posts. That gives me room to play with future installments while still keeping them under the GitOps for Network Engineers umbrella when it makes sense. The goal here is simple: bring a basic Nautobot deployment online, fully managed by ArgoCD, using the same GitOps patterns we established earlier. Specifically, we will:
Add the main Nautobot Helm chart to ArgoCD
Define (or confirm) a StorageClass for Nautobot’s persistent needs
Allocate a MetalLB IP for Traefik to serve Nautobot externally
Create Secrets for DB, Redis, and an initial Nautobot superuser
Compose Kustomize resources to wrap Helm and environment overlays
Author a custom values.yml for your environment
Deploy the app
When we are done, our deployment will include five pods in total:
Nautobot Web (frontend/API) - serves the UI plus REST/GraphQL endpoints
Nautobot Celery Worker - executes background jobs and plugin tasks
Nautobot Celery Beat - schedules periodic tasks for the worker
PostgreSQL - primary application database for Nautobot objects/state
Redis - cache and message broker backing Celery queues
This deployment will not include any building of custom container images, Nautobot plugins, or custom Nautobot configurations. I’m planning that for a future post.
Let’s dive in.
Adding Nautobot’s Helm Chart
First things first: let’s add the Nautobot Helm chart to Argo CD. If you followed the earlier posts, this will feel familiar. In the examples below, I’m using my prod-home Argo CD Project; you’ll see that name throughout. Your Project name can (and likely will) be different; substitute your own wherever you see prod-home.
Step 1: Add the Helm Repo
- Helm Repo URL:
https://nautobot.github.io/helm-charts/
In the ArgoCD UI:
Go to Settings → Repositories
Click + CONNECT REPO
Enter the Helm repo URL
Choose Helm as the type
Give the repo a name (Optional)
Choose the project you created earlier to associate this repo with (mine was ‘prod-home’)
No authentication is needed for this public repo
When done, click CONNECT
Once added, ArgoCD can now pull charts from this source.
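If you’d rather keep even this step in Git, Argo CD also accepts repositories declared as labeled Secrets in its own namespace. A hedged sketch, assuming Argo CD runs in the argocd namespace and your Project is named prod-home:

```yaml
# Sketch only: the same Helm repo connection, declared declaratively.
# Argo CD discovers repository Secrets by this label in its own namespace.
apiVersion: v1
kind: Secret
metadata:
  name: nautobot-helm-repo
  namespace: argocd          # <-- wherever Argo CD is installed
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  name: nautobot
  type: helm
  url: https://nautobot.github.io/helm-charts/
  project: prod-home         # scope the repo to your Argo CD Project
```

Either way works; the UI steps above and this manifest produce the same repository entry.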

Note: As seen in Part 2, you’ll also need to add the GitHub repo that contains your custom configuration files, like Helm values.yml files and Kustomize overlays.
If you're using my example repo, add https://github.com/leothelyon17/kubernetes-gitops-playground.git as another source, of type Git. If you're using your own repo, just make sure it's added in the same way so ArgoCD can pull your values and overlays when syncing.

Step 2: Create the ArgoCD Application
Head to the Applications tab and click + NEW APP to start the deployment.
Here’s how to fill it out:
Application Name: nautobot (or in my case, nautobot-prod)
Project: Select your project (e.g., prod-home)
Sync Policy: Manual for now (we’ll automate later)
Repository URL: Select the Helm repo you just added
Chart Name: nautobot
Target Revision: Use the latest or specify a version (latest is recommended)
Cluster URL: Use https://kubernetes.default.svc if deploying to the same cluster (mine differs from the default; don’t worry if yours does too)
Namespace: nautobot or nautobot-prod to match the ArgoCD application name. Check the box to create the namespace if it doesn’t already exist in your Kubernetes cluster
Click CREATE when finished.
If everything is in order, you should see the App created like the screenshot below, though yours will show an all-yellow status and ‘OutOfSync’ -

Just like before, ArgoCD will immediately show you all the Kubernetes objects it plans to create. Don’t hit Sync yet. We haven’t configured the databases, secrets, or persistent storage, so a deploy right now would fail; the databases would fail to mount their volumes. We’ll get there.
For this first section, the goal was simple: pull in the main Nautobot Helm chart, which we’ve done. In previous posts, we’d usually fine-tune the ArgoCD Application to point at our Kustomize overlays or custom helm values. We’ll come back to that once all those pieces exist; if you do this in the Application now ArgoCD will fail on the missing paths. Onward.
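For reference, the UI steps above boil down to a minimal Application manifest like the sketch below: a single Helm source, no custom values yet. The version pin shown is one known-good choice; adjust to taste.

```yaml
# Sketch of the Application the UI clicks create (before values/overlays).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: nautobot-prod
  namespace: argocd               # <-- wherever Argo CD is installed
spec:
  project: prod-home
  source:
    repoURL: https://nautobot.github.io/helm-charts/
    chart: nautobot
    targetRevision: 2.5.5         # or leave at latest
  destination:
    server: https://kubernetes.default.svc
    namespace: nautobot-prod
  syncPolicy:
    syncOptions:
      - CreateNamespace=true      # matches the "create namespace" checkbox
```

We’ll extend this later with a second (Git) source for values and overlays.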
Overview for Nautobot’s Helm Values
Here we’ll take a quick pass over Nautobot’s default Helm values so we know exactly where our overrides will land later.
Defaults can be found here:
https://github.com/nautobot/helm-charts/blob/develop/charts/nautobot/values.yaml
For this deployment, we’ll customize these core sections:
superuser - bootstrap admin (username/email/password).

postgresql - point at our Postgres (in-cluster or external), version, storage, and connection settings.

redis - enable/disable and wire the cache/queue endpoint (persistence optional).

A few optional knobs worth calling out:
Replicas: under both nautobot and celery, you can set replicas: 1 for dev or tight clusters; bump later as you scale. I will be setting the replicas to ‘1’.
Image: under nautobot.image, set a specific tag (or a custom image) if you don’t want “latest.” Unless you know what you are doing, leave the defaults for this deployment.
Ingress: the chart can create it, but we’re keeping that off and handling exposure via our Kustomize IngressRoute pattern.
That’s it for the big call-outs. We’ll circle back and set those values once the rest of the pieces (storage, secrets, and ingress) are in place later in the post.
Add Persistent Storage
For Nautobot, the one thing that absolutely needs persistence is the primary database, by default that’s PostgreSQL, and it should live on durable storage. Redis handles caching/queuing, and persistence there is optional: if you need cached data to survive pod restarts or rolling updates, back it with a PVC; otherwise keep it ephemeral and let it rebuild as needed.
In Part 2 we created two CephFS storage classes with Rook-Ceph. For this post I’m using the rook-cephfs-retain class for Postgres and rook-cephfs-delete for Redis (optional), as we’ll see later in our custom Helm values.
CephFS StorageClass (Retain)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Name you’ll reference from PVCs (spec.storageClassName)
  name: rook-cephfs-retain
# CSI driver that provisions CephFS-backed volumes via Rook
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # ----- Tell the CSI driver which Ceph cluster/filesystem to use -----
  # Namespace where your Rook-Ceph cluster runs (operator, mons/osds, etc.)
  # If your cluster is in a different namespace, update this and the secrets below.
  clusterID: rook-ceph
  # Name of the CephFS filesystem (created during CephFS setup)
  # You can confirm with `ceph fs ls`.
  fsName: k8s-ceph-fs
  # Ceph pool backing the filesystem (required when provisionVolume is true)
  # Must match the pool configured for your fsName.
  pool: k8s-ceph-fs-replicated
  # ----- CSI secrets for provisioning/expansion/node-stage (auto-created by Rook) -----
  # Secret used by the provisioner sidecar to create volumes
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  # Secret used by the controller for volume expansion operations
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  # Secret used on the node to stage/mount volumes
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  # ----- Optional: choose the client implementation for CephFS mounts -----
  # If omitted, CSI auto-detects. Kernel client is typical in prod.
  # mounter: kernel
# Keep PVs (and data) when PVCs are deleted; safer for DBs and long-lived data
reclaimPolicy: Retain
# Allow growing PVCs in place (kubectl patch ... size: 40Gi, etc.)
allowVolumeExpansion: true
# Mount-time options passed to the client
mountOptions:
  # Uncomment for verbose client debug logs during troubleshooting
  # - debug
Why choose Retain vs Delete
Retain keeps the PV (and data) when its PVC is deleted. Use it for anything you don’t want accidentally destroyed (databases, long-lived app data, easy rollbacks). The trade-off is manual cleanup later.
Delete removes the PV and backend data when the PVC goes away. Great for ephemeral/dev workloads where you don’t care about the data. Trade-off: once it’s gone, it’s gone.
Why allowVolumeExpansion is important
Lets you grow PVCs in place as your data grows (no migrate-and-restore dance).
With CephFS + CSI, online expansion is supported; Kubernetes handles the resize.
You still need available capacity in the Ceph cluster. This just makes growth operationally simple.
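As a sketch of what that looks like in practice, growing a volume is a one-field change on the PVC that Kubernetes and the CSI driver act on. The claim name below is hypothetical; use your actual Postgres PVC.

```yaml
# Hypothetical PVC grow: only spec.resources.requests.storage changes.
# Apply via Git, `kubectl edit`, or `kubectl patch`.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-nautobot-prod-postgresql-0   # <-- illustrative name; check yours
  namespace: nautobot-prod
spec:
  storageClassName: rook-cephfs-retain
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 40Gi   # bumped up from the original size; shrinking is not supported
```

Because the class sets allowVolumeExpansion: true, this resize happens online with no migrate-and-restore dance.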
Use this class for your Nautobot Postgres PVC. Redis persistence is optional. Enable it only if you truly need cache durability.
Add this storage class (or classes) to your Rook-Ceph deployment if you haven’t already, and let’s move forward.

MetalLB IP Pool for Traefik
Before we can expose apps to the outside world, Traefik needs an externally reachable IP from MetalLB. “Public” here just means outside the cluster (it can still be RFC1918). Since we already set up MetalLB in the earlier posts, this is a quick tweak.
1) Give MetalLB an address to hand out
Add a single IP (or a range) to your existing IPAddressPool. I like dedicating a single /32 for Traefik so DNS stays stable.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: prod-traefik-pool
  namespace: metallb-prod
spec:
  addresses:
    - 192.168.101.161/32   # Traefik LB IP
If you don’t already have one, pair the pool with an L2Advertisement (MetalLB won’t announce addresses without it):
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: prod-traefik-l2adv
  namespace: metallb-prod
spec:
  ipAddressPools:
    - prod-traefik-pool
Note: Pick an unused IP in your LAN (outside DHCP scope). Then sync your Argo CD app for MetalLB.
2) Pin that IP on the Traefik Service
In your Traefik Helm values, set the Service to LoadBalancer and assign the static IP:
service:
  enabled: true
  type: LoadBalancer
  spec:
    loadBalancerIP: 192.168.101.161
    # optional, preserves client source IP if you care about logs:
    externalTrafficPolicy: Local
Sync your Traefik app. You should see the EXTERNAL-IP appear:
kubectl -n kube-system get svc
traefik-prod LoadBalancer 10.233.23.76 192.168.101.161 32400:30228/TCP,80:31007/TCP,443:32150/TCP 108d
3) (Optional) DNS now or later
Once the IP is live, create a DNS A record (e.g., nautobot.example.local → 192.168.101.161). We’ll wire the IngressRoute host to match this in the next steps.
That’s it. Traefik now has a stable, outside-facing address; we can safely publish Nautobot behind it.
Exposing Nautobot using an Ingress Route
With Traefik now holding an external IP, we can move on to exposing Nautobot to the outside world. Time to configure the IngressRoute so users and devices can reach it.
This part is straightforward if you already have an ingress controller. If not, jump back to the Part 2 post for deploying Traefik in-cluster. By default, the Nautobot Helm chart does not create any Ingress/IngressRoute resources.

You can use the Nautobot chart values to let it create ingress, but we’re leaving those at the defaults. Instead, we’ll handle exposure in the overlay with a Traefik IngressRoute. I prefer this split: Helm owns the app; Kustomize owns how it’s exposed. It’s a repeatable, cookie-cutter pattern across apps and keeps odd edge cases out of chart values. The goal here is simple: publish the web UI outside the cluster. Nothing fancy.
A working IngressRoute example is below, and can also be found in my GitOps Playground repository in the apps/nautobot/overlays/prod folder -
---
# --> (Example) Create an IngressRoute for your service...
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nautobot-prod-ingressroute   # <-- Replace with your IngressRoute name
  namespace: nautobot-prod           # <-- Replace with your namespace
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`nautobot.home.nerdylyonsden.io`)   # <-- Replace with your FQDN
      kind: Rule
      services:
        - name: nautobot-prod-default   # <-- Replace with your service name
          port: 80
  # --> (Optional) Add certificate secret
  tls:
    secretName: prod-apps-certificate-secret   # <-- cert-manager will store the created certificate in this secret.
  # <--
The main points to cover here are:
Namespace - Make sure the manifest’s namespace matches where Nautobot will live.
EntryPoints - Use only websecure so traffic is encrypted at least up to Traefik inside the cluster.
Host rule - routes.match must match the public DNS A record users will hit for Nautobot.
Service wiring - services.name and services.port must match the Nautobot Service. In my setup the name is <namespace>-default; adjust if yours differs.
Port - Defaults to 80 unless you’ve changed it in the Service.
TLS / certs - If you have a cluster cert solution (e.g., cert-manager), wire it here. If not, leave the TLS section out for now; I’ll cover this in an advanced post.
Note: To check the Service name and port, you can click into the app in ArgoCD (whether fully deployed or not), then click the Service → Summary Tab → Desired Manifest as shown below -


Note (again): The Service also exposes port 443, but we’re not using it. Nautobot needs additional app-level config to terminate HTTPS directly. For now we’ll keep TLS at Traefik and speak HTTP to the Service. End-to-end HTTPS on Nautobot itself is out of scope for this post (maybe a future one).
Once the IngressRoute is set the way you want, drop it into your environment overlay (e.g., apps/nautobot/overlays/prod/ingress-route.yml) and commit it. That’s it for this piece; on to the next section.
Deploying Securely - Creating Our Secrets
For a starter implementation of Nautobot with some basic security we are going to need the following secrets stored in Vault -
Super User Login Credentials (which will include a password and API token)
Postgres DB Credentials
Redis DB Credentials
We’re going to keep credentials out of Git and let External Secrets (ESO) fetch them from HashiCorp Vault at deploy time. The two things we need to cover here are: (1) enabling Kubernetes authentication in Vault with a role dedicated to Nautobot, and (2) adding the actual secrets into Vault under the /secret path.
You should hopefully have an existing instance of Hashicorp Vault already if you’ve been following along with the previous posts.
Kubernetes Authentication + Nautobot Role
Why we need it:
External Secrets runs inside your cluster. It needs a secure, short-lived way to prove to Vault, “I’m allowed to read only the Nautobot secrets.” Vault’s Kubernetes auth method does exactly that by validating a pod’s service account token against the cluster API and mapping it to a least-privilege policy.
What the role does:
Binds a specific ServiceAccount + Namespace (e.g., the one where Nautobot lives) to a read-only policy for your Nautobot secret paths.
Issues short-lived Vault tokens to ESO when it presents the Kubernetes JWT. No root tokens or static creds in manifests.
Scopes access to exactly the secret paths you choose (nothing more).
The first piece that has to be done (if never done previously) is to enable the Kubernetes authentication method. To enable it through the GUI, follow the steps below:
In the left-hand pane, click Access.
Under Authentication, click Enable New Method (top right).
Under Infra, choose Kubernetes.
Leave the options at their defaults and click Enable Method.
Back on the Authentication Methods list, you should now see kubernetes/ and token/. Click kubernetes/.
Click Configuration (top area), then Configure (right side).
In Configuration, set Kubernetes host to your API URL (I use the Kube-VIP URL from earlier posts). If you don’t have one, you can use
https://kubernetes.default.svc.
Kubernetes auth is now configured. Next, create the Nautobot role.
Create the Nautobot role:
From Authentication Methods, select kubernetes/.
Click Create role (right side).
Use these values (adjust as needed for your environment):
Name: nautobot
Alias name source: serviceaccount_name
Bound service account names: nautobot-prod
Bound service account namespaces: nautobot-prod
Under Tokens → Generated Token’s Policies: add nautobot (we’ll create this policy next)
Leave other token settings at their defaults; other fields can remain blank.
Click Save.
That’s all we need for the Nautobot role. The referenced ServiceAccount will be created by our Helm deployment a bit later.
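If you’d rather codify those clicks, the same role can be declared with the Terraform Vault provider. This is a hedged sketch, not part of the post’s workflow, and it assumes the Kubernetes auth method is already mounted at kubernetes/:

```hcl
# Sketch only: declares the same Nautobot role the UI steps create.
# Assumes the Kubernetes auth method is enabled at the "kubernetes/" mount.
resource "vault_kubernetes_auth_backend_role" "nautobot" {
  backend                          = "kubernetes"
  role_name                        = "nautobot"
  alias_name_source                = "serviceaccount_name"
  bound_service_account_names      = ["nautobot-prod"]
  bound_service_account_namespaces = ["nautobot-prod"]
  token_policies                   = ["nautobot"]
}
```

Either path (UI or code) ends in the same place: a role that maps the nautobot-prod ServiceAccount to the nautobot policy.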
Create the ACL policy:
In the left-hand pane, click Policies.
Click Create ACL policy.
Enter a policy name (e.g., nautobot).
Paste in the policy content. Note: for simplicity, I start from the default policy and add read and list capabilities for the upcoming secrets paths (shown below).
# Allow tokens to look up their own properties
path "auth/token/lookup-self" {
capabilities = ["read"]
}
# Allow tokens to renew themselves
path "auth/token/renew-self" {
capabilities = ["update"]
}
# Allow tokens to revoke themselves
path "auth/token/revoke-self" {
capabilities = ["update"]
}
# Allow a token to look up its own capabilities on a path
path "sys/capabilities-self" {
capabilities = ["update"]
}
# Allow a token to look up its own entity by id or name
path "identity/entity/id/{{identity.entity.id}}" {
capabilities = ["read"]
}
path "identity/entity/name/{{identity.entity.name}}" {
capabilities = ["read"]
}
# Allow a token to look up its resultant ACL from all policies. This is useful
# for UIs. It is an internal path because the format may change at any time
# based on how the internal ACL features and capabilities change.
path "sys/internal/ui/resultant-acl" {
capabilities = ["read"]
}
# Allow a token to renew a lease via lease_id in the request body; old path for
# old clients, new path for newer
path "sys/renew" {
capabilities = ["update"]
}
path "sys/leases/renew" {
capabilities = ["update"]
}
# Allow looking up lease properties. This requires knowing the lease ID ahead
# of time and does not divulge any sensitive information.
path "sys/leases/lookup" {
capabilities = ["update"]
}
# Allow a token to manage its own cubbyhole
path "cubbyhole/*" {
capabilities = ["create", "read", "update", "delete", "list"]
}
# Allow a token to wrap arbitrary values in a response-wrapping token
path "sys/wrapping/wrap" {
capabilities = ["update"]
}
# Allow a token to look up the creation time and TTL of a given
# response-wrapping token
path "sys/wrapping/lookup" {
capabilities = ["update"]
}
# Allow a token to unwrap a response-wrapping token. This is a convenience to
# avoid client token swapping since this is also part of the response wrapping
# policy.
path "sys/wrapping/unwrap" {
capabilities = ["update"]
}
# Allow general purpose tools
path "sys/tools/hash" {
capabilities = ["update"]
}
path "sys/tools/hash/*" {
capabilities = ["update"]
}
# Allow checking the status of a Control Group request if the user has the
# accessor
path "sys/control-group/request" {
capabilities = ["update"]
}
# Allow a token to make requests to the Authorization Endpoint for OIDC providers.
path "identity/oidc/provider/+/authorize" {
capabilities = ["read", "update"]
}
# Allow a token to access nautobot db secrets
# (KV v2 reads go through the secret/data/ prefix)
path "secret/data/nautobot-prod-db-credentials" {
  capabilities = ["read", "list"]
}
# Allow a token to access nautobot superuser secrets
path "secret/data/nautobot-prod-superuser-credentials" {
  capabilities = ["read", "list"]
}
That’s it. Kubernetes Auth, the Nautobot Role, and the policy are set. Let’s finally add our actual secrets to Vault.
Add Secrets to Vault (under secret/)
We’ll store the credentials and app secrets that Nautobot (and its dependencies) need under a clear, predictable hierarchy in the /secret (KV) mount.
What to store for a “basic but secure” deploy:
Superuser: password and API token (for first login and automation).
Database Passwords: Postgres and Redis
If the KV (Key/Value) secrets engine isn’t enabled yet, start here. Otherwise, skip to Create the secrets.
Enable the KV secrets engine
In the left navigation, click Secrets Engines.
Click Enable new engine + (top right).
Choose KV under “Generic.”
Set the Path to secret; leave other options at defaults.
Click Enable Engine.
If this is a fresh Vault and KV wasn’t previously enabled, you should now see it listed alongside the existing engines.

Create the secrets
In the left navigation, click Secrets Engines.
Select the new secret (KV) engine.
Click Create secret + (right side).
For Path, enter nautobot-prod-db-credentials (or your preferred name).
Under Secret data, add a key postgres-pass with its value.
Click Add and create a second key redis-pass with its value.
Click Save.
If completed correctly it should look like below -

Repeat the process for the Superuser secret. Create a new secret with two keys (for example, password and api-token) and save.

How ESO and Vault Work Together (high-level)
Once the role and secrets exist and ArgoCD goes to deploy the app, ESO will:
Use the Kubernetes auth role to obtain a short-lived Vault token (via its ServiceAccount).
Read the exact keys under /secret/nautobot... as defined by your policy.
Materialize a single Kubernetes Secret (or multiple, your call) in the Nautobot namespace with the names/keys your Helm chart expects.
With Vault and External Secrets in place, we now have a clean, Git-free path for credentials: a Kubernetes auth role that scopes exactly who can read what, a tidy set of KV paths for Nautobot’s superuser + databases, and ESO ready to materialize those values as Kubernetes Secrets when Argo CD reconciles. That closes the loop on “secure by default” for this deployment. Next up, we’ll use everything we’ve built so far (storage classes, ingress patterns, secrets) to assemble our Kustomize resources and configure the Nautobot Helm chart the GitOps way.
The Rest of the Kustomize Resources
Earlier we created the IngressRoute Kustomize file to publish Nautobot through Traefik. Now we’ll add the rest of the overlay, mostly focused on integrating in the work from the Secrets section. We’ll also add a top-level kustomization.yml to bundle these pieces so the cluster can build them as a single unit. Once this overlay is in place, everything we’ve prepared (storage, secrets, and ingress) comes together as one declarative package.
The first file will be the ClusterSecretStore - cluster-secret-store.yml
ClusterSecretStore: ESO’s shortcut to Vault
A ClusterSecretStore is a cluster-wide connection profile that tells External Secrets (ESO) how to reach Vault, which KV (“/secret”) mount to read, and how to auth (Kubernetes auth + Vault role). Use ClusterSecretStore to share one Vault setup across namespaces; use SecretStore if you want it namespace-scoped. For a simpler deployment, I chose a ClusterSecretStore.
What this sets:
server - Vault URL reachable from the cluster
path/version - your KV mount (e.g., secret, v2)
auth.kubernetes - use SA token login; role maps SA+namespace → read-only policy
serviceAccountRef - which SA ESO uses to authenticate
Repo Example (with comments):
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: vault-backend   # cluster-wide handle ESO will reference
spec:
  provider:
    vault:
      # Where Vault is reachable from the cluster using a cluster internal URL
      # (Many setups use http://vault.vault.svc:8200 or https with proper CA)
      server: "http://hashi-vault-prod-0.hashi-vault-prod-internal.hashi-vault-prod.svc.cluster.local:8200"
      # The KV mount path and version you enabled in Vault
      path: "secret"   # e.g., 'secret', 'kv', etc.
      version: "v2"    # be explicit to avoid surprises
      # Authenticate to Vault using the Kubernetes auth method
      auth:
        kubernetes:
          mountPath: "kubernetes"   # must match your Vault auth mount path
          role: "nautobot"          # Vault role bound to SA+namespace with read-only policy
          serviceAccountRef:
            name: nautobot-prod        # SA whose token ESO will use for login
            namespace: nautobot-prod   # namespace where that SA lives
How it flows: ESO reads this store → logs into Vault with the SA token → gets a short-lived token for the nautobot role → pulls only the allowed keys → renders Kubernetes Secrets for Helm/Kustomize.
The next pair of files are for the database and superuser secret creation.
ExternalSecrets: mapping Vault data into Kubernetes Secrets
Why these exist: An ExternalSecret tells ESO which Vault keys to read and how to materialize them as a plain Kubernetes Secret that Helm/Kustomize can mount. We’ll use two: one for database/redis creds and one for the Nautobot superuser. An ExternalSecret is namespace-scoped.
Database & Redis ExternalSecret (commented)
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: nautobot-prod-db-external-secret   # ESO resource name
  namespace: nautobot-prod                 # where the resulting K8s Secret will live
spec:
  refreshInterval: "1h"   # re-sync cadence from Vault
  secretStoreRef:
    name: vault-backend   # points to our (Cluster)SecretStore
    kind: ClusterSecretStore
  target:
    name: nautobot-prod-db-secrets   # name of the K8s Secret ESO will create/update
    creationPolicy: Owner            # ESO owns and reconciles this Secret
  data:
    - secretKey: postgres-password   # key inside the K8s Secret
      remoteRef:
        key: secret/data/nautobot-prod-db-credentials   # Vault path (KV v2 HTTP style)
        property: postgres-pass                         # field inside that Vault doc
    - secretKey: password   # duplicate key for charts expecting 'password'
      remoteRef:
        key: secret/data/nautobot-prod-db-credentials
        property: postgres-pass
    - secretKey: redis-password   # Redis password (optional if Redis is unauthenticated)
      remoteRef:
        key: secret/data/nautobot-prod-db-credentials
        property: redis-pass
Superuser ExternalSecret (commented)
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: nautobot-prod-superuser-external-secret
  namespace: nautobot-prod
spec:
  refreshInterval: "1h"
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: nautobot-prod-superuser-secrets   # K8s Secret with Nautobot bootstrap creds
    creationPolicy: Owner
  data:
    - secretKey: password   # superuser password
      remoteRef:
        key: secret/data/nautobot-prod-superuser-credentials
        property: superuser-pass
    - secretKey: api_token   # superuser API token
      remoteRef:
        key: secret/data/nautobot-prod-superuser-credentials
        property: superuser-api-token
Notes
Key naming: The secretKey entries become keys in your Kubernetes Secret. Align them with whatever your Helm values or manifests expect.
KV v2 pathing: Some setups prefer the logical path (e.g., nautobot-prod-db-credentials) rather than the HTTP-style secret/data/.... Use the style that matches how your ClusterSecretStore is configured.
Duped mappings: Having both postgres-password and password mapped to the same Vault value is fine if different consumers expect different key names.
Refresh: refreshInterval controls how quickly rotations in Vault propagate to Kubernetes. Pick something that fits your rotation policy.
Kustomize: Bundling our Resources Together
Time to bundle everything we’ve created into a single overlay Kustomize can build (and Argo CD can track). Keep this file in your environment overlay (e.g., overlays/prod/).
What this overlay does:
Registers Vault access for ESO via the ClusterSecretStore
Pulls database + superuser creds via ExternalSecret objects
Publishes Nautobot through Traefik with our IngressRoute
## kustomization.yml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# The building blocks we created earlier
resources:
  - cluster-secret-store.yml        # ESO → Vault connection (cluster-scoped; namespace here is ignored)
  - external-secrets-db.yml         # Database & Redis credentials from Vault → K8s Secret
  - external-secret-superuser.yml   # Nautobot superuser creds from Vault → K8s Secret
  - ingress-route.yml               # Traefik exposure for Nautobot
Notes
Order of operations: Kustomize doesn’t enforce ordering, but Argo CD will reconcile until everything is healthy. If you want strict sequencing later, you can add Argo CD sync waves via annotations.
Where this fits: Your Argo CD Application will point at this folder (done in the next section). Once synced, ESO will authenticate to Vault, create the Kubernetes Secrets, and Traefik will expose the app host defined in your IngressRoute.
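As a concrete example of the sync-wave annotations mentioned above (the wave number is illustrative), you could tag the ClusterSecretStore so it reconciles before the ExternalSecrets that depend on it:

```yaml
# Illustrative only: Argo CD applies lower waves first, so the store that
# ESO needs exists before the ExternalSecrets that reference it.
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: vault-backend
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # apply before wave 0 resources
```

For this deployment the default retry-until-healthy behavior is enough, so we’ll skip the annotations.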
Commit this file alongside the four resources, and you’ve got a clean, declarative package ready for Argo CD to manage.
The Final Pieces - Custom Helm Values + ArgoCD App Manifest
Custom Helm values (values-prod.yml)
This file wires Nautobot to the secrets that will be deployed, dials replicas down for a tidy first deploy, and pins persistence to the CephFS StorageClass(es). Drop it next to your overlay (e.g., apps/nautobot/values-prod.yml) and reference it from your Argo CD Application (next section).
# values-prod.yml
nautobot:
  # Keep it small for the first sync; scale later.
  replicaCount: 1
  # Probes off for initial bring-up (migrations can make probes flap).
  # Once stable, consider enabling these.
  livenessProbe:
    enabled: false
  readinessProbe:
    enabled: false
  # Bootstrap superuser from our ExternalSecret-backed K8s Secret.
  superUser:
    existingSecret: "nautobot-prod-superuser-secrets"   # created by ESO
    existingSecretPasswordKey: "password"               # key in that Secret
    existingSecretApiTokenKey: "api_token"              # key in that Secret
    username: "jeff"                                    # static bootstrap username
celery:
  # One worker to start; bump if you run jobs/heavy plugins.
  replicaCount: 1
serviceAccount:
  # Leave token mounted, used for ESO/ClusterSecretStore
  automountServiceAccountToken: true
postgresql:
  # Using the chart’s built-in PostgreSQL with CephFS persistence.
  primary:
    persistence:
      enabled: true
      size: "2Gi"                          # starter size; expand later
      storageClass: "rook-cephfs-retain"   # keep data if PVC is deleted
      accessModes: ['ReadWriteOnce']       # DB should be single-writer
  auth:
    # Pull the password from the ExternalSecret-created Secret.
    existingSecret: nautobot-prod-db-secrets
redis:
  # Enable persistence if you want cache/queue data to survive restarts.
  master:
    persistence:
      enabled: true
      size: "1Gi"
      storageClass: "rook-cephfs-delete"   # okay to delete for cache data
      accessModes: ['ReadWriteOnce']
  auth:
    enabled: true
    existingSecret: nautobot-prod-db-secrets
Why these choices
- Probes disabled (initially): first runs often include migrations; turning probes off avoids noisy restarts. Re-enable once everything is healthy.
- CephFS everywhere: aligns with the storage classes you built earlier. rook-cephfs-retain for Postgres so accidental PVC deletes don't nuke data; rook-cephfs-delete for Redis because it's cache/queue data.
- ReadWriteOnce for DB/Redis: even though CephFS supports RWX, keeping databases single-writer reduces foot-guns (performance issues, data corruption, or scalability bottlenecks).
- Secrets via ESO: the existingSecret keys point at the Kubernetes Secrets materialized from Vault, so nothing sensitive lives in Git or in the Helm values.
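Once the first deploy is stable, re-enabling the probes is just another values change committed through Git; a sketch of that follow-up edit might look like:

```yaml
# values-prod.yml (follow-up commit, once the first sync is healthy)
nautobot:
  livenessProbe:
    enabled: true
  readinessProbe:
    enabled: true
```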
Rounding Out the Argo CD Application
Now that Helm (the app) and Kustomize (secrets + ingress) are defined and your custom Helm values exist, we just need to finish the Argo CD Application so it points at both sources and deploys them to the right place (below).
project: prod-home
destination:
  server: https://prod-kube-vip.jjland.local:6443
  namespace: nautobot-prod
syncPolicy:
  syncOptions:
    - CreateNamespace=true
sources:
  - repoURL: https://nautobot.github.io/helm-charts/
    chart: nautobot
    targetRevision: 2.5.5
    helm:
      valueFiles:
        - $values/apps/nautobot/values-prod.yml
  - repoURL: https://github.com/leothelyon17/kubernetes-gitops-playground.git
    path: apps/nautobot/overlays/prod
    targetRevision: HEAD
    ref: values
Copy and paste the above into the Argo CD GUI or edit the Application manually, following the same pattern as the app manifests configured in previous posts.
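If you'd rather keep the Application itself in Git than paste it into the GUI, the spec fields above drop into a full manifest like the sketch below; the metadata name and the argocd namespace are illustrative assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: nautobot-prod   # illustrative name
  namespace: argocd     # wherever Argo CD itself runs
spec:
  project: prod-home
  destination:
    server: https://prod-kube-vip.jjland.local:6443
    namespace: nautobot-prod
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
  sources:
    - repoURL: https://nautobot.github.io/helm-charts/
      chart: nautobot
      targetRevision: 2.5.5
      helm:
        valueFiles:
          - $values/apps/nautobot/values-prod.yml
    - repoURL: https://github.com/leothelyon17/kubernetes-gitops-playground.git
      path: apps/nautobot/overlays/prod
      targetRevision: HEAD
      ref: values
```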


Deploying and Syncing the App
With everything bundled via Kustomize and correctly referenced by ArgoCD, it’s time to deploy.
Open the Argo CD Application and click Sync. You should see the Helm release create a batch of Kubernetes objects. To focus on what we built in this post, look for:
- PVCs bound to your CephFS StorageClasses and mounted by the pods
- PostgreSQL and Redis pods coming up Healthy
- Secrets flow: ClusterSecretStore and ExternalSecret resources showing Synced, and the resulting Kubernetes Secrets present in the namespace
- IngressRoute created and admitted by Traefik (host matches your DNS A record)
If all of the above is green, the Argo CD app should land in Synced / Healthy. Screenshots below show an example of what you should see.
Storage


Secrets


IngressRoute/Traefik



The Application Pods

Note: It can take a little while for the app to show Healthy and become reachable. On the first deploy, once Nautobot connects to Postgres it will run initial database migrations to create tables—this adds extra time on top of the normal startup. If you’re curious, watch the nautobot-init logs for migration progress.
If everything’s green in Argo CD and the pods look steady, open the host defined in your IngressRoute. You should land on the Nautobot login page. Sign in with the superuser credentials you stored in Vault (surfaced via External Secrets and referenced in your custom Helm values). If login fails, check the logs for the nautobot-init container. On first start it runs migrations and bootstraps the superuser. You’ll see log messages confirming the account creation (not the raw secrets), which is a quick way to verify the secret wiring end to end.



If you can log in, CONGRATULATIONS! You’ve just deployed Nautobot on Kubernetes, fully managed the GitOps way.
Troubleshooting Tips
If your deployment isn’t landing cleanly, work through these quick checks, organized by the same pieces we built in this post.
1) Argo CD & Kustomize
What to look for
- App stuck in OutOfSync or Progressing.
- Sync fails immediately.
- Resources missing from the tree.
Checks
- Open the Argo CD git diff for the app: look for bad paths/filenames in kustomization.yml.
- Verify the repo folder the Application points to contains: cluster-secret-store.yml, external-secrets-db.yml, external-secret-superuser.yml, ingress-route.yml, and values-prod.yml (referenced by your Helm app).
- Confirm the file paths in the ArgoCD App manifest.
- Double-check all YAML syntax.
2) Secrets pipeline: Vault → ESO → K8s Secret
Symptoms
- ExternalSecrets show Not Synced; Nautobot init fails with missing env/creds.
Checks
- ClusterSecretStore: Server URL reachable inside the cluster? auth.kubernetes.mountPath matches your Vault auth mount? Role name matches the role you created in Vault?
- ExternalSecret: Conditions should be Ready=True; if not, describe it for a clear error (auth denied, key not found, etc.). Verify Vault paths/field names match exactly (KV v2 pathing trips people up).
- ServiceAccount binding: the SA referenced in the store exists in the right namespace, and your Vault role binds to that SA+namespace.
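KV v2 pathing is the usual tripwire here. Assuming a KV v2 mount and a secret path like nautobot/prod (both illustrative, not from this post), an ExternalSecret data entry would reference it roughly like this; note that ESO inserts the /data/ segment itself when the store is configured for v2:

```yaml
# Illustrative paths/keys -- adjust to your Vault layout.
data:
  - secretKey: password      # key written into the target K8s Secret
    remoteRef:
      key: nautobot/prod     # path under the KV v2 mount; omit the /data/
                             # segment, ESO adds it for v2 stores
      property: password     # field inside the Vault secret
```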
3) Storage: CephFS StorageClass & PVCs
Symptoms
- PVCs stuck in Pending; pods can’t mount volumes.
Checks
- StorageClass name in Helm values matches your CephFS SC (e.g., rook-cephfs-retain).
- Access modes fit usage: Postgres/Redis use ReadWriteOnce (single writer); Nautobot media/static (if used) needs ReadWriteMany.
- Rook-Ceph health: OSDs/MONs healthy, pool/FS exists, quota not exceeded.
- If a PVC is deleted but the PV persists: that's expected with reclaimPolicy: Retain; either reuse it or manually clean it up before recreating.
4) Postgres & Redis (built-in charts)
Symptoms
- DB pod CrashLoopBackOff; app can’t connect.
Checks
- Secrets: the existingSecret names line up with what the subcharts expect, and key names (password, postgres-password, redis-password) match your ExternalSecret outputs.
- Persistence: correct StorageClass; PVC bound.
- Logs: Postgres shows authentication/permission or initdb errors; Redis refuses connections or logs auth errors if auth.enabled=true.
5) Nautobot app (web/worker/beat)
Symptoms
- Web never becomes Ready, 502 via Traefik, or superuser not created.
Checks
- nautobot-init logs: confirm migrations and superuser bootstrap; errors here usually mean secret keys are missing or wrong.
- Probes: we disabled probes initially, which is good. If you enabled them early, they can flap during migrations; disable, sync, let it settle, then re-enable.
- Environment wiring: confirm the Helm values reference the K8s Secret keys you created (names and casing must match).
6) Ingress, Traefik & DNS
Symptoms
- 404/503 at the browser, TLS errors, or wrong host.
Checks
- IngressRoute: routes.match host matches your DNS A record exactly; entryPoints: ["websecure"] is set and Traefik has that entrypoint enabled.
- Traefik Service: has an EXTERNAL-IP from MetalLB; DNS A record points to it.
- If using certs later: don't reference cert-manager resources yet if you haven't set them up; keep TLS simple at Traefik.
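For comparison while debugging, a minimal IngressRoute looks roughly like the sketch below; the apiVersion varies by Traefik version, and the host and service names here are assumptions, not values from this post:

```yaml
apiVersion: traefik.io/v1alpha1   # older Traefik releases use traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: nautobot-prod
  namespace: nautobot-prod
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`nautobot.example.local`)  # must match your DNS A record exactly
      kind: Rule
      services:
        - name: nautobot-prod                # the Nautobot Service (name assumed)
          port: 80
```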
7) MetalLB (external reachability)
Symptoms
- Traefik never gets an external IP; no traffic into the cluster.
Checks
- IPAddressPool contains the IP/range; it's unused on your LAN.
- L2Advertisement exists for that pool.
- Traefik Service is type: LoadBalancer and (optionally) loadBalancerIP matches your chosen IP.
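As a sanity reference, the MetalLB side is just two small objects. The pool name and address range below are assumptions; use a free range on your own LAN:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: prod-pool                    # name assumed
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250    # must be unused on your LAN
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: prod-l2                      # name assumed
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-pool
```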
8) Resources & scheduling
Symptoms
- Pods Pending or OOMKilled.
Checks
- Nodes have capacity; Ceph/DB pods especially need memory/CPU.
- Start small (single replicas), then scale up.
- If OOMs, raise limits/requests or add memory.
9) Naming & key mismatches (sneaky but common)
What to verify
- Secret names and keys in: the ExternalSecret target Secret, and the Helm values (e.g., existingSecret, existingSecretPasswordKey, etc.).
- Namespace consistency across all manifests (nautobot-prod vs. something else).
10) Quick sanity commands (lightweight)
- Objects at a glance: kubectl -n nautobot-prod get all
- ESO health: kubectl -n nautobot-prod get externalsecret,secretstore,clustersecretstore
- PVCs: kubectl -n nautobot-prod get pvc
- Describe failures: kubectl -n nautobot-prod describe <kind> <name>
- App logs: kubectl -n nautobot-prod logs deploy/nautobot -c nautobot-init --tail=100
Ultimately, if you don't know where to start, USE THE CONTAINER LOGS. Argo CD makes them easy to pull up, and you'll usually find the issue right there.
Summary
Well, we did it. We didn't just get Nautobot running, we established a repeatable pattern for network-automation apps (or any containerized app): Argo CD for reconciliation, Kustomize for environment shaping, Vault + External Secrets for credentials, Traefik + MetalLB for reachability, and CephFS for persistence. That stack gives you a stable runway to ship changes the same way every time, through Git, without snowflakes or manual tweaks. The same method works on-prem, in the cloud, or across a mix of the two.
Why this helps your automation journey
Trustable intent: Nautobot becomes the system of record for sites, devices, IPAM, and custom models exposed via REST/GraphQL for pipelines and tools.
Safe, auditable change: Every tweak (charts, values, secrets wiring, ingress) goes through Git reviews and rolls back cleanly. Drift is visible; fixes are deterministic.
Fewer blockers: Secrets are handled with least-privilege, storage/ingress are standardized, so you can focus on workflows, not plumbing.
From dev to prod: The same pattern scales to new apps (observability, chatops, CI/CD helpers) with minimal friction. Copy the overlay, adjust values, and commit.
Where I’m going next
An advanced Nautobot deployment (plugins, app config, HTTPS/certs, SSO).
Integrations with other GitOps-deployed apps.
A NetBox deployment for folks who prefer that app. Love it too!
This is the moment where GitOps stops being theory and starts accelerating real network automation and manageable application delivery.
Thanks for reading!
Links
Bridging the Gap: GitOps for Network Engineers - Part 1 (Deploying ArgoCD)
Bridging the Gap: GitOps for Network Engineers - Part 2 (Deploying Critical Infrastructure with ArgoCD)




