ADR 0001: Document Cluster Access Without Storing Secrets
Date: 2026-05-07
Status: Accepted
Context
The workspace needs a lightweight record of available lab clusters and how to access them through dl385.
dl385 is the jump host for this environment. OpenShift and RKE2 cluster operations are expected to be run from dl385, not directly from a local laptop.
Access was verified for:
| Platform | Clusters | CLI |
|---|---|---|
| OpenShift | hub-dc, hub-dr, spoke-dc, spoke-dr | oc |
| RKE2 | rke2, rke2-dr | kubectl |
The kubeconfig files live on dl385 under ~/.kube/configs/. These files may contain sensitive credentials and must not be copied into the workspace.
Decision
Document cluster access in INFRASTRUCTURE.md, including:
- `dl385` as the required jump host.
- CLI paths and verified access checks.
- Cluster aliases, kubeconfig paths, API servers, and example commands.
- A clear warning not to store kubeconfig contents, tokens, or certificates in the repository.
Keep agent operating guidance in AGENTS.md so future edits preserve the same safety and documentation conventions.
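The warning against storing kubeconfig contents, tokens, or certificates could be backed by a simple scan before committing. The following is a minimal sketch, not part of the recorded decision: the function name `scan_for_secrets` and the three patterns are assumptions chosen for illustration, and they catch only the most obvious kubeconfig and PEM markers.

```shell
# Sketch: flag files that look like they contain kubeconfig credentials
# or PEM private keys. The patterns are illustrative assumptions, not a
# complete secret detector.
scan_for_secrets() {
    # $1: directory to scan (defaults to the current directory)
    dir="${1:-.}"
    # -r recurse, -I skip binary files, -l list matching files, -E extended regex
    if grep -rIlE --exclude-dir=.git \
        -e 'client-key-data:' \
        -e 'client-certificate-data:' \
        -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
        "$dir"; then
        echo "Possible credentials found in the files listed above." >&2
        return 1
    fi
    return 0
}
```

Running `scan_for_secrets .` from the repository root returns non-zero when a match is found. If the script itself is stored in the repository it will match its own patterns, so it would need to live outside the scanned tree or be excluded.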
Consequences
- Operators have a quick reference for cluster access without opening kubeconfig files.
- Future agents have clear rules for safe verification and documentation updates.
- The repository remains safe to share because it records only operational metadata and command patterns, not credentials.
Follow-Up
If the workspace grows, split ADRs into docs/adr/ and keep this decision as docs/adr/0001-document-cluster-access-without-storing-secrets.md.
ADR 0002: Keep an Aggressive Changelog
Date: 2026-05-07
Status: Accepted
Context
The wiki now contains infrastructure access details, deployment automation, and operating guidance. Small documentation changes can have operational impact because people may use this repository as a source of truth for cluster access and deployment state.
The user requested aggressive change tracking.
Decision
Maintain CHANGELOG.md as a required, newest-first history of meaningful repository changes.
Every meaningful tracked-file change must include:
- Intent.
- Changed files.
- Operational or deployment impact.
- Verification performed.
- Security notes when tokens, credentials, kubeconfigs, cluster access, or deployment credentials are relevant.
The changelog must be updated in the same commit as the change it records whenever practical.
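One way to encourage the same-commit rule is a check that a pre-commit hook could run. This is a sketch under assumptions: the function name `changelog_included` is hypothetical, and the ADR does not say that any hook is actually installed. It only checks that `CHANGELOG.md` is among the changed paths, not that the entry contains the required fields.

```shell
# Sketch of a same-commit check. Reads changed paths on stdin, one per line.
changelog_included() {
    changed="$(cat)"
    # Nothing changed: nothing to record.
    [ -z "$changed" ] && return 0
    # Pass when CHANGELOG.md is among the changed paths (-x whole line, -F literal).
    if printf '%s\n' "$changed" | grep -qxF 'CHANGELOG.md'; then
        return 0
    fi
    echo "CHANGELOG.md is not part of this change." >&2
    return 1
}
```

A hook could feed it the staged paths with `git diff --cached --name-only | changelog_included`. Because the rule applies "whenever practical", a warning may fit better than a hard failure.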
Consequences
- Future work has a clear audit trail.
- Operators can quickly understand what changed, why it changed, and how it was verified.
- Changes take a little longer because documentation is part of the definition of done.
ADR 0003: Track Durable Project Memory In Git
Date: 2026-05-07
Status: Accepted
Context
The user wants project memory to survive laptop changes. The repository already tracks infrastructure documentation, deployment automation, ADRs, and a changelog, but durable handoff context was spread across multiple files and conversation history.
Some project context is safe to store in git, such as repo URLs, deployment project names, workflow files, bootstrap steps, and references to where secrets live. Secret values themselves are not safe to store in git.
Decision
Maintain MEMORY.md as the tracked durable memory file for the project.
MEMORY.md should record:
- Repository and deployment URLs.
- Current source-of-truth files.
- Safe infrastructure access patterns.
- Deployment setup and required secret names.
- Local bootstrap steps for a new laptop.
- Long-lived conventions that future agents should preserve.
MEMORY.md must not record:
- Token values.
- Kubeconfig contents.
- Client certificates.
- Private keys.
- Passwords.
Consequences
- A new laptop can recover project context by cloning the repo.
- Future agents have one place to refresh durable context before making changes.
- Secrets still need to be recreated through secure systems outside git.
ADR 0004: Use dl385 As The Cluster Jump Host
Date: 2026-05-07
Status: Accepted
Context
OpenShift and RKE2 cluster access is available from dl385. The user clarified that dl385 is specifically the jump host for cluster operations.
The local laptop should not be treated as the direct execution environment for OpenShift or RKE2 access. It may hold this repository and local documentation tooling, but cluster commands should be run after SSHing to dl385.
Decision
Use dl385 as the required jump host for cluster access.
Operational pattern:
ssh ze@dl385
From dl385:
- Use `ocpctx <cluster>` and `oc` for OpenShift clusters.
- Use `kubectl --kubeconfig ~/.kube/configs/<cluster>.kubeconfig` for RKE2 clusters.
Do not assume kubeconfigs or cluster network access are available from a replacement laptop.
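For one-off RKE2 commands, the pattern above can be wrapped in a small laptop-side helper that always goes through the jump host. This is a sketch only: the function name `rke2_kubectl` and the `JUMP_SSH` override are hypothetical, while the SSH target, cluster names, and kubeconfig path come from this document. OpenShift is left out because the behavior of `ocpctx` is not described here.

```shell
# Sketch: run kubectl for an RKE2 cluster on dl385 instead of locally.
# JUMP_SSH is a hypothetical override, useful for dry runs (JUMP_SSH=echo).
rke2_kubectl() {
    cluster="$1"
    # Only the RKE2 clusters listed in this document are accepted.
    case "$cluster" in
        rke2|rke2-dr) ;;
        *) echo "unknown RKE2 cluster: $cluster" >&2; return 2 ;;
    esac
    shift
    # Double quotes keep ~ literal locally, so it expands on dl385.
    # "$*" joins arguments with spaces; arguments that contain spaces
    # would need extra quoting for the remote shell.
    remote="kubectl --kubeconfig ~/.kube/configs/${cluster}.kubeconfig $*"
    ${JUMP_SSH:-ssh} ze@dl385 "$remote"
}
```

For example, `rke2_kubectl rke2-dr get nodes` would run the `kubectl` command on `dl385`, and no kubeconfig ever needs to exist on the laptop.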
Consequences
- New laptops only need the repo, SSH access to `dl385`, and local credentials for GitHub or Cloudflare workflows.
- Cluster operational docs should describe commands as running on `dl385`.
- Any future cluster verification should record that the verification was performed from `dl385`.
ADR 0005: Use Vault Kubernetes Auth For RKE2 Vault Replication Export
Date: 2026-05-07
Status: Accepted
Context
The RKE2 DC Vault replication export CronJob had been authenticating with a static VAULT_TOKEN stored in Kubernetes Secret vault/vault-replicator.
That token expired or was otherwise invalidated, which caused export jobs to fail with:
403 permission denied
invalid token
The cluster already had a Vault kubernetes/ auth mount and a dedicated Kubernetes ServiceAccount for the replication job.
Decision
Move the DC Vault replication export CronJob to Vault Kubernetes auth.
Implementation shape:
- Keep Kubernetes ServiceAccount `vault/vault-replicator`.
- Enable ServiceAccount token mounting for that pod identity.
- Configure the export CronJob to log in through Vault auth path `kubernetes/`.
- Bind Vault role `vault-replicator-export` to ServiceAccount `vault-replicator` in namespace `vault`.
- Keep MinIO credentials and `VAULT_ADDR` in Kubernetes Secret `vault/vault-replicator`.
- Remove the static `VAULT_TOKEN` key from the live Secret.
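The login step in the list above can be sketched against Vault's HTTP API. This illustrates the mechanism and is not the CronJob's actual script, which is not shown in this ADR: the function names are hypothetical, and using `curl` rather than the `vault` CLI is an assumption. The token path is the standard location where Kubernetes mounts a ServiceAccount token.

```shell
# Sketch: build the JSON body for Vault's Kubernetes auth login endpoint.
vault_k8s_login_payload() {
    role="$1"
    # Standard in-pod location of the mounted ServiceAccount token.
    token_file="${2:-/var/run/secrets/kubernetes.io/serviceaccount/token}"
    jwt="$(cat "$token_file")" || return 1
    # A JWT contains only base64url characters and dots, so no JSON
    # escaping is needed.
    printf '{"role":"%s","jwt":"%s"}\n' "$role" "$jwt"
}

# Sketch: log in through the kubernetes/ auth mount and print Vault's JSON
# response; the client token is under .auth.client_token.
# Requires VAULT_ADDR to be set and curl to be available.
vault_k8s_login() {
    vault_k8s_login_payload "$1" "$2" |
        curl --silent --fail --request POST --data @- \
            "${VAULT_ADDR}/v1/auth/kubernetes/login"
}
```

Inside the pod, `vault_k8s_login vault-replicator-export` would exchange the ServiceAccount token for a short-lived Vault token, which is why the static `VAULT_TOKEN` key is no longer needed.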
The GitOps source of truth for the workload manifests is:
http://30.30.30.5/infra/gitops-rke2.git
clusters/dc/manifests/vault-replication
Consequences
- Vault replication export no longer depends on rotating a static Vault token by hand.
- The export job now authenticates with the pod's Kubernetes identity and a Vault role binding.
- Vault auth mount configuration remains a live Vault concern and is not fully represented in the GitOps repo.
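Because the role binding lives in Vault rather than in GitOps, it may help to keep the command that would recreate it written down. The sketch below is an assumption about how that binding could be expressed with the `vault` CLI. The role, ServiceAccount, and namespace names come from this ADR. The function name, the `VAULT_CLI` override, and the policy argument are hypothetical, since the policy attached to the role is not recorded here.

```shell
# Sketch: bind the Vault role to the replication ServiceAccount.
# VAULT_CLI is a hypothetical override for dry runs (VAULT_CLI=echo).
bind_vault_role() {
    # $1: name of the Vault policy to attach (not specified in this ADR)
    policy="$1"
    [ -n "$policy" ] || { echo "usage: bind_vault_role <policy>" >&2; return 2; }
    ${VAULT_CLI:-vault} write auth/kubernetes/role/vault-replicator-export \
        bound_service_account_names=vault-replicator \
        bound_service_account_namespaces=vault \
        token_policies="$policy"
}
```

Running it with `VAULT_CLI=echo` prints the arguments without contacting Vault, which makes the intended binding reviewable before anyone applies it.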