feat: move things around

eric
2026-03-18 17:41:10 +01:00
parent f558ab4ba9
commit 1de34c1869
22 changed files with 54 additions and 2692 deletions

415
README.md

@@ -1,414 +1,35 @@
# nix-nodeiwest
NixOS flake for NodeiWest VPS provisioning and ongoing deployment.
Employee and workstation flake for NodeiWest.
This repo currently provisions NixOS hosts with:
Server deployment moved to the sibling repo `../nix-deployment`.
- the `nodeiwest` employee helper CLI for safe provisioning
- shared base config in `modules/nixos/common.nix`
- Tailscale bootstrap via OpenBao AppRole in `modules/nixos/tailscale-init.nix`
- Home Manager profile in `modules/home.nix`
- disk partitioning via `disko`
- deployment via `colmena`
This repo now owns:
## Current Model
- shared Home Manager modules
- employee shell packages and environment variables
- workstation-side access to the `nodeiwest` helper by consuming it from `../nix-deployment`
- Employees should use `nodeiwest` as the supported provisioning interface
- New machines are installed with `nixos-anywhere`
- Ongoing changes are deployed with `colmena`
- Hosts authenticate to OpenBao as clients
- Tailscale auth keys are fetched from OpenBao namespace `it`, KV mount `kv`, path `tailscale`, field `CLIENT_SECRET`
- Public SSH must work independently of Tailscale for first access and recovery
This repo no longer owns:
## Repo Layout
- NixOS server host definitions
- Colmena deployment state
- Tailscale server bootstrap
- k3s bootstrap
- OpenBao server or Kubernetes infra manifests
```text
flake.nix
hosts/
  vps[X]/
    configuration.nix
    disko.nix
    hardware-configuration.nix
modules/
  home.nix
  helpers/
    home.nix
  nixos/
    common.nix
    tailscale-init.nix
pkgs/
  helpers/
    cli.py
    templates/
```
## Helper Consumption
## Recommended Workflow
The supported employee path is the `nodeiwest` CLI.
It is exported from the root flake as `.#nodeiwest-helper` and installed by the shared Home Manager profile. You can also run it ad hoc with:
The helper package is re-exported from the deployment repo:
```bash
nix run .#nodeiwest-helper -- --help
```
Recommended sequence for a new VPS:
### 1. Probe The Live Host
```bash
nodeiwest host probe --ip <ip>
```
This validates SSH reachability and derives the boot mode, root device, primary disk candidate, and swap facts from the live machine.
### 2. Scaffold The Host Files
Dry-run first:
```bash
nodeiwest host init --name <name> --ip <ip>
```
Write after reviewing the plan:
```bash
nodeiwest host init --name <name> --ip <ip> --apply
```
This command:
- probes the host unless you override disk or boot mode
- creates or updates `hosts/<name>/configuration.nix`
- creates or updates `hosts/<name>/disko.nix`
- creates `hosts/<name>/hardware-configuration.nix` as a placeholder if needed
- prints the exact `flake.nix` snippets still required for `nixosConfigurations` and `colmena`
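The dry-run-then-`--apply` pattern described above can be sketched as follows. This is only an illustration of the pattern, not the actual `nodeiwest host init` implementation: the function names, the placeholder file contents, and the choice to never overwrite existing files are all assumptions.

```python
from pathlib import Path

# File names come from the bullets above; everything else is hypothetical.
HOST_FILES = ("configuration.nix", "disko.nix", "hardware-configuration.nix")


def plan_host_files(repo_root: Path, name: str) -> list[tuple[str, Path]]:
    """Return (action, path) pairs without touching the filesystem."""
    host_dir = repo_root / "hosts" / name
    plan = []
    for filename in HOST_FILES:
        path = host_dir / filename
        plan.append(("update" if path.exists() else "create", path))
    return plan


def init_host(repo_root: Path, name: str, apply: bool = False) -> list[tuple[str, Path]]:
    """Print the plan; write placeholder files only when apply is True."""
    plan = plan_host_files(repo_root, name)
    for action, path in plan:
        print(f"{action}: {path}")
        # Assumption: this sketch only creates missing files and never overwrites.
        if apply and action == "create":
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("# placeholder\n")
    return plan
```

Calling `init_host(Path("."), "vps2")` only prints the plan; passing `apply=True` performs the writes, which mirrors the review-then-apply flow.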
### 3. Create The OpenBao Bootstrap Material
Dry-run first:
```bash
nodeiwest openbao init-host --name <name>
```
Apply after reviewing the policy and AppRole plan:
```bash
nodeiwest openbao init-host --name <name> --apply
```
This verifies your existing `bao` login, creates the host policy and AppRole, and writes:
- `bootstrap/var/lib/nodeiwest/openbao-approle-role-id`
- `bootstrap/var/lib/nodeiwest/openbao-approle-secret-id`
### 4. Plan Or Run The Install
```bash
nodeiwest install plan --name <name>
nodeiwest install run --name <name> --apply
```
`install plan` validates the generated host files and bootstrap files, then prints the exact `nixos-anywhere` command. `install run` re-validates, asks for confirmation, and executes that command.
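As a rough sketch of what "prints the exact `nixos-anywhere` command" could look like, the function below assembles the command as an argument list. The flags are the ones used in the manual install section of this README. The function name and defaults are hypothetical, and the real helper may build the command differently.

```python
import shlex


def nixos_anywhere_argv(name: str, target: str, bootstrap_dir: str = "./bootstrap") -> list[str]:
    """Build the install command as an argv list, avoiding shell quoting issues."""
    return [
        "nix", "run", "github:nix-community/nixos-anywhere", "--",
        "--extra-files", bootstrap_dir,
        "--copy-host-keys",
        "--generate-hardware-config", "nixos-generate-config",
        f"./hosts/{name}/hardware-configuration.nix",
        "--flake", f".#{name}",
        target,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line for review before running anything.
    print(shlex.join(nixos_anywhere_argv("vps1", "root@<ip>")))
```

Keeping the command as a list means the same value can be printed for a plan and passed unchanged to `subprocess.run` for the real install.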
### 5. Verify First Boot And Colmena Readiness
```bash
nodeiwest verify host --name <name> --ip <ip>
nodeiwest colmena plan --name <name>
```
`verify host` summarizes the first-boot OpenBao and Tailscale services over SSH. `colmena plan` confirms the deploy target or prints the exact missing host stanza.
## Manual Flow (Fallback / Advanced)
This is the underlying sequence that `nodeiwest` automates. Keep it as the fallback path for unsupported host layouts or when you intentionally want to run the raw commands yourself.
### 1. Prepare The Host Entry
Create a new directory under `hosts/<name>/` with:
- `configuration.nix`
- `disko.nix`
- `hardware-configuration.nix`
`configuration.nix` should import both `disko.nix` and `hardware-configuration.nix`.
Example:
If you import `modules/helpers/home.nix` directly, pass the deployment flake as a special arg:
```nix
{ lib, ... }:
{
  imports = [
    ./disko.nix
    ./hardware-configuration.nix
  ];
  networking.hostName = "vps1";
  networking.useDHCP = lib.mkDefault true;
  time.timeZone = "UTC";
  boot.loader.efi.canTouchEfiVariables = true;
  boot.loader.grub = {
    enable = true;
    efiSupport = true;
    device = "nodev";
  };
  nodeiwest.ssh.userCAPublicKeys = [
    "ssh-ed25519 AAAA... openbao-user-ca"
  ];
  nodeiwest.tailscale.openbao.enable = true;
  system.stateVersion = "25.05";
}
extraSpecialArgs = {
  deployment = inputs.deployment;
};
```
### 2. Add The Host To `flake.nix`
Add the host to:
- `nixosConfigurations`
- `colmena`
For `colmena`, set:
- `deployment.targetHost`
- `deployment.targetUser = "root"`
- tags as needed
## Discover Disk And Boot Facts
Before writing `disko.nix`, inspect the current VPS over SSH:
```bash
ssh root@<ip> 'lsblk -o NAME,SIZE,TYPE,MODEL,FSTYPE,PTTYPE,MOUNTPOINTS'
ssh root@<ip> 'test -d /sys/firmware/efi && echo UEFI || echo BIOS'
ssh root@<ip> 'findmnt -no SOURCE /'
ssh root@<ip> 'cat /proc/swaps'
```
Use that output to decide:
- disk device name: `/dev/sda`, `/dev/vda`, `/dev/nvme0n1`, etc.
- boot mode: UEFI or BIOS
- partition layout you want `disko` to create
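If you prefer to make the disk decision programmatically, the sketch below picks a candidate from `lsblk` JSON output. It assumes the data was produced with `lsblk --json --bytes --output NAME,SIZE,TYPE`. The "largest device of type `disk`" heuristic is an assumption for illustration and is not necessarily what `nodeiwest host probe` does.

```python
import json


def pick_primary_disk(lsblk_json: str) -> str:
    """Return /dev/<name> of the largest top-level device whose type is 'disk'."""
    devices = json.loads(lsblk_json)["blockdevices"]
    disks = [d for d in devices if d.get("type") == "disk"]
    if not disks:
        raise ValueError("no disk devices found in lsblk output")
    # int() accepts both numeric and string sizes, which differ across lsblk versions.
    largest = max(disks, key=lambda d: int(d["size"]))
    return f"/dev/{largest['name']}"
```

Treat the result as a candidate only, and confirm it against `findmnt -no SOURCE /` before writing `disko.nix`.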
`hosts/vps1/disko.nix` currently assumes:
- GPT
- `/dev/sda`
- UEFI
- ext4 root
- swap partition
Do not run the install until you have checked these assumptions against the target host.
## Generate `hardware-configuration.nix`
`hardware-configuration.nix` is generated during install with `nixos-anywhere`.
The repo path is passed directly to the install command:
```bash
--generate-hardware-config nixos-generate-config ./hosts/<name>/hardware-configuration.nix
```
That generated file should remain tracked in Git after install.
## OpenBao Setup For Tailscale
Each host gets its own AppRole.
The host uses:
- OpenBao address: `https://secrets.api.nodeiwest.se`
- namespace: `it`
- KV mount: `kv`
- auth mount: `auth/approle`
- secret path: `tailscale`
- field: `CLIENT_SECRET`
The host stores:
- `/var/lib/nodeiwest/openbao-approle-role-id`
- `/var/lib/nodeiwest/openbao-approle-secret-id`
The rendered Tailscale auth key lives at:
- `/run/nodeiwest/tailscale-auth-key`
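To make the parameters above concrete, the sketch below shows how they would map onto the two HTTP requests involved: an AppRole login, then a KV read. It assumes the Vault-compatible HTTP API and a KV v2 mount (as in the policy example in this README). It only builds request descriptions and performs no network I/O. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaoTarget:
    # Defaults are the values listed in this README.
    addr: str = "https://secrets.api.nodeiwest.se"
    namespace: str = "it"
    kv_mount: str = "kv"
    auth_mount: str = "auth/approle"
    secret_path: str = "tailscale"
    field: str = "CLIENT_SECRET"


def login_request(t: BaoTarget, role_id: str, secret_id: str) -> dict:
    """Describe the AppRole login call that exchanges credentials for a token."""
    return {
        "method": "POST",
        "url": f"{t.addr}/v1/{t.auth_mount}/login",
        "headers": {"X-Vault-Namespace": t.namespace},
        "json": {"role_id": role_id, "secret_id": secret_id},
    }


def read_secret_request(t: BaoTarget, token: str) -> dict:
    """Describe the KV v2 read; note the extra 'data/' segment in the path."""
    return {
        "method": "GET",
        "url": f"{t.addr}/v1/{t.kv_mount}/data/{t.secret_path}",
        "headers": {"X-Vault-Namespace": t.namespace, "X-Vault-Token": token},
    }


def extract_field(t: BaoTarget, response_body: dict) -> str:
    """KV v2 nests the secret payload under data.data."""
    return response_body["data"]["data"][t.field]
```

The `data/` segment in the read URL is the same one that appears in the policy path `kv/data/tailscale`, which is why a policy written for `kv/tailscale` would not match.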
### Create A Policy
Create a minimal read-only policy for the Tailscale secret.
If the secret is accessible as:
```bash
BAO_NAMESPACE=it bao kv get -mount=kv tailscale
```
then create the matching read policy for that mount.
Example shape for the KV v2 mount `kv`:
```hcl
path "kv/data/tailscale" {
  capabilities = ["read"]
}
```
Write it from your machine:
```bash
export BAO_ADDR=https://secrets.api.nodeiwest.se
export BAO_NAMESPACE=it
bao policy write tailscale-vps1 ./tailscale-vps1-policy.hcl
```
Adjust the path to match your actual OpenBao KV mount.
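A small helper can derive the policy text from the mount and secret path so that the `data/` segment required by KV v2 is not forgotten. This is a sketch under the assumption that the mount is KV v2; the function name is hypothetical.

```python
def kv2_read_policy(mount: str, secret_path: str) -> str:
    """Render a read-only policy for one secret on a KV v2 mount."""
    mount = mount.strip("/")
    secret_path = secret_path.strip("/")
    return (
        f'path "{mount}/data/{secret_path}" {{\n'
        '  capabilities = ["read"]\n'
        "}\n"
    )


if __name__ == "__main__":
    # Prints the same policy as the HCL example above.
    print(kv2_read_policy("kv", "tailscale"), end="")
```

The output can be redirected to a file such as `./tailscale-vps1-policy.hcl` and passed to `bao policy write`.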
### Create The AppRole
Create one AppRole per host.
Example for `vps1`:
```bash
bao write auth/approle/role/tailscale-vps1 \
  token_policies=tailscale-vps1 \
  token_ttl=1h \
  token_max_ttl=24h \
  token_num_uses=0 \
  secret_id_num_uses=0
```
### Generate Bootstrap Credentials
Create a temporary bootstrap directory on your machine:
```bash
mkdir -p bootstrap/var/lib/nodeiwest
```
Write the AppRole credentials into it:
```bash
bao read -field=role_id auth/approle/role/tailscale-vps1/role-id \
  > bootstrap/var/lib/nodeiwest/openbao-approle-role-id
bao write -f -field=secret_id auth/approle/role/tailscale-vps1/secret-id \
  > bootstrap/var/lib/nodeiwest/openbao-approle-secret-id
chmod 0400 bootstrap/var/lib/nodeiwest/openbao-approle-role-id
chmod 0400 bootstrap/var/lib/nodeiwest/openbao-approle-secret-id
```
These files are install-time bootstrap material. They are not stored in Git.
## Install With `nixos-anywhere`
Install from your machine:
```bash
nix run github:nix-community/nixos-anywhere -- \
  --extra-files ./bootstrap \
  --copy-host-keys \
  --generate-hardware-config nixos-generate-config ./hosts/vps1/hardware-configuration.nix \
  --flake .#vps1 \
  root@100.101.167.118
```
What this does:
- wipes the target disk according to `hosts/vps1/disko.nix`
- installs NixOS with `.#vps1`
- copies the AppRole bootstrap files into `/var/lib/nodeiwest`
- generates `hosts/vps1/hardware-configuration.nix`
Important:
- this destroys the existing OS on the target
- take provider snapshots and application backups first
- the target SSH host keys may change after install
## First Boot Behavior
On first boot:
1. `vault-agent-tailscale.service` starts using `pkgs.openbao`
2. it authenticates to OpenBao with AppRole
3. it renders `CLIENT_SECRET` from namespace `it`, KV mount `kv`, path `tailscale` to `/run/nodeiwest/tailscale-auth-key`
4. `nodeiwest-tailscale-authkey-ready.service` waits until that file exists
5. `tailscaled-autoconnect.service` uses that file and runs `tailscale up --ssh`
Public SSH remains the recovery path if OpenBao or Tailscale bootstrap fails.
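Step 4 is a readiness gate: something has to wait until the rendered key file is present before Tailscale starts. The implementation of `nodeiwest-tailscale-authkey-ready.service` is not shown here, so the following is only a sketch of such a gate. The function name, the timeout values, and the stricter "non-empty" check are assumptions.

```python
import time
from pathlib import Path


def wait_for_nonempty_file(path: Path, timeout_s: float = 60.0, interval_s: float = 0.5) -> bool:
    """Poll until path exists and has content; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if path.stat().st_size > 0:
                return True
        except FileNotFoundError:
            pass  # Not rendered yet; keep waiting.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_for_nonempty_file(Path("/run/nodeiwest/tailscale-auth-key"))` returns `True` as soon as the agent has rendered the key, and `False` if nothing appears within the timeout.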
## Verify After Install
SSH to the host over the public IP first.
Check:
```bash
systemctl status vault-agent-tailscale
systemctl status nodeiwest-tailscale-authkey-ready
systemctl status tailscaled-autoconnect
ls -l /var/lib/nodeiwest
ls -l /run/nodeiwest/tailscale-auth-key
tailscale status
```
If Tailscale bootstrap fails, inspect logs:
```bash
journalctl -u vault-agent-tailscale -b
journalctl -u nodeiwest-tailscale-authkey-ready -b
journalctl -u tailscaled-autoconnect -b
```
Typical causes:
- wrong AppRole credentials
- wrong OpenBao policy
- wrong secret path
- wrong KV mount path
- `CLIENT_SECRET` field missing in the secret
## Deploy Changes After Install
Once the host is installed and reachable, use Colmena:
```bash
nix run .#colmena -- apply --on vps1
```
## Rotating The AppRole SecretID
To rotate the machine credential:
1. generate a new `secret_id` from your machine
2. replace `/var/lib/nodeiwest/openbao-approle-secret-id` on the host
3. restart the agent
Example:
```bash
bao write -f -field=secret_id auth/approle/role/tailscale-vps1/secret-id > new-secret-id
scp new-secret-id root@100.101.167.118:/var/lib/nodeiwest/openbao-approle-secret-id
ssh root@100.101.167.118 'chmod 0400 /var/lib/nodeiwest/openbao-approle-secret-id && systemctl restart vault-agent-tailscale tailscaled-autoconnect'
rm -f new-secret-id
```
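If the replacement is scripted on the host rather than done with `scp`, one option is to write the new SecretID to a temporary file in the same directory, restrict its mode, and rename it into place. That way the credential file is never partially written or readable with wider permissions. This is a hypothetical helper, not part of the `nodeiwest` CLI.

```python
import os
import tempfile


def write_secret_atomically(dest: str, secret: str, mode: int = 0o400) -> None:
    """Replace dest with secret using a same-directory temp file and rename."""
    directory = os.path.dirname(os.path.abspath(dest))
    # mkstemp creates the file readable and writable by the owner only.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".secret-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(secret.strip())
        os.chmod(tmp_path, mode)
        # Rename within one filesystem is atomic on POSIX systems.
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

After the file is replaced, the agent still has to be restarted as in the example above so that it picks up the new credential.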
## Recovery Notes
- Tailscale is additive. It should not be your only access path.
- Public SSH on port `22` must remain available for first access and recovery.
- OpenBao SSH CA auth is separate from Tailscale bootstrap.
- If a machine fails to join the tailnet, recover via public SSH or provider console.