Compare commits

...

5 Commits

| SHA1 | Message | Date |
|------------|--------------------------------|----------------------------|
| eb6b3108e0 | improving things | 2025-10-28 15:36:03 -03:00 |
| 868fdce461 | improve wording | 2025-10-25 22:53:28 -03:00 |
| 5d436bb632 | adding loki and fixing version | 2025-10-24 22:41:40 -03:00 |
| 7d88137084 | adding monitoring | 2025-10-21 15:53:48 -03:00 |
| 703775e224 | updating beszel | 2025-10-21 15:53:42 -03:00 |
16 changed files with 553 additions and 53 deletions

View File

@@ -2,15 +2,15 @@
**A *forever-work-in-progress* self-hosted server setup**
Based on a multi-node k3s cluster running on VMs and bare metal hardware.
Runs on a multi-node k3s cluster deployed across VMs and bare-metal hosts.
The overall application configs are stored in a NFS share inside of a SSD that was purposed specifically for this. For that I'm using `nfs-subdir-external-provisioner` as a dynamic storage provisioner with specified paths on each PVC. Some other data is stored on a NAS server with a NFS share as well.
Application configuration is stored on an NFS share located on a dedicated SSD. This uses `nfs-subdir-external-provisioner` as a dynamic storage provisioner with PVC-specific paths. Additional data is stored on a NAS exported via NFS.
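In practice each application pins its data to a fixed subdirectory on the share through a PVC annotation, presumably wired up via the provisioner's `pathPattern` parameter. A minimal sketch with placeholder names, following the same pattern the manifests later in this diff use:

```yaml
# Minimal sketch of a PVC backed by nfs-subdir-external-provisioner with a
# fixed subdirectory on the NFS share. "example-config" and "example-data"
# are placeholder names, not values from this repo.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-config
  namespace: default
  annotations:
    nfs.io/storage-path: "example-data"   # subdirectory created on the share
spec:
  storageClassName: "nfs-client"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```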
The cluster is running on `k3s` with `nginx` as the ingress controller. For load balancing I'm using `MetalLB` in layer 2 mode. I'm also using `cert-manager` for local CA and certificates (as Vaultwarden requires it).
The cluster runs `k3s` with `nginx` as the ingress controller. `MetalLB` is used in layer 2 mode for load balancing. `cert-manager` provides a local CA and issues certificates (required by Vaultwarden).
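Layer 2 mode in MetalLB comes down to an `IPAddressPool` plus an `L2Advertisement`; the repo's actual values live in `metallb-system/address-pool.yaml`, applied in SETUP.md. A minimal sketch with a placeholder address range:

```yaml
# Minimal sketch of a MetalLB layer 2 setup; the pool name and address range
# are placeholders, not the values from metallb-system/address-pool.yaml.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # placeholder LAN range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```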
For more information on setup, check out [SETUP.md](SETUP.md).
For setup details, see [SETUP.md](SETUP.md).
Also, the repository name is a reference to my local TLD which is `.haven` :)
The repository name references my local TLD, `.haven` ;)
## Namespaces
- default
@@ -27,26 +27,36 @@ Also, the repository name is a reference to my local TLD which is `.haven` :)
- AdGuardHome-2 (2nd instance)
- AdGuard-Sync
- infra
- Haven Notify (my own internal service)
- [Haven Notify](https://git.ivanch.me/ivanch/server-scripts/src/branch/main/haven-notify)
- Beszel
- Beszel Agent (running as DaemonSet)
- Code Config (vscode for internal config editing)
- Beszel Agent (running as a DaemonSet)
- Code Config (VS Code for internal config editing)
- WireGuard Easy
- dev
- Gitea Runner (x64)
- Gitea Runner (arm64)
- monitoring
- Grafana
- Prometheus
- Node Exporter
- Kube State Metrics
- Loki
- Alloy
#### Miscellaneous namespaces
- lab (A playground/sandbox namespace)
- nfs-pod (for testing and accessing NFS mounts through NFS)
- lab (a playground/sandbox namespace)
- nfs-pod (for testing and accessing NFS mounts)
- metallb-system
- MetalLB components
- cert-manager
- Cert-Manager components
- cert-manager components
## Todo:
- Move archivebox data to its own PVC on NAS
- Move uptimekuma to `infra` namespace
- Add links to each application docs
- Add links to server scripts
## Todo
- Move ArchiveBox data to its own PVC on the NAS
- Move Uptime Kuma to the infra namespace
- Add links to each application's documentation
- Add links to server scripts
- Move Alloy to the monitoring namespace
- Install Loki, Grafana, and Prometheus via Helm charts
- Configure Loki and Prometheus to use PVCs

View File

@@ -50,7 +50,7 @@ kubectl apply -f metallb-system/address-pool.yaml
## Install cert-manager
```bash
kubectl create namespace cert-manager
kubectl create ns cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.1/cert-manager.yaml
```
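
The local CA mentioned in the README typically follows cert-manager's usual bootstrap pattern: a self-signed issuer, a CA `Certificate`, and a CA-backed `ClusterIssuer`. A minimal sketch with placeholder names (the repo's actual issuer manifests may differ):

```yaml
# Minimal sketch of a cert-manager local CA; all names here are placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: haven-root-ca
  namespace: cert-manager
spec:
  isCA: true
  commonName: haven-root-ca
  secretName: haven-root-ca          # the generated CA keypair lands here
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned
    kind: ClusterIssuer
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: haven-ca
spec:
  ca:
    secretName: haven-root-ca        # issues certificates signed by the local CA
```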

View File

@@ -67,7 +67,7 @@ spec:
- port: 7575
targetPort: homarr-port
---
# 3) PersistentVolumeClaim (for /config)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -83,7 +83,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -44,7 +44,7 @@ spec:
- port: 80
targetPort: 80
---
# 3) PersistentVolumeClaim (local storage via k3s local-path)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -60,7 +60,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -49,7 +49,7 @@ spec:
- port: 8080
targetPort: searxng-port
---
# 3) PersistentVolumeClaim (for /config)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -65,7 +65,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -63,7 +63,7 @@ spec:
- port: 3001
targetPort: uptimekuma-port
---
# 3) PersistentVolumeClaim (for /config)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -79,7 +79,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -75,7 +75,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
@@ -102,7 +102,7 @@ spec:
port:
number: 80
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -4,18 +4,3 @@ kubectl create secret generic adguardhome-password \
--from-literal=password='your_adguardhome_password' \
--from-literal=username='your_adguardhome_username' -n dns
```
## Add AdGuardHome to CoreDNS configmap fallback:
1. Edit the CoreDNS configmap:
```bash
kubectl edit configmap coredns -n kube-system
```
2. Replace the `forward` line with the following:
```
forward . <ADGUARDHOME_IP> <ADGUARDHOME_IP_2>
```
This will use AdGuardHome as the primary DNS server and a secondary one as a fallback, instead of using the default Kubernetes CoreDNS server.
You may also use `/etc/resolv.conf` to forward to the node's own DNS resolver, but it depends on whether it's well configured or not. *Since it's Linux, we never know.*
Ideally, since DNS is required for fetching the container image, you would have AdGuardHome as first and then a public DNS server as second (fallback).

View File

@@ -22,7 +22,7 @@ spec:
secretKeyRef:
name: beszel-key
key: SECRET-KEY
image: henrygd/beszel-agent:0.12.10
image: henrygd/beszel-agent:0.14.1
imagePullPolicy: Always
name: beszel-agent
ports:

View File

@@ -15,15 +15,19 @@ spec:
labels:
app: beszel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/arch
operator: In
values:
- amd64
containers:
- name: beszel
image: ghcr.io/henrygd/beszel/beszel:0.12.10
image: ghcr.io/henrygd/beszel/beszel:0.14.1
imagePullPolicy: Always
env:
- name: PUID
value: "1000"
- name: PGID
value: "1000"
ports:
- containerPort: 8090
name: beszel-port
@@ -49,7 +53,7 @@ spec:
- port: 80
targetPort: beszel-port
---
# 3) PersistentVolumeClaim (for /config)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -65,7 +69,7 @@ spec:
requests:
storage: 1Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

View File

@@ -66,7 +66,7 @@ spec:
- port: 8443
targetPort: code-port
---
# 3) PersistentVolumeClaim (for /config)
# 3) PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
@@ -82,7 +82,7 @@ spec:
requests:
storage: 5Gi
---
# 4) Ingress (Traefik)
# 4) Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:

monitoring/grafana.yaml (new file, 105 lines)
View File

@@ -0,0 +1,105 @@
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: grafana
name: grafana
namespace: monitoring
spec:
selector:
matchLabels:
app: grafana
template:
metadata:
labels:
app: grafana
spec:
securityContext:
fsGroup: 472
supplementalGroups:
- 0
containers:
- name: grafana
image: grafana/grafana:latest
imagePullPolicy: Always
ports:
- containerPort: 3000
name: http-grafana
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /robots.txt
port: 3000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 30
successThreshold: 1
timeoutSeconds: 2
livenessProbe:
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 3000
timeoutSeconds: 1
resources:
requests:
cpu: 250m
memory: 750Mi
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-pv
volumes:
- name: grafana-pv
persistentVolumeClaim:
claimName: grafana-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafana-pvc
namespace: monitoring
annotations:
nfs.io/storage-path: "grafana-data"
spec:
storageClassName: "nfs-client"
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
namespace: monitoring
name: grafana
spec:
ports:
- port: 3000
protocol: TCP
targetPort: http-grafana
selector:
app: grafana
type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
namespace: monitoring
name: grafana
spec:
ingressClassName: nginx
rules:
- host: grafana.haven
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: grafana
port:
number: 3000
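
Grafana's data sources can be added through the UI (its state lives on the PVC above), but provisioned as code they would look roughly like this. A sketch only; the Loki tenant ID `haven` is a placeholder, needed because the Loki config below sets `auth_enabled: true`, which makes Loki expect an `X-Scope-OrgID` header:

```yaml
# Hypothetical /etc/grafana/provisioning/datasources/datasources.yaml;
# not part of this diff, just a sketch of how Grafana would reach the
# Prometheus and Loki services added here.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.monitoring.svc.cluster.local:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki.monitoring.svc.cluster.local:3100
    jsonData:
      httpHeaderName1: X-Scope-OrgID   # required since auth_enabled is true
    secureJsonData:
      httpHeaderValue1: haven          # placeholder tenant ID
```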

View File

@@ -0,0 +1,109 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube-state-metrics
namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: kube-state-metrics
rules:
- apiGroups: [""]
resources:
- nodes
- pods
- services
- endpoints
- namespaces
- replicationcontrollers
verbs: ["list", "watch"]
- apiGroups: ["extensions", "apps"]
resources:
- daemonsets
- deployments
- replicasets
- statefulsets
verbs: ["list", "watch"]
- apiGroups: ["batch"]
resources:
- cronjobs
- jobs
verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
resources:
- horizontalpodautoscalers
verbs: ["list", "watch"]
- apiGroups: ["policy"]
resources:
- poddisruptionbudgets
verbs: ["list", "watch"]
- apiGroups: ["storage.k8s.io"]
resources:
- storageclasses
- volumeattachments
verbs: ["list", "watch"]
- apiGroups: ["apps"]
resources:
- replicasets
verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kube-state-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kube-state-metrics
namespace: monitoring
labels:
app: kube-state-metrics
spec:
replicas: 1
selector:
matchLabels:
app: kube-state-metrics
template:
metadata:
labels:
app: kube-state-metrics
spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0
ports:
- name: http-metrics
containerPort: 8080
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 200m
memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
name: kube-state-metrics
namespace: monitoring
labels:
app: kube-state-metrics
spec:
ports:
- name: http-metrics
port: 8080
targetPort: http-metrics
selector:
app: kube-state-metrics

monitoring/loki.yaml (new file, 101 lines)
View File

@@ -0,0 +1,101 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: loki
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: loki
template:
metadata:
labels:
app: loki
spec:
containers:
- name: loki
image: grafana/loki:3
args: ["-config.file=/etc/loki/config/config.yaml"]
ports:
- containerPort: 3100
volumeMounts:
- name: config
mountPath: /etc/loki/config
- name: loki-storage
mountPath: /tmp/loki
volumes:
- name: config
configMap:
name: loki-config
- name: loki-storage
emptyDir:
medium: Memory
---
apiVersion: v1
kind: ConfigMap
metadata:
name: loki-config
namespace: monitoring
data:
config.yaml: |
auth_enabled: true
server:
http_listen_port: 3100
common:
ring:
instance_addr: 127.0.0.1
kvstore:
store: inmemory
replication_factor: 1
path_prefix: /tmp/loki
querier:
multi_tenant_queries_enabled: true
schema_config:
configs:
- from: "2024-01-01"
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
storage_config:
tsdb_shipper:
active_index_directory: /tmp/loki/index
cache_location: /tmp/loki/cache
filesystem:
directory: /tmp/loki/chunks
limits_config:
allow_structured_metadata: true
retention_period: 0
ingester:
lifecycler:
ring:
kvstore:
store: inmemory
replication_factor: 1
chunk_idle_period: 1m
max_chunk_age: 5m
chunk_target_size: 1536000
compactor:
retention_enabled: false
---
apiVersion: v1
kind: Service
metadata:
name: loki
namespace: monitoring
spec:
ports:
- port: 3100
targetPort: 3100
name: http
selector:
app: loki

View File

@@ -0,0 +1,56 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-exporter
namespace: monitoring
labels:
app: node-exporter
spec:
selector:
matchLabels:
app: node-exporter
template:
metadata:
labels:
app: node-exporter
spec:
hostNetwork: true
containers:
- name: node-exporter
image: prom/node-exporter:latest
imagePullPolicy: Always
args:
- "--path.rootfs=/host"
ports:
- containerPort: 9100
hostPort: 9100
name: metrics
protocol: TCP
resources:
requests:
memory: "50Mi"
cpu: "100m"
limits:
memory: "100Mi"
cpu: "200m"
volumeMounts:
- name: host
mountPath: /host
readOnly: true
volumes:
- name: host
hostPath:
path: /
---
apiVersion: v1
kind: Service
metadata:
name: node-exporter
namespace: monitoring
spec:
selector:
app: node-exporter
ports:
- name: metrics
port: 9100
targetPort: metrics

monitoring/prometheus.yaml (new file, 130 lines)
View File

@@ -0,0 +1,130 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus
namespace: monitoring
labels:
app: prometheus
spec:
replicas: 1
selector:
matchLabels:
app: prometheus
template:
metadata:
labels:
app: prometheus
spec:
serviceAccountName: prometheus
containers:
- name: prometheus
image: prom/prometheus:latest
args:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus"
- "--storage.tsdb.retention.time=1d"
- "--web.enable-lifecycle"
ports:
- containerPort: 9090
name: web
volumeMounts:
- name: prometheus-config-volume
mountPath: /etc/prometheus
- name: prometheus-storage
mountPath: /prometheus
resources:
requests:
memory: "500Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "500m"
volumes:
- name: prometheus-config-volume
persistentVolumeClaim:
claimName: prometheus-pvc
- name: prometheus-storage
emptyDir:
medium: Memory
sizeLimit: 256Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: prometheus-pvc
namespace: monitoring
annotations:
nfs.io/storage-path: "prometheus-config"
spec:
storageClassName: "nfs-client"
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
# Service URL - http://prometheus.monitoring.svc.cluster.local:9090
apiVersion: v1
kind: Service
metadata:
name: prometheus
namespace: monitoring
labels:
app: prometheus
spec:
ports:
- name: web
port: 9090
targetPort: web
selector:
app: prometheus
type: ClusterIP
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus
namespace: monitoring
labels:
app: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: prometheus
namespace: monitoring
labels:
app: prometheus
rules:
- apiGroups: [""]
resources:
- nodes
- nodes/proxy
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
resources:
- ingresses
verbs: ["get", "list", "watch"]
- apiGroups: ["networking.k8s.io"]
resources:
- ingresses
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus
namespace: monitoring
labels:
app: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus
subjects:
- kind: ServiceAccount
name: prometheus
namespace: monitoring
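
The Prometheus configuration itself is not in the repo; it sits on the `prometheus-pvc` volume mounted at `/etc/prometheus`. A minimal sketch of what a matching `prometheus.yml` could contain, wired to the Node Exporter and Kube State Metrics manifests above (a sketch under assumptions, not the actual file):

```yaml
# Hypothetical prometheus.yml stored on the NFS-backed prometheus-pvc;
# the real file is not part of this diff.
global:
  scrape_interval: 30s
scrape_configs:
  # Node Exporter: discovered through its Service endpoints (the RBAC above
  # grants list/watch on endpoints).
  - job_name: node-exporter
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: node-exporter
        action: keep
  # Kube State Metrics: scraped directly via its ClusterIP Service.
  - job_name: kube-state-metrics
    static_configs:
      - targets: ["kube-state-metrics.monitoring.svc.cluster.local:8080"]
```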