STEP 12 – MONITORING & HEALTH CHECKS (MINIMAL)
Date: 2026-02-22
Status: COMPLETED

=====================================================================
1. GOAL
=====================================================================

Minimal proof of operation:
  - Services are alive (health checks)
  - Metrics are derived from existing logs (no external service)
  - The integrity check can be automated
  - Scheduling is documented

No patient data is collected.

=====================================================================
2. HEALTH-CHECK ENDPOINTS
=====================================================================

All services now return the following under /health:

  {
    "status": "ok",
    "version": "<app-version>",
    "uptime_s": <seconds>,
    "tls": true/false
  }

a) backend_main.py
   URL:    https://127.0.0.1:8000/health
   Before: {"ok": true}
   After:  {"status": "ok", "version": "0.1.0", "uptime_s": ..., "tls": ...}

b) transcribe_server.py
   URL:    https://127.0.0.1:8090/health
   Before: {"status": "ok"}
   After:  {"status": "ok", "version": "0.1.0", "uptime_s": ..., "tls": ...}

c) todo_server.py
   URL:    https://127.0.0.1:5111/health
   Before: No /health endpoint
   After:  {"status": "ok", "version": "1.0.0", "uptime_s": ..., "tls": ...}

d) Desktop app (basis14.py)
   No HTTP endpoint (desktop app without its own server).
   Proof of operation via the audit log:
   - APP_START / APP_STOP events
   - LOGIN_OK / LOGIN_FAIL events

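The response format above can be assembled roughly as follows. This is a minimal sketch, not the code from the three servers: the function name build_health_payload and its parameters are assumptions made for illustration, and the web framework wiring is left out on purpose.

```python
import time


def build_health_payload(version, start_time, tls_enabled, now=None):
    """Assemble a /health body with the four documented fields.

    `start_time` is the process start as a Unix timestamp (hypothetical
    parameter); `now` can be injected to make the function testable.
    """
    current = time.time() if now is None else now
    return {
        "status": "ok",
        "version": version,
        # Whole seconds since start; clamped so a clock jump backwards
        # never produces a negative uptime.
        "uptime_s": max(0, int(current - start_time)),
        "tls": bool(tls_enabled),
    }
```

A server would typically record the start timestamp once at import time and call the function from its /health handler, e.g. build_health_payload("0.1.0", start_time, tls_enabled=True).
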
=====================================================================
3. MONITORING METRICS (aza_monitoring.py)
=====================================================================

Sources: existing local logs and files only. No cloud service.

a) Audit-log metrics (from aza_audit_log.py):
   - Number of LOGIN_FAIL events
   - Number of AI_CHAT + AI_TRANSCRIBE events (AI call counter)
   - Number of AI_BLOCKED events
   - Number of 2FA_FAIL events
   - Integrity status (PASS/FAIL)

b) Consent-log metrics (from aza_consent.py):
   - Number of entries
   - Integrity status

c) Backup metrics (from the backups/ directory):
   - Number of existing backups
   - Latest backup (name + timestamp)

d) Alert severity levels:
   - INFO:     Counters that require no action
   - WARN:     Elevated values (e.g. > 0 login_fail)
   - HIGH:     Critical thresholds (e.g. >= 10 login_fail)
   - CRITICAL: Integrity violation

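The severity levels under d) can be sketched as a small classification step over the event counters. The two login_fail thresholds are taken from the examples above; the function names and the idea of passing event types as a plain list are assumptions for illustration, since the actual log format of aza_audit_log.py is not shown here.

```python
from collections import Counter

# Example threshold from the list above: >= 10 failed logins is HIGH.
LOGIN_FAIL_HIGH = 10


def count_events(event_types):
    """Count occurrences per event type, e.g. ["LOGIN_FAIL", "AI_CHAT"]."""
    return Counter(event_types)


def login_fail_severity(count):
    """Map a LOGIN_FAIL counter to INFO / WARN / HIGH."""
    if count >= LOGIN_FAIL_HIGH:
        return "HIGH"
    if count > 0:
        return "WARN"
    return "INFO"


def overall_severity(counts, integrity_ok):
    """An integrity violation always wins and is reported as CRITICAL."""
    if not integrity_ok:
        return "CRITICAL"
    return login_fail_severity(counts.get("LOGIN_FAIL", 0))
```

Only event type names are counted, so no log payload needs to be read for this step.
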
=====================================================================
4. INTEGRITY CHECKS
=====================================================================

Automated check via:
  python aza_monitoring.py integrity

Checks:
  - aza_audit_log: verify_all_rotations() (SHA-256 hash chain)
  - aza_consent_log: verify_chain_integrity() (SHA-256 hash chain)

On failure:
  - A clear log entry (INTEGRITY_FAIL event written to the audit log)
  - Exit code 1 (for scheduler alerting)

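For readers unfamiliar with hash chains, the principle behind such a check can be sketched as follows. This is a generic illustration under stated assumptions, not the implementation of verify_all_rotations() or verify_chain_integrity(): the record layout (payload, prev_hash, hash), the genesis value and all function names are hypothetical.

```python
import hashlib

# Hypothetical start value for the first entry of a chain.
GENESIS = "0" * 64


def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash concatenated with the payload."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a record that is linked to its predecessor."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev_hash": prev, "hash": entry_hash(prev, payload)}
    )


def verify_chain(chain):
    """Return (True, None) if intact, else (False, index of first bad entry)."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        expected = entry_hash(prev, entry["payload"])
        if entry["prev_hash"] != prev or entry["hash"] != expected:
            return False, index
        prev = entry["hash"]
    return True, None


def exit_code(chain):
    """0 on PASS, 1 on FAIL, matching the exit-code convention above."""
    ok, _ = verify_chain(chain)
    return 0 if ok else 1
```

Because each hash covers its predecessor, editing one entry without recomputing every later hash is detected at the edited position.
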
=====================================================================
5. CLI COMMANDS
=====================================================================

  python aza_monitoring.py health      Health checks of all services
  python aza_monitoring.py metrics     Metrics from logs
  python aza_monitoring.py integrity   Integrity check (exit 0/1)
  python aza_monitoring.py alerts      Security alerts
  python aza_monitoring.py nightly     Nightly full check + JSON report
  python aza_monitoring.py all         Show everything

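A command layout like the one above could be wired with argparse roughly as follows. This is a sketch under assumptions: how aza_monitoring.py actually parses its arguments is not documented here, and the handlers mapping (command name to a callable returning an exit code) is a hypothetical design chosen for illustration.

```python
import argparse

# Command names and descriptions as listed above.
COMMANDS = {
    "health": "Health checks of all services",
    "metrics": "Metrics from logs",
    "integrity": "Integrity check (exit 0/1)",
    "alerts": "Security alerts",
    "nightly": "Nightly full check + JSON report",
    "all": "Show everything",
}


def build_parser():
    """Create a parser that accepts exactly one of the known commands."""
    parser = argparse.ArgumentParser(prog="aza_monitoring.py")
    parser.add_argument(
        "command",
        choices=sorted(COMMANDS),
        help="one of: " + ", ".join(sorted(COMMANDS)),
    )
    return parser


def main(argv, handlers):
    """Dispatch to the handler for the chosen command; return its exit code."""
    args = build_parser().parse_args(argv)
    return int(handlers[args.command]())
```

A script entry point would then end with something like sys.exit(main(sys.argv[1:], handlers)), so that a failing integrity handler surfaces as exit code 1 to cron or the Task Scheduler.
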
=====================================================================
6. SCHEDULING EXAMPLES
=====================================================================

a) Linux (cron):

  # Nightly full check at 02:00
  0 2 * * * cd /path/to/aza && python aza_monitoring.py nightly >> /var/log/aza_monitoring.log 2>&1

  # Hourly health check
  0 * * * * cd /path/to/aza && python aza_monitoring.py health >> /var/log/aza_health.log 2>&1

  # Integrity check every 6 hours
  0 */6 * * * cd /path/to/aza && python aza_monitoring.py integrity || echo "INTEGRITY FAIL" | mail admin@example.com

b) Windows (Task Scheduler):

  Action:    Start a program
  Program:   python
  Arguments: aza_monitoring.py nightly
  Start in:  C:\Users\surov\Documents\AZA\backup 19.2.26

  Trigger:   Daily, 02:00

  Alternatively via PowerShell:

  $action = New-ScheduledTaskAction `
      -Execute "python" `
      -Argument "aza_monitoring.py nightly" `
      -WorkingDirectory "C:\Users\surov\Documents\AZA\backup 19.2.26"
  $trigger = New-ScheduledTaskTrigger -Daily -At "02:00"
  Register-ScheduledTask -TaskName "AZA Nightly Monitor" `
      -Action $action -Trigger $trigger -Description "AZA Nightly Monitoring"

=====================================================================
7. PRIVACY NOTE (DATA MINIMIZATION)
=====================================================================

The monitoring does NOT collect:
  - Patient names or patient data
  - Transcript contents or prompts
  - Passwords or API keys
  - AI responses

Only counters and status information are collected:
  - Number of events per type
  - PASS/FAIL status
  - File sizes and timestamps
  - Service version and uptime

Health-check responses contain technical metadata only.

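One way to guard this rule automatically is to scan a report for key names that suggest sensitive content before it is written. The following is only a sketch: the list of forbidden name fragments and the nested dict/list report shape are assumptions, and a key-name scan cannot prove that values are harmless.

```python
# Hypothetical key-name fragments that should never appear in a report.
FORBIDDEN_FRAGMENTS = ("patient", "transcript", "prompt", "password", "api_key")


def find_forbidden_keys(report, path=""):
    """Return the paths of all keys whose name contains a forbidden fragment."""
    hits = []
    if isinstance(report, dict):
        for key, value in report.items():
            key_path = f"{path}.{key}" if path else str(key)
            if any(fragment in str(key).lower() for fragment in FORBIDDEN_FRAGMENTS):
                hits.append(key_path)
            # Descend so nested structures are checked as well.
            hits.extend(find_forbidden_keys(value, key_path))
    elif isinstance(report, list):
        for index, item in enumerate(report):
            hits.extend(find_forbidden_keys(item, f"{path}[{index}]"))
    return hits
```

An empty result means no suspicious key names were found; a non-empty result lists where to look, for example "debug[0].api_key".
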
=====================================================================
8. TEST RESULT
=====================================================================

Test script: _test_monitoring.py
27 tests, 0 failures.

Tests:
  1. Metrics: audit-log entries, integrity, backup count
  2. Alerts: login_fail, ai_calls_total, ai_blocked, 2fa_fail detected
  3. Integrity check: PASS for an intact log
  4. Tampering: FAIL for a manipulated log, PASS after restore
  5. Nightly report: JSON structure correct, file written
  6. Data minimization: no passwords/keys/transcripts
  7. Health-check format: _APP_VERSION and _START_TIME present

=====================================================================
9. AFFECTED FILES
=====================================================================

Changed:
  - backend_main.py        (/health extended: version, uptime_s, tls)
  - transcribe_server.py   (/health extended: version, uptime_s, tls)
  - todo_server.py         (/health newly added)

New:
  - aza_monitoring.py      (monitoring, metrics, integrity, CLI)
  - _test_monitoring.py    (proof script)

Not changed:
  - basis14.py
  - aza_audit_log.py       (unchanged, read only)
  - aza_consent.py         (unchanged, read only)

=====================================================================
10. RISKS
=====================================================================

- Health checks are unauthenticated.
  Risk: Low (status information only, no sensitive data).
  Mitigation: Protect them behind an API token if needed.

- Monitoring runs locally; there is no external alerting.
  Risk: Alerts are stored only in the JSON report.
  Mitigation: Forward the nightly report by mail (recommended).

- No real-time monitoring.
  Risk: Intermittent outages are detected only at the next
        scheduled check.
  Mitigation: Shorten the health-check interval (e.g. every 5 minutes).

=====================================================================
11. OPEN ITEMS
=====================================================================

None.