Hospital AKI Detection System

Kubernetes-based real-time acute kidney injury detection from live HL7 laboratory streams with pager alerts and fault-tolerant inference.

GitHub Link | Python, Kubernetes, Docker, HL7 / Kafka, Prometheus / Grafana

A production-grade ML system that processes streaming hospital lab results, runs inference in real time, and dispatches clinical pager alerts when AKI risk is detected.
Built to be fault-tolerant and validated with chaos-engineering-style failure injection: network partitions, pod evictions, node failures, database restarts, and message backlogs. The system sustained 100% uptime over a two-week evaluation and exceeded the NHS baseline for AKI detection with an F3-score of 99.9%.
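
For readers unfamiliar with the metric: the F3-score is the F-beta score with beta = 3, which weights recall (missed AKI cases) far more heavily than precision (false alarms). The sketch below computes it from confusion counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the project's evaluation data.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 3.0) -> float:
    """F-beta from confusion counts; beta > 1 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return (1 + b2) * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not the project's results.
    balanced = fbeta_score(tp=90, fp=10, fn=10)
    more_false_alarms = fbeta_score(tp=90, fp=30, fn=10)
    more_missed_cases = fbeta_score(tp=90, fp=10, fn=30)
    print(balanced, more_false_alarms, more_missed_cases)
```

With beta = 3, adding 20 missed cases lowers the score much more than adding 20 false alarms, which is the trade-off one would expect a clinical alerting metric to encode.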

Key Aspects

Real-time streaming: ingested HL7 lab messages via a message broker, ran ML inference with sub-second latency, and dispatched pager alerts.
Fault tolerance: validated via chaos-engineering-style failure injection, covering network partitions, pod evictions, node failures, database restarts, and message backlogs.
Observability: Prometheus metrics, Grafana dashboards, and SLO-based alerting with on-call runbooks for operational reliability.
Autoscaling & rollouts: horizontal pod autoscaling with graceful rolling deployments and zero-downtime upgrades.
Clinical performance: exceeded the NHS baseline for AKI detection, achieving an F3-score of 99.9% over a sustained two-week evaluation with 100% uptime.
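
The ingest, infer, and alert flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation. It assumes an HL7 v2 message whose PID-3 field holds a plain patient identifier and whose OBX segment carries a creatinine result labelled `CREATININE`. The names `parse_creatinine`, `make_handler`, `placeholder_rule`, and the sample message are all hypothetical, and the rule stands in for the real ML model. Broker consumption, retries, and persistence are deliberately left out.

```python
from collections import defaultdict
from typing import Callable, Optional


def parse_creatinine(message: str) -> Optional[tuple[str, float]]:
    """Return (patient_id, creatinine) from an HL7 v2 message, or None."""
    patient_id = None
    value = None
    # HL7 v2 separates segments with carriage returns and fields with "|".
    for segment in message.strip().split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID" and len(fields) > 3:
            patient_id = fields[3]  # PID-3 (assumed to be a plain identifier)
        elif fields[0] == "OBX" and len(fields) > 5 and fields[3] == "CREATININE":
            value = float(fields[5])  # OBX-5: observation value
    if patient_id is None or value is None:
        return None
    return patient_id, value


def make_handler(
    predict: Callable[[list[float]], bool],
    alert: Callable[[str], None],
) -> Callable[[str], bool]:
    """Build a per-message handler that keeps in-memory history per patient."""
    history: dict[str, list[float]] = defaultdict(list)

    def handle(message: str) -> bool:
        parsed = parse_creatinine(message)
        if parsed is None:
            return False  # not a creatinine result, nothing to score
        patient_id, value = parsed
        history[patient_id].append(value)
        if predict(history[patient_id]):
            alert(patient_id)
            return True
        return False

    return handle


def placeholder_rule(results: list[float]) -> bool:
    """Hypothetical stand-in for the model: latest >= 1.5x lowest prior result."""
    return len(results) >= 2 and results[-1] >= 1.5 * min(results[:-1])


if __name__ == "__main__":
    paged: list[str] = []
    handle = make_handler(placeholder_rule, paged.append)
    template = "MSH|^~\\&|LAB\rPID|1||12345\rOBX|1|SN|CREATININE||{}"
    for result in (100.0, 160.0):
        handle(template.format(result))
    print(paged)
```

Injecting `predict` and `alert` as callables keeps the handler testable without a broker, a model, or a pager endpoint. In a deployed system the in-memory history would need to be backed by durable storage so that a pod restart does not lose patient state.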