Scalable and Secure AI Inference in Healthcare: A Comparative Benchmarking of FastAPI and Triton Inference Server on Kubernetes
By: Ratul Ali
Published: 2026-02-03
Category: cs.AI
Abstract
This paper benchmarks FastAPI and Triton Inference Server, both deployed on Kubernetes, for scalable and secure AI inference in healthcare. It focuses on the practical challenges of deploying AI models in sensitive environments and on solutions to those challenges, with an emphasis on real-world use in a critical sector.