AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents
By: Zeke Woo, Maria Persz Orortiz
Published: 2026-03-30
arXiv category: cs.AI
Abstract
This paper identifies a previously hidden safety vulnerability in tool-augmented LLM agents: when the tools an agent relies on are corrupted, its recommendations can drift toward unsafe ones, and standard ranking metrics can obscure that drift. The finding bears directly on the deployment of reliable AI systems.