AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

By: Shanghai AI Lab

Published: 2026-01-26

View on arXiv →
#cs.AI

Abstract

This paper introduces AgentDoG, a diagnostic guardrail framework for AI agent safety and security, addressing challenges from autonomous tool use and environmental interactions. It provides fine-grained risk diagnosis and hierarchical attribution across agent trajectories, offering transparency beyond binary labels to facilitate effective agent alignment.

FEEDBACK

Projects

No projects yet

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security | ArXiv Intelligence