CovAgent: Overcoming the 30% Curse of Mobile Application Coverage with Agentic AI and Dynamic Instrumentation

By: Wei Minn, Biniam Fisseha Demissie

Published: 2026-01-29

Abstract

This paper proposes CovAgent, an agentic, AI-powered approach that enhances Android app UI testing by inspecting decompiled Smali code and component transition graphs. The agent reasons about unsatisfied activation conditions, generates dynamic instrumentation scripts to satisfy them, and significantly improves test coverage over state-of-the-art fuzzers.
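
For a rough sense of what a generated dynamic-instrumentation script might look like (a sketch, not taken from the paper), the example below uses Frida's Python bindings to hook a hypothetical activation check so that a guarded screen becomes reachable during testing; the package, class, and method names are placeholders.

```python
# Hypothetical sketch (not from the paper): use Frida's Python bindings to
# force-satisfy an activation condition so a guarded Activity becomes reachable.
# "com.example.app" and LicenseChecker.isActivated() are placeholder names.
import frida

HOOK_JS = """
Java.perform(function () {
    var Checker = Java.use("com.example.app.LicenseChecker");
    // Override the activation check so the gated UI path can be exercised.
    Checker.isActivated.implementation = function () {
        return true;
    };
});
"""

device = frida.get_usb_device()
pid = device.spawn(["com.example.app"])   # start the app in a suspended state
session = device.attach(pid)
script = session.create_script(HOOK_JS)
script.load()                             # install the hook before the app runs
device.resume(pid)                        # continue execution with the hook active
```

In CovAgent's setting, a hook like this would be derived by the agent from its Smali and transition-graph analysis rather than written by hand.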

Impact

practical

💡 Simple Explanation

Imagine a robot tester for mobile apps that doesn't just randomly click buttons (like old tools) but actually reads the screen and understands the app like a human. CovAgent uses AI to plan how to test an app thoroughly, aiming to reach parts of the app that usually get missed (like obscure settings menus or complex checkout flows), fixing the problem where automated tests usually only cover about 30% of the app.

🎯 Problem Statement

Automated mobile app testing tools typically fail to achieve high code coverage (stalling at ~30%) because they cannot understand semantic context, handle complex navigation (like logins), or plan long-term sequences of actions required to reach deep application states.

🔬 Methodology

The authors developed a system that uses an LLM as a 'brain' to control an Android device. Dynamic instrumentation supplies real-time data about which lines of code have been executed. The LLM receives the current UI tree and the coverage status, then plans a sequence of actions (tap, text entry, swipe) to maximize new code coverage, with a feedback loop that corrects mistakes when an action fails.
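
A minimal sketch of that observe-plan-act loop is given below, assuming hypothetical helpers dump_ui_tree, read_coverage, propose_action, and execute that stand in for a UI-hierarchy dump, instrumentation-based coverage counters, the LLM call, and device input; none of these names come from the paper.

```python
# Illustrative sketch of an LLM-driven, coverage-guided exploration loop.
# dump_ui_tree, read_coverage, propose_action, and execute are hypothetical
# helpers (UI-hierarchy dump, instrumentation counters, LLM call, device input).

def explore(app, llm, max_steps=200):
    covered = set()    # methods observed so far via dynamic instrumentation
    history = []       # (action, succeeded) pairs fed back into each prompt
    for _ in range(max_steps):
        ui_tree = dump_ui_tree(app)        # current screen hierarchy
        covered |= read_coverage(app)      # methods executed since launch

        # The LLM plans the next action from the UI state, what is already
        # covered, and the outcomes of earlier actions.
        action = propose_action(llm, ui_tree, covered, history)

        succeeded = execute(app, action)   # tap / enter text / swipe on device
        history.append((action, succeeded))  # failed actions let the LLM replan
    return covered
```

The key design point is that coverage and past action outcomes are fed back into every planning step, which is what lets the agent replan around failed actions instead of repeating them.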

📊 Results

CovAgent demonstrated a significant improvement in activity coverage and method coverage compared to standard tools (Monkey, DroidBot) and visual-only agents. In experiments on the standard benchmark apps, it consistently broke the 30% barrier, often reaching 60-70% coverage. It successfully navigated authentication screens and complex forms that blocked other tools.

✨ Key Takeaways

Agentic AI with access to internal app state (via instrumentation) outperforms visual-only agents for testing. The planning capability of LLMs overcomes semantic barriers that stall random or heuristic tools. This approach points to the future of automated QA: moving from scripted checks to autonomous exploration.

🔍 Critical Analysis

The paper presents a compelling solution to a real bottleneck. However, the reliance on dynamic instrumentation adds complexity that may alienate developers who want plug-and-play black-box testing. The token cost of running LLMs across large-scale regression suites is also often underplayed in such research. CovAgent is a significant step forward for 'Deep QA', but widespread adoption depends on reducing inference cost and latency.

💰 Practical Applications

  • Pay-per-run cloud testing service.
  • Enterprise licensing for on-premise installation.
  • Premium 'Security Audit' addon reports.

🏷️ Tags

#LLM, #Mobile Testing, #Android, #Agentic AI, #Dynamic Instrumentation, #Software Engineering, #Test Automation

🏢 Relevant Industries

Software Development, Quality Assurance, Mobile Applications, Cybersecurity