PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agents

By: Hongyi Nie

Published: 2026-03-31

#cs.AI

Abstract

Real-world smartphone use is highly personalized, which makes it hard for agents to deliver customized assistance. PSPA-Bench is a benchmark for evaluating personalization in smartphone GUI agents. It comprises over 12,855 personalized instructions across 10 daily scenarios and 22 mobile apps, along with a structure-aware process evaluation. Results on the benchmark show that current methods perform poorly under personalized settings and point to directions for improvement: reasoning-oriented models, basic perception, and reflection and long-term memory.
