RelayS2S: A Dual-Path Speculative Generation for Real-Time Dialogue
By: Long Mai
Published: 2026-03-25
arXiv category: cs.AI
Abstract
This paper introduces RelayS2S, a dual-path speculative generation framework for real-time dialogue systems. It reduces response latency through speculation: a draft response is generated quickly and then refined. The goal is conversational AI agents that respond faster, making real-time human-AI interaction smoother across applications.
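
The draft-then-refine idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `fast_draft`, `slow_refine`, and `respond` are hypothetical, the sleeps stand in for model inference, and the choice to start both paths concurrently from the user turn is an assumption, since the abstract does not say whether the refinement step consumes the draft.

```python
import asyncio


async def fast_draft(user_turn: str) -> str:
    # Hypothetical fast path: a short sleep stands in for a low-latency model.
    await asyncio.sleep(0.01)
    return f"Quick take on '{user_turn}'."


async def slow_refine(user_turn: str) -> str:
    # Hypothetical slow path: a longer sleep stands in for a stronger, slower model.
    await asyncio.sleep(0.05)
    return f"Fuller answer on '{user_turn}'."


async def respond(user_turn: str) -> list[tuple[str, str]]:
    """Start both paths at once and collect outputs in the order they are emitted."""
    events: list[tuple[str, str]] = []
    # Assumption: both paths begin immediately, so the slow path's latency
    # overlaps with the time the user spends hearing or reading the draft.
    draft_task = asyncio.create_task(fast_draft(user_turn))
    refine_task = asyncio.create_task(slow_refine(user_turn))
    events.append(("draft", await draft_task))
    events.append(("refined", await refine_task))
    return events


if __name__ == "__main__":
    for kind, text in asyncio.run(respond("the weather")):
        print(f"{kind}: {text}")
```

In this sketch the user-facing latency is set by the fast path alone, while the total time is bounded by the slow path rather than the sum of the two.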