All Research Papers

Browse through 368 research papers

The Patient is not a Moving Document: A World Model Training Paradigm for Longitudinal EHR

This paper introduces a novel world model training paradigm specifically designed for longitudinal Electronic Health Records (EHR). It addresses the challenges of integrating and interpreting continuous patient data over time, aiming to improve AI's ability to provide more accurate and context-aware insights for healthcare applications.

cs.AI

World of Workflows: a Benchmark for Bringing World Models to Enterprise Systems

This research proposes "World of Workflows," a benchmark designed to facilitate the integration of advanced AI world models into enterprise systems. It aims to evaluate and accelerate the application of AI in complex business processes by providing a standardized framework for testing and developing AI solutions tailored for real-world corporate environments.

cs.AI

Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data

This paper introduces Routing the Lottery (RTL), an adaptive AI method that enables models to dynamically adjust their architectures based on incoming heterogeneous data. It reports advances in image classification and speech enhancement, moving away from a 'one model fits all' approach toward tailored solutions and efficiency gains, with accuracy improvements of up to 5% on CIFAR-100 benchmarks.

cs.AI

PhaseCoder: Microphone Geometry-Agnostic Spatial Audio Understanding for Multimodal LLMs

This paper presents PhaseCoder, a transformer-only spatial audio encoder that operates independently of microphone geometry. It processes raw multichannel audio and microphone coordinates to perform localization and generate robust spatial embeddings. This enables multimodal Large Language Models (LLMs) to perform complex spatial reasoning and targeted transcription from various microphone arrays.

cs.AI

Solver-in-the-Loop: MDP-Based Benchmarks for Self-Correction and Behavioral Rationality in Operations Research

This work introduces two new benchmarks, ORDebug and ORBias, that integrate a solver into the evaluation loop for AI models. ORDebug assesses iterative self-correction in solving infeasible operations research models, while ORBias evaluates behavioral rationality in newsvendor instances. This approach aims to improve the diagnostic and self-repair capabilities of large language models in practical optimization settings.

cs.AI

Exploring Reasoning Reward Model for Agents

This paper focuses on developing and exploring a reasoning reward model designed to improve the capabilities of AI agents. It likely investigates how to effectively train agents by providing rewards that are aligned with complex reasoning processes, leading to more intelligent and robust agent behaviors in various applications.

cs.AI

Conditional Denoising Model as a Physical Surrogate Model

This paper explores the use of conditional denoising models as physical surrogate models for complex physical systems. It addresses the common trade-off between data-fitting accuracy and physical consistency in surrogate modeling. This approach has potential for accurately simulating physical phenomena, particularly in fields like plasma physics.

cs.AI

Self-Improving Pretraining: using post-trained models to pretrain better models

The "Self-Improving Pretraining" framework integrates alignment objectives (safety, factuality, quality) directly into LLM pretraining using a powerful post-trained model as a dynamic rewriter and judge. This method leads to significant gains in generation coherence and factuality, improving the reliability and trustworthiness of large language models for real-world use.

cs.AI

LLM-Assisted Logic Rule Learning: Scaling Human Expertise for Time Series Anomaly Detection

This framework leverages LLMs to encode human expertise into interpretable logic rules for time series anomaly detection in supply chains. It outperforms unsupervised methods in accuracy and interpretability and offers consistent, low-cost results suitable for production deployment, bridging the gap between automation and expert decision-making.

cs.AI

How AI Impacts Skill Formation

This study experimentally investigates how AI assistance influences human skill acquisition, revealing that while it does not consistently improve immediate productivity for new learning tasks, it significantly hinders the formation of essential skills such as debugging and conceptual understanding. The research shows that the style of AI interaction dictates learning outcomes, with passive delegation leading to poorer skill development. This has critical implications for education and workplace training in the AI era.

cs.AI

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

DynamicVLA introduces a compact 0.4B parameter vision-language-action model and the Dynamic Object Manipulation (DOM) benchmark, enabling robots to robustly manipulate moving objects in real-world scenarios. The model achieves superior success rates on DOM simulation and consistent performance on physical robots, signifying a leap in robotic manipulation capabilities.

cs.AI

A Pragmatic VLA Foundation Model

LingBot-VLA is a Vision-Language-Action foundation model pre-trained on 20,000 hours of real-world multi-embodiment robot data. It demonstrates that VLA model performance scales with increasing data volume without saturation, achieves superior success rates on a 100-task real-world benchmark across three robot platforms, and improves training efficiency. This directly advances practical robotics.

cs.AI

The Illusion of Insight in Reasoning Models

This paper investigates the phenomenon of "illusion of insight" in AI reasoning models, where models might appear to have genuine understanding without truly possessing it. The research critically examines the mechanisms behind such illusions and their implications for the trustworthiness and explainability of artificial intelligence systems.

cs.AI

Progressive Ideation using an Agentic AI Framework for Human-AI Co-Creation

The paper introduces an agentic AI framework designed to facilitate human-AI co-creation through progressive ideation. This framework allows for iterative development of ideas, combining human creativity with AI's generative capabilities to explore novel solutions across various creative domains.

cs.AI

Mortar: Evolving Mechanics for Automatic Game Design

The paper introduces Mortar, a system that evolves game mechanics for automatic game design. This AI-driven approach can generate novel game rules and interactions, aiming to accelerate the game development process and foster innovative gameplay experiences without manual intervention.

cs.AI

From Clay to Code: Typological and Material Reasoning in AI Interpretations of Iranian Pigeon Towers

This research explores AI's ability to interpret and reason about architectural heritage, specifically Iranian pigeon towers, using typological and material reasoning. It demonstrates how AI can contribute to understanding and preserving cultural artifacts by transforming complex architectural data into computable forms.

cs.AI

DA-DPO: Cost-efficient Difficulty-aware Preference Optimization for Reducing MLLM Hallucinations

This work presents DA-DPO, a cost-efficient and difficulty-aware preference optimization method aimed at significantly reducing hallucinations in Multimodal Large Language Models (MLLMs). By optimizing based on content difficulty, the approach improves the factual consistency and reliability of MLLM outputs.

cs.AI

Adaptive Causal Coordination Detection for Social Media: A Memory-Guided Framework with Semi-Supervised Learning

This paper proposes a memory-guided framework with semi-supervised learning for adaptive causal detection of coordination on social media. The approach aims to identify complex, evolving coordination patterns, which is critical for understanding and mitigating the spread of misinformation and coordinated malicious activities online.

cs.AI

A multi-algorithm approach for operational human resources workload balancing in a last mile urban delivery system

This paper proposes a multi-algorithm approach to optimize human resources workload balancing in last-mile urban delivery systems. The methodology aims to improve operational efficiency and resource allocation by intelligently distributing tasks, leading to better delivery times and reduced costs.

cs.AI

Can Semantic Methods Enhance Team Sports Tactics? A Methodology for Football with Broader Applications

This research explores how semantic methods can improve tactical analysis in team sports, specifically football. It presents a methodology that uses AI to derive deeper insights into game strategies, offering potential for enhanced coaching, player development, and real-time decision support in sports.

cs.AI

Self-Distillation Enables Continual Learning

This paper introduces Self-Distillation Fine-Tuning (SDFT), a method enabling large language models to continually acquire new skills and knowledge from demonstrations without catastrophic forgetting. SDFT leverages in-context learning by using the model itself as a teacher, outperforming traditional fine-tuning and allowing models to accumulate multiple skills over time.

cs.AI

One-step Latent-free Image Generation with Pixel Mean Flows

Researchers introduce Pixel MeanFlow (pMF), a generative model that produces high-fidelity images in a single network evaluation directly from noise in pixel space, without requiring a latent encoder or decoder. This method achieves competitive FID scores on ImageNet with lower computational cost, pushing the boundaries of diffusion- and flow-based generative models.

cs.AI

DeepSeek-OCR 2: Visual Causal Flow

This work presents DeepSeek-OCR 2, investigating a novel encoder, DeepEncoder V2, capable of dynamically reordering visual tokens based on image semantics. Inspired by human visual perception, this approach aims to achieve effective 2D image understanding through cascaded 1D causal reasoning structures, offering a new architectural paradigm for vision-language models.

cs.AI

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

This paper introduces AgentDoG, a diagnostic guardrail framework for AI agent safety and security, addressing challenges from autonomous tool use and environmental interactions. It provides fine-grained risk diagnosis and hierarchical attribution across agent trajectories, offering transparency beyond binary labels to facilitate effective agent alignment.

cs.AI

CovAgent: Overcoming the 30% Curse of Mobile Application Coverage with Agentic AI and Dynamic Instrumentation

This paper proposes CovAgent, an agentic AI-powered approach to enhance Android app UI testing by inspecting decompiled Smali code and component transition graphs. It reasons about unsatisfied activation conditions, generates dynamic instrumentation scripts, and significantly improves test coverage over state-of-the-art fuzzers.

cs.AI

Ultra-Low Latency Object Detection on Edge Devices for Autonomous Drone Navigation

We present a highly optimized neural network architecture and deployment framework enabling real-time, ultra-low latency object detection on resource-constrained edge devices for autonomous drone navigation. This work significantly enhances safety and efficiency in delivery and surveillance applications.

cs.AI

Transparent and Trustworthy AI for Real-time Financial Fraud Detection

We propose a novel explainable AI framework designed for real-time financial fraud detection, offering both high accuracy and clear, human-understandable explanations for its predictions. This system enhances trust and regulatory compliance in critical financial applications.

cs.AI

Multi-Agent Reinforcement Learning for Dynamic Urban Traffic Signal Control

This paper presents a multi-agent reinforcement learning system that dynamically optimizes urban traffic signal control in real time. Experimental results demonstrate significant reductions in traffic congestion and travel times, paving the way for smarter city infrastructure.

cs.AI

Adaptive Learning Content Generation with Large Language Models for K-12 Education

We explore the use of large language models to adaptively generate personalized educational content for K-12 students, catering to individual learning styles and paces. This approach promises to revolutionize personalized learning experiences and improve educational outcomes.

cs.AI

Accelerating Novel Material Discovery for Solid-State Batteries via Active Learning and Generative Models

This research introduces an AI-driven platform that combines active learning with generative models to drastically accelerate the discovery and optimization of novel materials for high-performance, solid-state batteries. The approach holds immense potential for sustainable energy storage solutions.

cs.AI