The Research Engine That Thinks With You

Scientific discovery moves faster than ever. ArXiv Intelligence gives you a brain-upgrade for navigating it.

Recent Research Papers

The Patient is not a Moving Document: A World Model Training Paradigm for Longitudinal EHR

This paper introduces a novel world model training paradigm specifically designed for longitudinal Electronic Health Records (EHR). It addresses the challenges of integrating and interpreting continuous patient data over time, aiming to improve AI's ability to provide more accurate and context-aware insights for healthcare applications.

cs.AI

World of Workflows: a Benchmark for Bringing World Models to Enterprise Systems

This research proposes "World of Workflows," a benchmark designed to facilitate the integration of advanced AI world models into enterprise systems. It aims to evaluate and accelerate the application of AI in complex business processes by providing a standardized framework for testing and developing AI solutions tailored for real-world corporate environments.

cs.AI

Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data

This paper introduces Routing the Lottery (RTL), an adaptive method that lets models select subnetworks tailored to incoming heterogeneous data rather than relying on a 'one model fits all' approach. It reports gains in image classification and speech enhancement, including accuracy improvements of up to 5% on CIFAR-100 benchmarks, along with efficiency gains.

cs.AI

PhaseCoder: Microphone Geometry-Agnostic Spatial Audio Understanding for Multimodal LLMs

This paper presents PhaseCoder, a transformer-only spatial audio encoder that operates independently of microphone geometry. It processes raw multichannel audio and microphone coordinates to perform localization and generate robust spatial embeddings. This enables multimodal Large Language Models (LLMs) to perform complex spatial reasoning and targeted transcription from various microphone arrays.

cs.AI

Solver-in-the-Loop: MDP-Based Benchmarks for Self-Correction and Behavioral Rationality in Operations Research

This work introduces two new benchmarks, ORDebug and ORBias, that integrate a solver into the evaluation loop for AI models. ORDebug assesses iterative self-correction in solving infeasible operations research models, while ORBias evaluates behavioral rationality in newsvendor instances. This approach aims to improve the diagnostic and self-repair capabilities of large language models in practical optimization settings.

cs.AI

Exploring Reasoning Reward Model for Agents

This paper explores a reasoning reward model designed to improve the capabilities of AI agents. It likely investigates how to train agents with rewards aligned to complex reasoning processes, with the aim of more robust agent behavior across applications.

cs.AI

Conditional Denoising Model as a Physical Surrogate Model

This paper explores the use of conditional denoising models as physical surrogate models for complex physical systems. It addresses the common trade-off between data-fitting accuracy and physical consistency in surrogate modeling. This approach has potential for accurately simulating physical phenomena, particularly in fields like plasma physics.

cs.AI

Self-Improving Pretraining: using post-trained models to pretrain better models

The "Self-Improving Pretraining" framework integrates alignment objectives (safety, factuality, quality) directly into LLM pretraining using a powerful post-trained model as a dynamic rewriter and judge. This method leads to significant gains in generation coherence and factuality, improving the reliability and trustworthiness of large language models for real-world use.

cs.AI

LLM-Assisted Logic Rule Learning: Scaling Human Expertise for Time Series Anomaly Detection

This framework leverages LLMs to encode human expertise into interpretable logic rules for time series anomaly detection in supply chains. It outperforms unsupervised methods in accuracy and interpretability and offers consistent, low-cost results suitable for production deployment, bridging the gap between automation and expert decision-making.

cs.AI