This paper introduces a novel world model training paradigm specifically designed for longitudinal Electronic Health Records (EHR). It addresses the challenges of integrating and interpreting continuo...
By: Irsyad Adam, Zekai Chen, David Laprade, Shaun Porwal, David Laub, Erik Reinertsen, Arda Pekis, Kevin Brown
This research proposes "World of Workflows," a benchmark designed to facilitate the integration of advanced AI world models into enterprise systems. It aims to evaluate and accelerate the application ...
This paper introduces Runtime Task Learning (RTL), an adaptive AI method that enables models to dynamically adjust their architectures based on incoming heterogeneous data. It demonstrates significant...
By: Grzegorz Stefanski, Alberto Presta, Michal Byra
This paper presents PhaseCoder, a transformer-only spatial audio encoder that operates independently of microphone geometry. It processes raw multichannel audio and microphone coordinates to perform l...
This work introduces two new benchmarks, ORDebug and ORBias, that integrate a solver into the evaluation loop for AI models. ORDebug assesses iterative self-correction in solving infeasible operations...
This paper focuses on developing and exploring a reasoning reward model designed to improve the capabilities of AI agents. It likely investigates how to effectively train agents by providing rewards t...
This paper explores the use of conditional denoising models as physical surrogate models for complex physical systems. It addresses the common trade-off between data-fitting accuracy and physical cons...
By: José Afonso, Pedro Viegas, Rodrigo Ventura, Vasco Guerra
The "Self-Improving Pretraining" framework integrates alignment objectives (safety, factuality, quality) directly into LLM pretraining using a powerful post-trained model as a dynamic rewriter and jud...
By: Ellen Xiaoqing Tan, Shehzaad Dhuliawala, Jing Xu
This framework leverages LLMs to encode human expertise into interpretable logic rules for time series anomaly detection in supply chains. It outperforms unsupervised methods in accuracy and interpret...
By: Jianing Fang, Yuxuan Chen, Yanchao Tan, Guangtao Huang, Hongxing Li, Xiang Li, Fei Wang, Yiheng Fan, Ziyue Li, Kai Shu, Jun Wang, Zihui Xue, Jie Xu
This study experimentally investigates how AI assistance influences human skill acquisition, revealing that while it does not consistently improve immediate productivity for new learning tasks, it sig...
DynamicVLA introduces a compact 0.4B parameter vision-language-action model and the Dynamic Object Manipulation (DOM) benchmark, enabling robots to robustly manipulate moving objects in real-world sce...
LingBot-VLA is a Vision-Language-Action foundation model pre-trained on 20,000 hours of real-world multi-embodiment robot data. It demonstrates that VLA model performance scales with increasing data v...
This paper investigates the phenomenon of "illusion of insight" in AI reasoning models, where models might appear to have genuine understanding without truly possessing it. The research critically exa...
The paper introduces an agentic AI framework designed to facilitate human-AI co-creation through progressive ideation. This framework allows for iterative development of ideas, combining human creativ...
By: Sankar B, Srinidhi Ranjini Girish, Aadya Bharti, Dibakar Sen
The paper introduces Mortar, a system that uses evolving mechanics for automatic game design. This AI-driven approach can generate novel game rules and interactions, aiming to accelerate the game deve...
By: Muhammad U. Nasir, Yuchen Li, Steven James, Julian Togelius
This research explores AI's ability to interpret and reason about architectural heritage, specifically Iranian Pigeon Towers, using typological and material reasoning. It demonstrates how AI can contr...
This work presents DA-DPO, a cost-efficient and difficulty-aware preference optimization method aimed at significantly reducing hallucinations in Multimodal Large Language Models (MLLMs). By optimizin...
This paper proposes a memory-guided framework with semi-supervised learning for detecting adaptive causal coordination on social media. The approach aims to identify complex, evolving coordination pat...
This paper proposes a multi-algorithm approach to optimize human resources workload balancing in last-mile urban delivery systems. The methodology aims to improve operational efficiency and resource a...
By: Luis M. Moreno-Saavedra, Silvia Jimenez-Fernandez, Antonio Portilla-Figueras, David Casillas-Perez, Sancho Salcedo-Sanz
This research explores how semantic methods can improve tactical analysis in team sports, specifically football. It presents a methodology that uses AI to derive deeper insights into game strategies, ...
By: Alessio Di Rubbo, Mattia Neri, Remo Pareschi, Marco Pedroni, Roberto Valtancoli, Paolino Zica
This paper introduces Self-Distillation Fine-Tuning (SDFT), a method enabling large language models to continually acquire new skills and knowledge from demonstrations without catastrophic forgetting....
Researchers introduce Pixel MeanFlow (pMF), a generative model that produces high-fidelity images in a single network evaluation directly from noise in pixel space, without requiring a latent encoder ...
This work presents DeepSeek-OCR 2, investigating a novel encoder, DeepEncoder V2, capable of dynamically reordering visual tokens based on image semantics. Inspired by human visual perception, this ap...
This paper introduces AgentDoG, a diagnostic guardrail framework for AI agent safety and security, addressing challenges from autonomous tool use and environmental interactions. It provides fine-grain...
This paper proposes CovAgent, an agentic AI-powered approach to enhance Android app UI testing by inspecting decompiled Smali code and component transition graphs. It reasons about unsatisfied activat...
We present a highly optimized neural network architecture and deployment framework enabling real-time, ultra-low latency object detection on resource-constrained edge devices for autonomous drone navi...
By: Dr. Hiroshi Tanaka, Dr. Isabella Rossi, Dr. Jacob Smith, Dr. Katerina Novikova
We propose a novel explainable AI framework designed for real-time financial fraud detection, offering both high accuracy and clear, human-understandable explanations for its predictions. This system ...
By: Dr. Robert Johnson, Dr. Sarah Chen, Dr. Thomas Lee, Dr. Ursula Weber, Dr. Victor Morales
This paper presents a multi-agent reinforcement learning system that dynamically optimizes urban traffic signal control in real-time. Experimental results demonstrate significant reductions in traffic...
By: Dr. Wendy Davis, Dr. Xuan Zhou, Dr. Yuri Kim, Dr. Zoe Green
We explore the use of large language models to adaptively generate personalized educational content for K-12 students, catering to individual learning styles and paces. This approach promises to revol...
By: Dr. Alex Chang, Dr. Brenda Lee, Dr. Carlos Ruiz, Dr. Diana Popova, Dr. Ethan Brown
This research introduces an AI-driven platform that combines active learning with generative models to drastically accelerate the discovery and optimization of novel materials for high-performance, so...
By: Dr. Lucas Garcia, Dr. Maya Singh, Dr. Noah Brown, Dr. Olivia Davies, Dr. Paul White, Dr. Quinn Taylor
This research focuses on applying advanced AI techniques, including Bayesian optimization and deep learning, to optimize the design and operational parameters of direct air capture (DAC) technologies....
By: Dr. Fiona MacLeod, Dr. Gregory Parker, Dr. Hannah Zhao, Dr. Ivan Volkov, Dr. Jessica O'Connell
This paper explores the application of large-scale generative AI foundation models for accelerating personalized drug discovery. It details novel architectures capable of synthesizing drug candidates ...
By: Dr. Anya Petrova, Dr. Ben Carter, Dr. Chen Li, Dr. David Sharma, Dr. Emily Wong, Dr. Frank Miller, Dr. Grace Kim
Tencent researchers introduced Youtu-VL, a Vision-Language Model framework addressing fine-grained visual information loss with a "vision-as-target" optimization paradigm, achieving competitive perfor...
Researchers introduced a framework and benchmark to study visual world modeling in Unified Multimodal Models (UMMs), demonstrating that visual generation significantly improves reasoning on physical a...
This paper introduces On-Policy Self-Distillation (OPSD), a framework enabling a single Large Language Model to improve its mathematical reasoning by simultaneously acting as a teacher and student. OP...
This paper advocates for NeuroAI, a type of Neuroscience-informed Artificial Intelligence, by identifying current and future areas of synergism between neuroscience and AI. It focuses on embodiment, l...
Robbyant introduces Masked Depth Modeling (MDM), a framework that leverages natural sensor failures in RGB-D cameras as learning signals to generate dense, metric-scale, and pixel-aligned depth maps. ...
By: Bin Tan, Changjiang Sun, Xiage Qin, Hanat Adai, Zelin Fu, Tian Zhou, Han Zhang, Yinghao Xu, Xing Zhu, Yujun Shen, Nan Xue
Researchers introduce TTT-Discover, a test-time training framework that enables Large Language Models to learn and adapt during problem-solving, leading to new state-of-the-art solutions in diverse sc...
By: Mert Yuksekgonul, Daniel Koceja, Xinhao Li, Federico Bianchi, Jed McCaleb, Xiaolong Wang, Jan Kautz, Yejin Choi, James Zou, Carlos Guestrin, Yu Sun
This paper explores the application of vision-language pre-training techniques to improve the accuracy and interpretability of medical image analysis. By jointly learning from image and text data, the...
This paper develops novel interpretable AI models for transparent and reliable financial risk assessment. By providing clear explanations for their predictions, these models increase trust and facilit...
This paper explores the development of novel foundational models that enable robots to operate robustly and adaptively in complex and rapidly changing real-world environments. The models integrate adv...
The research presents a novel quantization technique that significantly reduces the computational and memory footprint of large language models, making them deployable on resource-constrained edge dev...
This paper proposes a generative AI framework that accelerates the discovery of novel drug candidates tailored to individual patient genetic profiles. By leveraging advanced deep learning architecture...
Addressing the challenge of catastrophic forgetting, this research introduces a continual learning paradigm for autonomous driving agents. The proposed methods allow vehicles to continuously learn fro...
By: Karen Park, Leo Rodriguez, Mia Taylor, Noah Davis, Olivia Hall
This paper introduces daVinci-Dev, a systematic agentic mid-training approach that equips large language models (LLMs) with foundational agentic behaviors for software engineering. It addresses the di...
By: Ji Zeng, Dayuan Fu, Tiantian Mi, Yumin Zhuang, Yaxing Huang, Xuefeng Li, Lyumanshan Ye, Muhang Xie, Qishuo Hua, Zhen Huang, Mohan Jiang, Hanning Wang, Jifan Lin, Yang Xiao, Jie Sun, Yunze Wu, Pengfei Liu
This paper introduces SOAR, a new self-improvement framework that enables large language models (LLMs) to generate their own curricula for mathematical reasoning problems they cannot initially solve. ...
By: Shobhita Sundaram, John Quan, Ariel Kwiatkowski, Kartik Ahuja, Yann Ollivier, Julia Kempe
This paper presents TelcoAI, an agentic, multi-modal Retrieval-Augmented Generation (RAG) system specifically designed for 3GPP documentation. It significantly improves recall, claim recall, and faith...
This paper introduces TSRBench, a comprehensive benchmark designed for multi-task and multi-modal time series reasoning. It aims to evaluate and advance generalist AI models in their ability to unders...
This research introduces a comprehensive diagnostic framework that utilizes big data analytics to evaluate the procedural reliability of intelligent agent systems. It addresses critical needs for depl...
This paper investigates how multi-agent bandit systems can effectively exchange and leverage visual uncertainties to improve decision-making. This is particularly relevant in dynamic environments wher...
This research focuses on developing scalable rubrics to enhance the quality and reliability of Large Language Models (LLMs) specifically tailored for healthcare applications. The goal is to improve th...
By: Zhichao Yang, Sepehr Janghorbani, Dongxu Zhang, Jun Han, Qian Qian, Andrew Ressler II, Gregory D. Lyng, Sanjit Singh Batra, Robert E. Tillman
The paper proposes a novel generative AI approach for creating synthesizable drug-like molecular glues. This realistic AI method offers a promising pathway for discovering new therapeutic compounds, a...
This paper explores the convergence of generative AI and Extended Reality (XR) to enable more scalable and natural human-computer interactions. It delves into how AI can enhance immersive experiences,...
Skywork UniPic 3.0 introduces a unified multi-image composition framework that leverages sequence modeling to generate complex and coherent images from multiple input components. This advancement in g...
This research proposes a novel approach to detect climate change disinformation by integrating vision-language models with external knowledge sources. The multimodal system analyzes both textual and v...
This paper proposes "Multi-Persona Thinking" as a novel approach to mitigate social biases in Large Language Models (LLMs). By enabling LLMs to consider multiple perspectives, the research aims to red...
This paper focuses on developing methods for evaluating prompts for Large Language Models (LLMs) specifically in educational contexts. It addresses the challenges of assessing prompt effectiveness and...
By: Langdon Holmes, Adam Coscia, Scott Crossley, Joon Suh Choi, Wesley Morris
This paper introduces FlexLLM, a composable High-Level Synthesis (HLS) library designed for flexible hybrid Large Language Model (LLM) accelerator design. It aims to streamline the development of effi...
By: Jiahao Zhang, Zifan He, Nicholas Fraser, Michaela Blott, Yizhou Sun, Jason Cong
This paper introduces Cosmos Policy, a method for fine-tuning large, pretrained latent video diffusion models into unified robot policies for visuomotor control and planning. It achieves state-of-the-...
By: Moo Jin Kim, Yihuai Gao, Tsung-Yi Lin, Yen-Chen Lin, Yunhao Ge, Grace Lam, Percy Liang, Shuran Song, Ming-Yu Liu, Chelsea Finn, Jinwei Gu
This paper explores methods for controlling the long-term behavior of language model agents by incorporating explicit state dynamics. It aims to improve the predictability and reliability of AI agents...
Generalizing video matting models to real-world videos remains a significant challenge due to the scarcity of labeled data. We present VideoMaMa, a novel mask-guided video matting framework that conve...
By: Sangbeom Lim, Seoung Wug Oh, Jiahui Huang, Heeji Yoon, Seungryong Kim, Joon-Young Lee
Large language models (LLMs) often struggle with complex reasoning tasks that require accurate and up-to-date factual knowledge. This paper proposes a novel framework that integrates Monte Carlo Tree ...
Optimizing scientific computing algorithms for modern GPUs is a labor-intensive and iterative process involving repeated code modification, benchmarking, and tuning across complex hardware and softwar...
Foreign Information Manipulation and Interference (FIMI) on social media poses a significant threat to democratic processes. This paper proposes a framework-agnostic agent-based operationalization of ...
By: Kevin Tseng, Juan Carlos Toledano, Bart De Clerck, Yuliia Dukach, Phil Tinn
Specific domains depend on high-quality fine-tuning datasets, particularly in instructional format (e.g., question-answer, or Q&A). However, generating these datasets, particularly from unstructured sou...
By: Alex Echeverria, Sávio Salvarino Teles de Oliveira, Fernando Marques Federson
This paper introduces "The Agentic Leash," a method for extracting causal feedback fuzzy cognitive maps using Large Language Models (LLMs). This approach enables better interpretability and understand...
This paper presents a novel approach utilizing a vision-and-knowledge enhanced large language model to achieve generalizable inference of pedestrian crossing behavior. This development is crucial for ...
We present SciCoQA, a dataset for detecting discrepancies between scientific publications and their codebases to ensure faithful implementations. We construct SciCoQA from GitHub issues and reproducib...
This paper presents a comprehensive solution for detecting AI-generated videos, a critical need due to the increasing realism of synthetic media. The proposed system utilizes advanced computer vision ...
By: Long Ma, Zihao Xue, Yan Wang, Zhiyuan Yan, Jin Xu, Xiaorui Jiang, Haiyang Yu, Yong Liao, Zhen Bi
This paper introduces "The Great March 100" (GM-100), a benchmark of 100 detail-oriented tasks for evaluating embodied AI agents. It addresses limitations in existing datasets by providing a diverse a...
ShapeR introduces a novel approach for robust conditional 3D object shape generation from casually captured image sequences. It leverages multi-modal inputs like SLAM points, posed images, and VLM-gen...
By: Yawar Siddiqui, Duncan Frost, Samir Aroudj, Armen Avetisyan, Henry Howard-Jenkins, Daniel DeTone, Pierre Moulon, Qirui Wu, Zhengqin Li, Julian Straub, Richard Newcombe, Jakob Engel
This paper addresses the critical challenge of hyperparameter optimization for Constraint Programming (CP) solvers. It proposes advanced techniques to automatically tune these parameters, significantl...
By: Hedieh Haddad, Thibault Falque, Pierre Talbot, Pascal Bouvry
This research proposes the Large language model and Extended Greedy (LEG) framework to optimize health facility location in Ethiopia. It integrates expert knowledge, articulated in natural language, w...
This paper extends an LLM-based framework for Predictive Process Monitoring (PPM), evaluating its generality and reasoning mechanisms. It demonstrates that LLMs outperform benchmark methods in data-sc...
By: Alessandro Padella, Massimiliano de Leoni, Marlon Dumas
This paper introduces BoxMind, a closed-loop AI expert system for optimizing boxing strategies. It uses multi-modal data to define atomic punch events and proposes a graph-based predictive model to ca...
This research investigates the multifaceted impact of generative AI tools on the early stages of architectural design. It examines how these AI systems influence designers' performance, their creative...
By: Han Jiang, Yao Xiao, Rachel Hurley, Shichao Liu
This paper introduces a novel approach for constructing "context bubbles" in enterprise retrieval-augmented generation (RAG) systems, focusing on both the structural integrity and semantic diversity o...
Human papillomavirus (HPV) vaccine hesitancy poses significant public health challenges, particularly in Japan where proactive vaccination recommendations were suspended from 2013 to 2021. The resulti...
This paper proposes an agentic AI framework designed for autonomous, explainable, and real-time credit risk decision-making. The system leverages advanced AI agents to process financial data, assess r...
This research investigates the reliability of AI explanations, specifically focusing on chain-of-thought reasoning in large language models. The study provides evidence of systematic underreporting, w...
This paper introduces CogCanvas, a system designed for verbatim-grounded artifact extraction from extensive Large Language Model (LLM) conversations. It addresses the challenge of managing and leverag...
The research explores advanced reinforcement learning techniques to optimize smart grid operations in real-time, enhancing energy distribution efficiency, reducing peak loads, and improving resilience...
By: Michael Green, Sarah Brown, David Jones, Emily White, Frank Black, Grace Red, Henry Blue
This paper introduces a novel framework utilizing multimodal foundation models to create highly personalized healthcare solutions. It integrates patient data from various sources including genomics, e...
By: Anna Petrova, Dmytro Kovalenko, Olena Lysenko, Sergii Tkachenko, Victoria Bondar
This paper presents an innovative Explainable AI (XAI) model designed to improve transparency and trustworthiness in financial risk assessment. By providing clear justifications for its predictions, t...
By: Sophie Dubois, Pierre Dupont, Chloe Martin, Antoine Bernard
This paper introduces ML-Master 2.0, an autonomous agent tackling ultra-long-horizon machine learning engineering. It uses Hierarchical Cognitive Caching to manage context and sustain strategic cohere...
By: Xinyu Zhu, Yuzhu Cai, Zexi Liu, Bingyang Zheng, Cheng Wang, Rui Ye, Jiaao Chen, Hanrui Wang, Wei-Chen Wang, Yuzhi Zhang, Linfeng Zhang, Weinan E, Di Jin, Siheng Chen
LSRIF introduces a logic-structured training framework that explicitly models instruction logic for large language models to improve instruction-following. It addresses challenges with sequential depe...
By: Qingyu Ren, Qianyu He, Jingwen Chang, Jie Zeng, Jiaqing Liang, Yanghua Xiao, Han Xia, Zeye Sun, Fei Yu
This paper leverages epistemology to reframe human-AI complementarity, aiming to address theoretical challenges in understanding when human-AI teams outperform either alone. It seeks to provide a more...
By: Andrea Ferrario, Rasita Vinay, Matteo Casserini, Alessandro Facchini
DeepResearchEval is an automated framework for constructing deep research tasks and evaluating AI agents. It addresses challenges in assessing multi-step web research and cross-source information synt...
By: Yibo Wang, Lei Wang, Yue Deng, Keming Wu, Yao Xiao, Huanjin Yao, Liwei Kang, Hai Ye, Yongcheng Jing, Lidong Bing
This paper proposes Test-Time Tool Evolution (TTE), a new paradigm enabling LLM agents to synthesize, verify, and evolve executable tools during inference for scientific reasoning. It overcomes the li...
This scoping review maps ethically oriented work on anthropomorphising LLM-based conversational agents, discussing benefits like engagement and inclusion versus concerns such as deception and overreli...
By: Andrea Ferrario, Tetsuya Sakai, Matteo Casserini, Alessandro Facchini
This paper proposes Controlled Self-Evolution (CSE) to enhance code generation through iterative generate-verify-refine cycles. It addresses inefficiencies in existing self-evolution methods for algor...
By: Tu Hu, Ronghao Chen, Shuo Zhang, Jianghao Yin, Mou Xiao Feng, Jingping Liu, Shaolei Zhang, Wenqi Jiang, Yuqi Fang, Sen Hu, Yi Xu, Huacan Wang
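The generate-verify-refine cycle named in the CSE summary above can be illustrated with a generic loop. This is only a minimal sketch of the general pattern, not the paper's method: the `generate`, `verify`, and `refine` callables are hypothetical stand-ins (in practice they would wrap a model call and a test harness), and the toy task of finding an integer whose square is 144 is an assumption chosen so the example runs without any model.

```python
from typing import Callable, Optional, Tuple


def generate_verify_refine(
    generate: Callable[[], int],
    verify: Callable[[int], Tuple[bool, str]],
    refine: Callable[[int, str], int],
    max_rounds: int = 5,
) -> Tuple[Optional[int], int]:
    """Run a generic generate-verify-refine loop.

    Returns (candidate, refinements_used) on success, or
    (None, max_rounds) if no candidate passes verification.
    """
    candidate = generate()
    for round_index in range(max_rounds + 1):
        ok, feedback = verify(candidate)
        if ok:
            return candidate, round_index
        if round_index < max_rounds:
            # Use the verifier's feedback to produce the next candidate.
            candidate = refine(candidate, feedback)
    return None, max_rounds


# Hypothetical toy task: find x such that x * x == 144.
def toy_generate() -> int:
    return 10  # initial guess


def toy_verify(x: int) -> Tuple[bool, str]:
    if x * x == 144:
        return True, "ok"
    return False, "low" if x * x < 144 else "high"


def toy_refine(x: int, feedback: str) -> int:
    return x + 1 if feedback == "low" else x - 1


result = generate_verify_refine(toy_generate, toy_verify, toy_refine)
```

The verifier returns feedback rather than a bare pass/fail so that refinement can be directed instead of random, and the round budget keeps the loop bounded when no candidate ever passes.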
Real-world deployment of GUI agents requires aligning with users' complex implicit intents, beyond explicit instructions. This paper introduces "PersonalAlign," a new agent task where agents utilize l...
Centralized multi-agent systems based on LLMs often struggle with unstable long-horizon collaboration due to a lack of memory management, leading to context bloat, error accumulation, and poor cross-t...
While LLMs excel in text-based code automation, their potential in graph-oriented engineering workflows like Simulink remains underexplored. SimuAgent is an LLM-powered modeling and simulation agent f...
Large Language Model-based Multi-Agent Debate (MAD) frameworks enhance reasoning and collaboration, but existing approaches suffer from agents adopting identical reasoning paths, leading to errors and...
Modern supply chains are increasingly vulnerable to disruptions. This paper introduces a minimally supervised agentic AI framework that autonomously monitors, analyzes, and responds to disruptions acr...
Large Language Models (LLMs) in educational applications often reveal solutions rather than fostering dialogic learning. This paper introduces ConvoLearn, a dataset grounded in knowledge building theo...
Multi-agent systems powered by Large Language Models (LLMs) often struggle with resource-intensive and unstable training due to non-stationarity and sparse rewards in multi-agent reinforcement learnin...
By: Zhiyuan Hu, Yunhai Hu, Juncheng Liu, Shuyue Stella Li, Yucheng Wang, Zhen Xu, See-Kiong Ng, Anh Tuan Luu, Xinxing Xu, Bryan Hooi, Cynthia Breazeal, Hae Won Park
This study enhances dementia prediction using machine learning techniques on patient health data, with supervised learning algorithms like KNN, QDA, LDA, and Gaussian Process Classifiers. LDA achieved...
By: Shafiul Ajam Opee, Nafiz Fahad, Anik Sen, Rasel Ahmed, Fariha Jahan, Md. Kishor Morol, Md Rashedul Islam
Recent advancements in single-cell multi-omics provide profound insights into cellular heterogeneity. This paper proposes OKR-CELL, an Open-world Language Knowledge-Aided Robust Single-Cell Foundation...
This work introduces a generative co-memory regularization approach for Few-shot Class-Incremental Learning (FSCIL). The method leverages generative domain adaptation to fine-tune a pre-trained encode...
This paper introduces ECLIPSE, an Evolutionary Computation Library for Instrumentation Prototyping in Scientific Engineering. This library aims to accelerate the design and optimization of scientific ...
By: Max Foreback, Evan Imata, Vincent Ragusa, Jacob Weiler, Christina Shao, Joey Wagner, Katherine G. Skocelas, Jonathan Sy, Aman Hafez, Wolfgang Banzhaf, Amy Conolly, Kyle R. Helson, Rick Marcusen, Charles Ofria, Marcin Pilinski, Rajiv Ramnath, Bryan Reynolds, Anselmo C. Pontes, Emily Dolson, Julie Rolla
This paper benchmarks nine small language models (SLMs) and small reasoning language models (SRLMs) on system log severity classification using real-world `journalctl` data from Linux production serve...
By: Yahya Masri, Emily Ma, Zifu Wang, Joseph Rogers, Chaowei Yang
This paper proposes AdaFuse, an adaptive ensemble decoding method with test-time scaling for large language models (LLMs). This approach aims to enhance the performance of LLMs by combining outputs fr...
By: Chengming Cui, Tianxin Wei, Ziyi Chen, Ruizhong Qiu, Zhichen Zeng, Zhining Liu, Xuying Ning, Duo Zhou, Jingrui He
This paper introduces "transparent documents," interactive web-based scholarly articles that allow readers to explore the relationship to underlying data by hovering over text fragments. It also prese...
By: Alfonso Piscitelli, Cristina David, Mattia De Rosa, Ali Mohammed, Federico Nanni, Jacob Pake, Roly Perera, Jessy Sodimu, Chenyiqiu Zheng
This paper proposes PsychEval, a new multi-session and multi-therapy benchmark for evaluating AI psychological counselors. It aims to provide high-realism and comprehensive assessment of AI's capabili...
This paper introduces MineNPC-Task, a task suite designed to evaluate memory-aware Minecraft agents. It focuses on the development of AI agents that can effectively manage and utilize memory in comple...
By: Tamil Sudaravan Mohan Doss, Michael Xu, Sudha Rao, Andrew D. Wilson, Balasaravanan Thoravi Kumaravel
Depression is a major contributor to the mental-health burden in Nigeria, yet screening coverage remains limited due to low access to clinicians, stigma, and language barriers. This paper explores fin...
By: Isaac Iyinoluwa Olufadewa, Miracle Ayomikun Adesina, Ezekiel Ayodeji Oladejo, Uthman Babatunde Usman, Owen Kolade Adeniyi, Matthew Tolulope Olawoyin
Lumpy Skin Disease (LSD) is a contagious viral infection that significantly deteriorates livestock health. Early and precise identification is crucial. This paper proposes a hybrid deep learning-based...
By: Muhammad Tahir, Abdul Basit, Muhammad Awais, Muhammad Imran, Farman Ali, Muhammad Shoaib, Ali Raza
This paper proposes a novel approach for stock market price prediction leveraging a hybrid model that combines Neural Prophet with a Deep Neural Network (DNN). The integration aims to capture both tim...
Agents capable of reasoning and planning in the real world require the ability to predict the consequences of their actions. While world models possess this capability, they most often require acti...
By: Quentin Garrido, Tushar Nagarajan, Basile Terver, Nicolas Ballas, Yann LeCun, Michael Rabbat
The rapid advancement of large language models (LLMs) has led to growing interest in using synthetic data to train future models. However, this creates a self-consuming retraining loop, where models a...
By: Yaxuan Wang, Zhongteng Cai, Yujia Bao, Xueru Zhang, Yang Liu
RoboVIP introduces a multi-view video generation framework that enhances robotic manipulation datasets by creating diverse backgrounds and tabletop scenes using visual identity prompting. This method ...
This paper presents a criticality-aware robust reinforcement learning framework to enhance safety and robustness in autonomous driving systems. By focusing on sparse but critical threats, the method i...
This paper introduces MAGMA, a novel multi-graph based agentic memory architecture designed to enhance the capabilities of AI agents. It focuses on enabling agents to manage complex memories for impro...
By: Dongming Jiang, Yi Li, Guanpeng Li, Bingzhe Li
This paper examines the critical issue of legal alignment for safe and ethical artificial intelligence. It explores how AI development can be guided by legal and ethical frameworks to ensure responsib...
By: Noam Kolt, Nicholas Caputo, Jack Boeglin, Cullen O'Keefe, Rishi Bommasani, Stephen Casper, Mariano-Florentino Cuéllar, Noah Feldman, Iason Gabriel, Gillian K. Hadfield, Lewis Hammond, Peter Henderson, Atoosa Kasirzadeh, Seth Lazar, Anka Reuel, Kevin L. Wei, Jonathan Zittrain
This paper investigates the fine-tuning of small language models to act as efficient enterprise search relevance labelers. The approach demonstrates how smaller LLMs can be optimized for specific busi...
By: Yue Kang, Zhuoyi Huang, Benji Schussheim, Diana Licon, Dina Atia, Shixing Cao, Jacob Danovitch, Kunho Kim, Billy Norcilien, Jonah Karpman, Mahmound Sayed, Mike Taylor, Tao Sun, Pavel Metrikov, Vipul Agarwal, Chris Quirk, Ye-Yi Wang, Nick Craswell, Irene Shaffer, Tianwei Chen, Sulaiman Vesal, Soundar Srinivasan
This paper introduces MedPI, a high-dimensional benchmark for evaluating large language models (LLMs) in patient-clinician conversations. Unlike single-turn QA benchmarks, MedPI assesses medical dialo...
By: Diego Fajardo V., Oleksii Proniakin, Victoria-Elisabeth Gruber, Razvan Marinescu
This paper generalizes the Evidence Accumulation Model (EAM) to real-world contexts, investigating how active sensing through eye movements influences decision-making. It proposes a cognitive scheme t...
This research presents Project Ariadne, proposing a structural causal framework to audit the faithfulness of Large Language Model (LLM) agents. This is crucial for ensuring that LLM agents provide acc...
This work focuses on developing methods for detecting hallucinations in long chain-of-thought reasoning processes, especially in the context of large language models. Effective hallucination detection...
By: Haolang Lu, Minghui Pan, Ripeng Li, Guoshun Nan, Jialin Zhuang, Zijie Zhao, Zhongxiang Sun, Kun Wang, Yang Liu
The paper presents a cross-lingual ontology alignment system that uses embedding-based cosine similarity matching. Ontology entities are contextually enriched through novel techniques, employing a fin...
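As a rough illustration of the embedding-based cosine similarity matching mentioned in the summary above, the sketch below pairs each source entity with its most similar target entity. It is a minimal example under stated assumptions, not the paper's system: the entity names, the three-dimensional vectors, the `match_entities` helper, and the 0.8 threshold are all hypothetical, and real embeddings would come from a trained encoder.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_entities(
    source: Dict[str, Vector],
    target: Dict[str, Vector],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """Pair each source entity with its best-scoring target above the threshold."""
    matches = []
    for s_name, s_vec in source.items():
        best_name, best_score = None, -1.0
        for t_name, t_vec in target.items():
            score = cosine_similarity(s_vec, t_vec)
            if score > best_score:
                best_name, best_score = t_name, score
        if best_name is not None and best_score >= threshold:
            matches.append((s_name, best_name, best_score))
    return matches


# Toy, made-up embeddings for two small ontologies in different languages.
source_ontology = {"Person": [1.0, 0.1, 0.0], "City": [0.0, 1.0, 0.2]}
target_ontology = {
    "Persona": [0.9, 0.2, 0.0],
    "Ciudad": [0.1, 0.9, 0.3],
    "Fecha": [0.0, 0.0, 1.0],
}

alignment = match_entities(source_ontology, target_ontology)
for s_name, t_name, score in alignment:
    print(f"{s_name} -> {t_name} (cosine {score:.3f})")
```

Greedy best-match with a threshold is the simplest choice here; a one-to-one assignment step could be layered on top if two source entities compete for the same target.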
This paper introduces Falcon-H1R, a hybrid model designed to enhance AI reasoning capabilities. The focus is on efficient test-time scaling, allowing the system to maintain high performance in complex...
By: Falcon LLM Team, Iheb Chaabane, Puneesh Khanna, Suhail Mohmad, Slim Frikha, Shi Hu, Abdalgader Abubaker, Reda Alami, Mikhail Lubinets, Mohamed El Amine Seddik, Hakim Hacid
This paper introduces FormuLLA, an innovative approach that leverages Large Language Models (LLMs) to generate novel 3D printable formulations. This opens up new possibilities for rapid prototyping an...
By: Adeshola Okubena, Yusuf Ali Mohammed, Moe Elbadawi
EverMemOS is a self-organizing memory operating system designed to enhance structured long-horizon reasoning in AI systems. This enables systems to efficiently manage and utilize information...
This research delves into the geometry of reason, exploring spectral signatures that indicate valid mathematical reasoning. This study could contribute to building AI systems capable of more robust an...
Recursive Language Models (RLMs) introduce a general inference strategy that allows Large Language Models (LLMs) to process arbitrarily long prompts (exceeding 10 million tokens) by treating them as e...
This paper addresses the critical limitation of hallucination in Large Language Models (LLMs) by proposing a novel and robust uncertainty quantification method (RU) for factual generation. It construc...
This paper introduces RoboReward, a set of general-purpose vision-language reward models along with a new benchmark called RoboRewardBench, designed for robotics applications. The RoboReward 8B model ...
By: Tony Lee, Andrew Wagenmaker, Karl Pertsch, Kevin Black, Suraj Nair, Michael Ahn, Jian Lan, Sergey Levine, Chelsea Finn
This paper presents Avatar Forcing, a new diffusion-driven framework that enables real-time interactive head avatar generation for natural conversation. It addresses the challenges of real-time motion...
By: Ki Taekyung, Junho Kim, Hyeonsu Lee, Hyewon Son, Jonghyun Choi
This technical report introduces STAgent, an agentic large language model developed by Alibaba Amap, specifically engineered for real-world spatio-temporal reasoning and complex planning. It achieves ...
By: Yulan Hu, Xiangwen Zhang, Sheng Ouyang, Hao Yi, Lu Xu, Qinglin Lang, Lide Tan, Xiang Cheng, Tianchen Ye, Zhicong Li, Ge Chen, Wenjin Yang, Zheng Pan, Shaopan Xiong, Siran Yang, Ju Huang, Yan Zhang, Jiamang Wang, Yong Liu, Yinfeng Huang, Tucheng Lin, Xin Li, Ning Guo
This research introduces ClinicalReTrial, a self-evolving AI agent designed to optimize clinical trial protocols. This agent leverages AI and multiagent systems to enhance the efficiency and effective...
We propose a comprehensive framework for building trustworthy AI systems by integrating explainability techniques with adversarial robustness methods in deep learning. This work addresses critical con...
By: Professor Julian Vance, Dr. Lena Schmidt, Mr. Omar Hassan, Ms. Jessica Lee, Dr. Martin Müller
We propose a new architectural design that significantly reduces the computational and energy footprint of large language models (LLMs), enabling their efficient deployment on edge devices. This break...
By: Professor Kai Hansen, Dr. Lena Popova, Mr. John M. Smith, Dr. Isabella Garcia, Dr. Wei Wang
This paper presents a multimodal conversational AI system that seamlessly integrates natural language understanding, speech recognition, and visual context to provide highly personalized and effective...
By: Dr. Sophia G. Miller, Professor Alexandre Dubois, Ms. Emily R. Chen, Mr. Robert Johnson, Dr. Priya Reddy, Mr. Carlos Mendoza
This research explores the application of federated learning to enable robust AI model training across distributed healthcare data sources without compromising patient privacy. Our method demonstrates...
By: Dr. Maria S. Kowalski, Professor Jian Li, Dr. Fatima Zahra, Mr. Benjamin Carter, Dr. Hiroshi Sato, Ms. Chloe Dubois, Dr. Anya Singh
This paper presents a novel framework for integrating adaptive agentic AI systems into human-machine teams, focusing on dynamic task allocation, context-aware decision-making, and real-time learning. ...
By: Dr. Elena Petrova, Dr. Kenji Tanaka, Professor Marcus Chen, Dr. Anya Sharma, Mr. David Rodriguez
This paper introduces a novel robust policy learning framework enabling seamless and safe human-robot collaboration in complex, unstructured industrial settings. The approach leverages advanced percep...
By: Dr. Emily White, Prof. Joon-Ho Kim, Dr. Ricardo Garcia, Dr. Anna Schmidt, Dr. Ben Carter
This research proposes a novel deep learning framework that integrates various medical imaging modalities for highly accurate and early detection of pancreatic cancer. The model significantly improves...
By: Dr. Anya Sharma, Prof. David Chen, Dr. Elena Petrova, Dr. Kenji Tanaka, Dr. Sofia Bianchi
This research presents an innovative adaptive AI tutoring system designed to personalize the learning experience, significantly boosting student engagement and improving academic outcomes. The system ...
By: Dr. Daniel Brown, Prof. Jessica Green, Dr. Hiroshi Sato, Dr. Laura Martinez, Dr. Peter Wang
We develop and evaluate context-aware large language models specifically tailored for legal applications, enabling the personalized generation and efficient review of complex legal documents. This sys...
By: Dr. Sophia Davis, Prof. Robert Miller, Dr. Chen Li, Dr. Maria Rodriguez, Dr. David Jones
SyncGait is a novel user-drone mutual authentication system that leverages implicit gait behaviors, specifically the user's unique arm swing, for robust long-distance authentication during drone deliv...
By: Zijian Ling, Man Zhou, Hongda Zhai, Yating Huang, Lingchen Zhao, Qi Li, Chao Shen, Qian Wang
Current language model evaluations measure what models know under ideal conditions but not how robustly they know it under realistic stress. We introduce the Drill-Down and Fabricate Test (DDFT), a pr...
This paper introduces Space AI as a unified interdisciplinary field at the intersection of artificial intelligence and space science and technology. It proposes a systematic framework organizing Space...
This work focuses on improving code generation from Bangla natural language prompts to Python code, utilizing iterative self-correction mechanisms and multilingual AI agents. It aims to bridge the gap...
This paper introduces SpaceTimePilot, a system for generative rendering of dynamic scenes, enabling the creation of realistic and evolving visual content across both spatial and temporal dimensions. T...
By: Zhening Huang, Hyeonho Jeong, Xuelin Chen, Yulia Gryaditskaya, Tuanfeng Y. Wang, Joan Lasenby, Chun-Hao Huang
This paper introduces an AI and Optical Character Recognition (OCR)-driven pipeline for digitizing and integrating historical documents into databases. It addresses challenges like layout variability ...
By: Zahra Abedi, Richard M.K. van Dijk, Gijs Wijnholds, Tessa Verhoef
This research focuses on developing sophisticated control policies for humanoid robots to achieve coordinated manipulation tasks. It explores how robots can make intelligent choices to perform complex...
By: Haozhi Qi, Yen-Jen Wang, Toru Lin, Brent Yi, Yi Ma, Koushil Sreenath, Jitendra Malik
This paper addresses the critical challenge of efficient and accurate data annotation for multisensor datasets, particularly for the rigorous testing of autonomous vehicles. It proposes semi-automated...
By: Andrii Gamalii, Daniel Górniak, Robert Nowak, Bartłomiej Olber, Krystian Radlak, Jakub Winter
This research investigates how iterative deployment strategies can significantly enhance the planning capabilities of Large Language Models (LLMs). The paper presents novel approaches for refining LLM...
By: Augusto B. Corrêa, Yoav Gelberg, Luckeciano C. Melo, Ilia Shumailov, André G. Pereira, Yarin Gal
This paper explores the development of context-aware AI agents based on large language models (LLMs) designed for human-centered energy management systems in smart buildings. The research aims to opti...
This theoretical paper argues for the necessity of incorporating uncertainty, incomplete preferences, and non-Archimedean utilities into AI safety frameworks. It suggests that current approaches to AI...
By: Alessio Benavoli, Alessandro Facchini, Marco Zaffalon
The paper presents Robo-Dopamine, a framework for high-precision robotic manipulation using reinforcement learning (RL). It introduces Dopamine-Reward, a novel multi-view, step-aware process reward mo...
Retrieval-Augmented Generation (RAG) systems enhance large language models by grounding responses in external knowledge bases, but conventional RAG architectures operate with static corpora that canno...
This paper introduces a scalable method to train language models as "AI co-scientists" capable of generating high-quality research plans across diverse scientific domains. It leverages automated extra...
By: Shashwat Goel, Rishi Hazra, Dulhan Jayalath, Timon Willi, Parag Jain, William F. Shen, Ilias Leontiadis, Francesco Barbieri, Yoram Bachrach, Jonas Geiping, Chenxi Whitehouse
This paper introduces MAI-UI, a family of foundation GUI agents designed for real-world deployment. It integrates agent-user interaction, external tool use via MCP, and a native device-cloud collabora...
This paper presents HY-Motion 1.0, a series of state-of-the-art, large-scale motion generation models that produce 3D human motions from text descriptions. It is the first to scale Diffusion Transform...
This survey unifies insights from cognitive neuroscience with Large Language Model (LLM)-driven agents, offering a comprehensive review of memory systems. It establishes a unified framework detailing ...
This paper presents a novel approach utilizing Information-Driven Large Language Model (LLM) Graph Reasoning to predict venture capital investment success. By analyzing complex relationships in financ...
By: Haoyu Pei, Zhongyang Liu, Xiangyi Xiao, Xiaocong Du, Haipeng Zhang, Kunpeng Zhang, Suting Hong
This paper introduces Web World Models, a new approach to building AI agents that can understand and interact with the internet more effectively. It aims to create AI that can navigate, process inform...
This research explores a novel method for causal discovery in federated learning settings, especially when interventions are unknown. It focuses on how to identify causal relationships across distribu...
This paper presents the application of Physics-Informed Neural Networks (PINNs) for modeling semiconductor devices and electronic circuits, using NeuroSPICE as a case study. This approach integrates p...
This paper addresses energy consumption in AI systems orchestrated by Large Language Models (LLMs) by proposing an energy-aware, data-driven model selection strategy. This research is critical for dev...
By: Daria Smirnova, Hamid Nasiri, Marta Adamska, Zhengxin Yu, Peter Garraghan
This preprint investigates the ability of Large Language Models (LLMs) to engage in both divergent (idea generation) and convergent (problem formulation) thinking for creative problem generation. It e...
This paper introduces RL-Struct, a lightweight reinforcement learning framework designed to improve the reliability of structured output generated by large language models. By ensuring more consistent...
This paper explores the integration of knowledge graphs with large language models to enhance the accuracy and interpretability of disease prediction. By leveraging structured medical knowledge, the p...
By: Ruiyu Wang, Tuan Vinh, Ran Xu, Yuyin Zhou, Jiaying Lu, Carl Yang, Francisco Pasquel
This paper introduces A2P-Vis, an agentic pipeline designed to automate the generation of visual insights and reports. It aims to streamline data visualization and communication, providing an efficien...
By: Shuyu Gan, Renxiang Wang, James Mooney, Dongyeop Kang
Neural network pruning is widely used to reduce model size and computational cost. However, most existing methods treat sparsity as an extrinsic constraint enforced via heuristic importance scores or ...
Spatial transcriptomics experiments are rapidly expanding in scale and complexity, making computational analysis a major bottleneck in biological discovery. While frontier AI agents have shown signifi...
By: Kenny Workman, Zhen Yang, Harihara Muralidharan, Hannah Le
Performance optimization is a critical yet challenging aspect of software development, often requiring a deep understanding of system behavior, algorithmic tradeoffs, and careful code modifications. A...
By: Huiyun Peng, Antonio Zhong, Ricardo Andrés Calvo Méndez, Kelechi G. Kalu, James C. Davis
The House-Tree-Person (HTP) drawing test, introduced by John Buck in 1948, remains a widely used projective technique in clinical psychology. However, it has long faced challenges such as heterogeneou...
Segment Anything Model 2 (SAM2), a vision foundation model, has significantly advanced prompt-driven video object segmentation, yet its practical deployment remains limited by the high computation...
By: Avilasha Mandal, Chaoning Zhang, Fachrina Dewi Puspitasari, Xudong Wang, Jiaquan Zhang, Caiyan Qin, Guoqing Wang, Yang Yang, Heng Tao Shen
This paper introduces ScoutGPT, a GPT-based framework designed to analyze team action sequences and quantify individual player impact in sports. By leveraging advanced language model capabilities, Sco...
By: Miru Hong, Minho Lee, Geonhee Jo, Jae-Hee So, Pascal Bauer, Sang-Ki Ko
This paper introduces MegaRAG, a novel framework for Retrieval Augmented Generation that leverages both multimodal data and knowledge graphs. It aims to enhance the accuracy and relevance of generated...
In real-world clinical practice, electrocardiograms (ECGs) are often captured and shared as photographs. However, publicly available ECG data, and thus most related research, relies on digital signals...
By: Xiaoyu Wang, Ramesh Nadarajah, Zhiqiang Zhang, David Wong
Document forgery poses a growing threat to legal, economic, and governmental processes, requiring increasingly sophisticated verification mechanisms. Recent advances in code generation with large lang...
By: Valentin Schmidberger, Manuel Eberhardinger, Setareh Maghsudi, Johannes Maucher
This paper investigates the underlying geometric principles behind AI hallucinations, particularly in Large Language Models. By analyzing 'angles,' it seeks to provide a predictable and computationall...
This paper introduces MiST, a framework for understanding the impact of mid-stage scientific training on the development of chemical reasoning models. By improving these models, it has significant rea...
By: Andres M Bran, Tong Xie, Shai Pranesh, Jeffrey Meng, Xuan Vu Nguyen, Jeremy Goumaz, David Ming Segura, Ruizhi Xu, Dongzhan Zhou, Wenjie Zhang, Bram Hoex, Philippe Schwaller
This technical report introduces C2LLM, a novel approach to code retrieval that utilizes adaptive cross-attention pooling. This innovation has direct real-world applications in software development, e...
By: Jin Qin, Zihan Liao, Ziyin Zhang, Hang Yu, Peng Di, Rui Wang
Ensuring the safety of embodied AI agents in complex, unstructured environments is a critical challenge. This paper introduces RoboSafe, a novel framework that integrates executable safety logic direc...
This study introduces a quantum-inspired framework for optimizing the exploration-exploitation tradeoff in multi-agent reinforcement learning (MARL), specifically applied to UAV-assisted 6G network de...
Small Language Models (SLMs) struggle with complex document understanding due to limited parameters. SMART SLM, a novel Structured Memory and Reasoning Transformer, enhances SLMs for accurate document...
Masked Diffusion Models (MDMs) offer flexible non-autoregressive generation, but their output quality is highly sensitive to the decoding order. This paper formalizes this issue by attributing variabi...
Smart home lighting systems consume 15-20% of residential energy but often lack adaptive intelligence. BitRL-Light is a novel framework that combines 1-bit quantized Large Language Models (LLMs) with ...
Explainable Artificial Intelligence (XAI) is vital for trust and transparency in AI systems, especially in high-stakes applications. This study introduces an Agentic XAI approach that utilizes the ite...
Large Language Models (LLMs) show promise for medication safety in healthcare. This paper presents a real-world evaluation of an LLM-powered system for medication safety reviews in NHS Primary Care, i...
By: Oliver Normand, Esther Borsi, Mitch Fruin, Lauren E Walker, Jamie Heagerty, Chris C. Holmes, Anthony J Avery, Iain E Buchan, Harry Coppock
Developing emotionally intelligent embodied AI that can generate empathic responses in various situations is a significant challenge for human-robot interaction. This paper explores "Closed-Loop Embod...
By: Jiawen Wang, Jingjing Wang, Tianyang Chen, Min Zhang, Guodong Zhou
Accurate depth estimation is fundamental for many computer vision tasks, including 3D reconstruction, robotics, and augmented reality. This paper introduces "Re-Depth Anything," a novel method for tes...
Vision-Language Models (VLMs) have shown remarkable progress, but their ability to reason about the physical world, crucial for real-world applications like robotics, remains underexplored. This paper...
By: Li Puyin, Tiange Xiang, Ella Mao, Shirley Wei, Xinye Chen, Adnan Masood, Li Fei-fei, Ehsan Adeli
Automating clinical risk score calculations can significantly reduce physician administrative burden and improve patient care. Current benchmarks like MedCalc-Bench, constructed using LLM-based extrac...
By: Junze Ye, Daniel Tawfik, Alex J. Goodell, Nikhil V. Kotha, Mark K. Buyyounouski, Mohsen Bayati
Policy gradient methods are a cornerstone of reinforcement learning (RL), enabling agents to learn optimal behaviors in complex environments. This paper investigates advances in policy gradient method...
Current Explainable AI (XAI) approaches face a "Scalability-Stability Dilemma": post-hoc methods (e.g., LIME, SHAP) scale easily but are unstable, while supervised frameworks (e.g., TED) offer stabili...
By: Lawrence Krukrubo, Julius Odede, Olawande Olusegun
We introduce V-Agent, a novel multi-agent platform designed for advanced video search and interactive user-system conversations. By fine-tuning a vision-language model (VLM) with a small video prefere...
By: SunYoung Park, Jong-Hyeon Lee, Youngjune Kim, Daegyu Sung, Younghyun Yu, Young-rok Cha, Jeongho Ju
This research focuses on developing explainable conversational AI systems that leverage large language models for early disease diagnosis. It addresses the critical need for transparency and interpret...
This paper explores the effects of humanlike AI design on anthropomorphism, engagement, and trust across different global contexts. The findings reveal that while humanlike AI generally increases anth...
By: Robin Schimmelpfennig, Mark Díaz, Vinodkumar Prabhakaran, Aida Davani
This paper introduces a model-free reinforcement learning approach that incorporates timed reward machines to handle temporal properties in complex environments. By explicitly integrating timing const...
By: Anirban Majumdar, Ritam Raha, Rajarshi Roy, David Parker, Marta Kwiatkowska
This paper studies the use of Conflict-Driven Clause Learning (CDCL) with VSIDS heuristics as a computational engine for discrete facility layout problems. The facility layout problem is modeled as a ...
We propose a new architectural paradigm for multimodal foundation models designed specifically for clinical diagnostic support. The model integrates diverse data types, including medical images, elect...
By: Dr. Kenji Tanaka, Dr. Maria Rodriguez, Prof. Li Wei, Dr. Samuel Green, Dr. Isabella Rossi, Prof. Ahmed Khan
This paper presents a novel framework for automatically constructing large-scale knowledge graphs from unstructured, noisy text data by leveraging the advanced capabilities of large language models. I...
By: Dr. Anya Petrova, Prof. Serhii Kovalenko, Dr. Elena Vasylenko, Dmytro Kuzmenko, Olena Mykhailiuk
This paper addresses the challenge of teaching robots complex manipulation tasks using imperfect human demonstrations. We propose a novel human-in-the-loop framework that allows the robot to query a h...
By: Dr. Sarah Johnson, Prof. Mark Thompson, Dr. Anna Kaczmarek, Giovanni Russo, Dr. Elena Popova
This research explores the application of federated reinforcement learning to optimize traffic flow in urban environments without centralizing sensitive traffic data. Our proposed framework enables in...
By: Dr. Chen Wang, Dr. Emily Davis, Prof. Marco Bianchi, Dr. Javier Perez, Sophie Dubois
We introduce a new method to enhance the adversarial robustness of large-scale foundation models using a self-supervised approach to generate diverse and challenging perturbations. This technique sign...
By: Dr. Michael Brown, Dr. Jessica Lee, Prof. Benjamin Clark, Dr. Sofia Hernandez, Oliver Wilson, Dr. Grace Taylor, Prof. Kevin Moore
This paper establishes a benchmark for evaluating causal versus correlational AI approaches in predictive maintenance. By providing a clear framework for comparison, this work helps industries impleme...
By: Krishna Taduri, Shaunak Dhande, Giacinto Paolo (GP) Saggese, Paul Smith
CodeDistiller proposes a method for automatically generating code libraries, specifically tailored for scientific coding agents. This research has profound implications for accelerating scientific dis...
By: Peter Jansen, Samiah Hassan, Pragnya Narasimha
Optimizing CUDA kernels is complex and labor-intensive. This paper introduces cuPilot, a multi-agent framework that uses strategy as an intermediate semantic representation for kernel evolution, addre...
By: Jinwu Chen, Qidie Wu, Bin Li, Lin Ma, Xin Si, Yang Hu, Shouyi Yin, Jun Yang
This article presents Value Lens, a text-based model designed to detect human values using generative artificial intelligence, specifically Large Language Models (LLMs). The proposed model operates in...
This paper introduces TimeSeries2Report (TS2R), a prompting framework that converts raw lithium-ion battery operational time-series into structured, semantically enriched reports. This enables large l...
By: Jiayang Yang, Chunhui Zhao, Martin Guay, Zhixing Cao
Virtual testing with synthetic data is crucial for autonomous vehicle safety, but pixel-level fidelity doesn't guarantee real-world transfer. This paper introduces Decisive Feature Fidelity (DFF), an ...
Deploying local large language models and vision-language models on edge devices requires balancing accuracy with constrained computational and energy budgets. This paper systematically benchmarks LLM...
By: Ander Alvarez, Alessandro Genuardi, Nilotpal Sinha, Antonio Tiene, Samuel Mugel, Román Orús
This paper introduces AdaGradSelect, a novel fine-tuning method for Large Language Models (LLMs) that offers significant computational efficiency and memory optimization. It trains about 12% faster an...
We present Anubuddhi, a multi-agent AI system that designs and simulates quantum optics experiments from natural language prompts without requiring specialized programming knowledge. The system compos...
This paper proposes AI Epidemiology, a framework for governing and explaining advanced AI systems by applying population-level surveillance methods to AI outputs. It aims to bypass the complexity of c...
By: Zohra Hadjam, John Mellor, Ilaria Tiddi, Adrian R. Taylor
This paper introduces the Social Responsibility Stack (SRS), a control-theoretic architecture designed to govern socio-technical AI systems responsibly. The SRS provides a modular framework for integr...
We present TOGGLE, a novel framework for compressing Large Language Models (LLMs) specifically designed for efficient deployment on edge devices. TOGGLE leverages temporal logic to guide the compressi...
We introduce the concept of Distributional AGI Safety, a framework for analyzing and ensuring the safety of Artificial General Intelligence (AGI) systems across diverse operational contexts and potent...
By: Nenad Tomašev, Matija Franklin, Julian Jacobs, Sébastien Krier, Simon Osindero
Large language models (LLMs) with explicit reasoning capabilities excel at mathematical reasoning yet still commit process errors, such as incorrect calculations, brittle logic, and superficially plau...
By: Qihao Liu, Luoxin Ye, Wufei Ma, Yu-Cheng Chou, Alan Yuille
This paper explores AI-mediated social interaction from a multi-scale perspective, analyzing its impact at individual, group, and societal levels. We examine how AI agents and systems influence human ...
By: Junzhe Zhang
CitySeeker investigates how Vision-Language Models (VLMs) can effectively perform embodied urban navigation while implicitly understanding and addressing human needs. We propose a framework that integ...
By: Siqi Wang, Chao Liang, Yunfan Gao, Erxin Yu, Sen Li, Yushi Li, Jing Li, Haofen Wang
TimeLens proposes a novel method for video temporal grounding by leveraging multimodal Large Language Models (LLMs). This research enhances the ability of AI to understand and locate specific events w...
By: Jun Zhang, Teng Wang, Yuying Ge, Yixiao Ge, Xinhao Li, Ying Shan, Limin Wang
This paper introduces Predictive Concept Decoders (PCDs), a novel framework for training scalable end-to-end interpretability assistants. PCDs aim to provide human-understandable explanations for AI m...
By: Vincent Huang, Dami Choi, Daniel D. Johnson, Sarah Schwettmann, Jacob Steinhardt
This paper proposes "Stepwise Think-Critique," a unified framework designed to improve the robustness and interpretability of Large Language Model (LLM) reasoning. By incorporating iterative thinking ...
This paper investigates the development of human-centered AI systems for financial decision support, emphasizing explainability and trust. It presents approaches to design AI tools that provide clear ...
By: Sophia Chen, Robert Davis, Laura Evans, Michael Foster
This research focuses on strengthening AI ethics and governance frameworks by integrating Explainable AI (XAI) and causal inference techniques. It proposes methods to make AI decisions more transparen...
By: Emily Brown, Frank Green, Grace White, Henry Black
This paper presents a decision-theoretic approach to manage misalignment in AI systems, a critical challenge for safe and ethical AI deployment. It provides a formal framework to reason about and miti...
By: Daniel A. Herrmann, Abinav Chari, Isabelle Qian, Sree Sharvesh, B. A. Levinstein
This research explores how maintaining epistemic diversity across multiple language models can prevent "knowledge collapse," a reduction to dominant ideas. This is vital for building robust, reliable,...
This paper introduces Context-Picker, an approach that uses multi-stage reinforcement learning for dynamic context selection. This is highly relevant for AI systems that need to efficiently process an...
This work introduces MMGR, a framework for Multi-Modal Generative Reasoning, exploring the integration of various data modalities for enhanced AI understanding and generation, with applications in com...
This research introduces a dynamic learning rate scheduling method based on loss changes, aiming to achieve faster convergence in machine learning models, offering practical benefits for optimizing tr...
This paper interprets self-attention and residual streams in transformers through a Vector Symbolic Architecture (VSA) lens, proposing 'attention as binding' to develop a unified perspective on transf...
This paper introduces a universal reasoning model, aiming to develop a foundational AI system capable of diverse and general intelligence, potentially leading to more robust and adaptable AI applicati...
By: Zitian Gao, Lynx Chen, Yihao Xiao, He Xing, Ran Tao, Haoming Luo, Joey Zhou, Bryan Dai
This research focuses on enhancing the reliability of Large Language Model (LLM) agents by introducing a model-first reasoning approach, which explicitly models problems to reduce hallucinations and i...
This paper analyzes the design of a telehealth application for palliative care, integrating quality, human values, and real-world considerations to improve accessibility and continuity of care in digi...
By: Wei Zhou, Rashina Hoda, Andy Li, Chris Bain, Laura Bird, Emmy Trinh, Peter Poon, Teresa O Brien, Mahima Kalla, Olivia Metcalf, Wendy Chapman, Joycelyn Ling, Sam Georgy, David Bevan
This paper presents WorldPlay, a streaming video diffusion model that enables real-time, interactive world modeling with long-term geometric consistency. It resolves the trade-off between speed and me...
This paper introduces PortAgent, an LLM-driven vehicle dispatching agent designed to fully automate the Vehicle Dispatching System (VDS) transferring workflow in Automated Container Terminals (ACTs). ...
By: Jia Hu, Junqi Li, Weimeng Lin, Peng Jia, Yuxiong Ji, Jintao Lai
This paper proposes an intelligent, interactive workflow powered by Large Language Models (LLMs) to address the steep learning curve and complex manual operations in traditional seismic wave simulatio...
Seedance 1.5 pro is a foundational model for native, joint audio-visual generation, leveraging a dual-branch Diffusion Transformer architecture and a specialized multi-stage data pipeline. It achieves...
This paper proposes Nemotron-Cascade, a framework for developing general-purpose reasoning models using cascaded domain-wise reinforcement learning (Cascade RL). It addresses heterogeneity in RL infra...
This paper introduces SMMT, a sparse multi-modal transformer architecture, to address the high computational and energy costs of dense self-attention in intelligent systems. SMMT incorporates cluster-...
This paper presents a secure, modular framework that leverages locally deployed large language models (LLMs) to automate structured feature extraction from unstructured electronic health record (EHR) ...
By: Mitchell A. Klusty, Elizabeth C. Solie, Caroline N. Leach, W. Vaiden Logan, Lynnet E. Richey, John C. Gensel, David P. Szczykutowicz, Bryan C. McLellan, Emily B. Collier, Samuel E. Armstrong, V. K. Cody Bumgardner
This paper introduces an AI-based annotation pipeline designed to systematically identify, label, and fix instability patterns in Large Language Model (LLM) output. This human-AI synergy method combin...
This paper introduces LongVie 2, a multimodal controllable ultra-long video world model. It focuses on generating and understanding extended video sequences with high fidelity and controllability. Thi...
Large language models (LLMs) are often opaque, making principled governance of their internal memory and "self-like" behavior difficult. This paper develops an engineering-oriented, clause-based archi...
Investigates whether Large Language Models exhibit envy-like preferences in multi-agent environments, providing insights into their social intelligence and decision-making biases. Understanding these compl...
Introduces MedCEG, a novel framework using critical evidence graphs to enhance the verifiability and reliability of AI-driven medical reasoning, crucial for clinical decision support. This work signif...
Presents MAC, a multi-agent framework designed to enhance conversational AI by enabling interactive clarification with users in multi-turn dialogues, improving understanding and task completion. This ...
By: Emre Can Acikgoz, Jinoh Oh, Joo Hyuk Jeon, Jie Hao, Heng Ji, Dilek Hakkani-Tür, Gokhan Tur, Xiang Li, Chengyuan Ma, Xing Fan
Evaluates the robustness of CNNs for diagnosing diseases in mango leaves, highlighting practical applications of AI in agriculture for crop health monitoring. This research directly contributes to sus...
By: Gabriel Vitorino de Andrade, Saulo Roberto dos Santos, Itallo Patrick Castro Alves da Silva, Emanuel Adler Medeiros Pereira, Erick de Andrade Barboza
Proposes a new approach combining differentiable programming with evolutionary strategies for reinforcement learning, aiming to improve learning efficiency and adaptability in complex environments. Th...
Explores methods for defending hierarchical models that represent precedential constraints, relevant for robust legal reasoning and AI systems in jurisprudence. This research offers valuable insights ...
Analyzes the role of large language models in combinatorial optimization, covering their ability to extract features and aid in selecting optimal algorithms for complex problems. This research is high...
By: Francesca Da Ros, Luca Di Gaspero, Kevin Roitero
This paper introduces Dora, a framework for Quality of Experience (QoE) aware hybrid parallelism in distributed edge AI training and inference. It addresses the challenge of optimizing heterogeneous c...
By: Jianli Jin, Ziyang Lin, Qianli Dong, Yi Chen, Jayanth Srinivasa, Myungjin Lee, Zhaowei Tan, Fan Lai
Medical image segmentation plays a crucial role in various clinical applications, including diagnosis, treatment planning, and surgical guidance. However, the inherent variability in medical images, c...
By: Jianpeng Zhang, Yizhe Zhang, Bo Liu, Zhihui Wang, Danny Chen
Embodied AI, which aims to develop intelligent agents capable of perceiving, acting, and reasoning in physical or simulated environments, represents a grand challenge in artificial intelligence. The e...
Imitation Learning (IL) has emerged as a promising paradigm for training robotic policies from expert demonstrations. A significant challenge in real-world robotics, however, is the robustness gap bet...
Zero-shot image segmentation, the task of segmenting unseen object categories without requiring any labeled examples, is a challenging but highly desirable capability for many real-world computer visi...
Recommender systems are ubiquitous in modern digital platforms, guiding users to relevant items from vast catalogs. A significant challenge arises in few-shot recommendation scenarios, where new items...
By: Yichao Lv, Fan Yang, Yiqi Wang, Xiangyu Zhao, Guohua Li
Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, revolutionizing natural language processing and extending their influence to scientific research. This su...
By: Jiachen Li, Yujing Jiang, Zhiyuan Liu, Jie Tang
Large Language Models (LLMs) have sparked considerable excitement across various sectors, with education being a particularly prominent area of discussion. Proponents suggest that LLMs could revolutio...
This paper explores design goals for Large Language Model (LLM)-assisted literature reviews, aiming to shift the process from a verification burden to a trusted collaboration. It addresses the practic...
By: Brenda Nogueira, Werner Geyer, Andrew Anderson, Toby Jia-Jun Li, Dongwhi Kim, Nuno Moniz, Nitesh V. Chawla
This paper introduces Dora, a framework for optimizing distributed edge AI training and inference with Quality of Experience (QoE) awareness. It focuses on hybrid parallelism, managing heterogeneous c...
This research evaluates TxAgent's therapeutic agentic reasoning within the NeurIPS CURE-Bench Competition, focusing on AI's ability to assist in clinical decision-making and therapeutic strategies. It...
By: Tim Cofala, Christian Kalfar, Jingge Xiao, Johanna Schrader, Michelle Tang, Wolfgang Nejdl
This study investigates the application of Large Language Models (LLMs) to analyze unstructured clinical narratives for identifying Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) and...
This paper advocates for dynamic and inclusive benchmarking to ensure AI evaluation keeps pace with its evolution, supporting responsible, reproducible, and accessible AI deployment. It aims to improv...
By: Gregor von Laszewski, Wesley Brewer, Jeyan Thiyagalingam, Juri Papay, Armstrong Foundjem, Piotr Luszczek, Murali Emani, Shirley V. Moore, Vijay Janapa Reddi, Matthew D. Sinclair, Sebastian Lobentanzer, Sujata Goswami, Benjamin Hawks, Marco Colombo, Nhan Tran, Christine R. Kirkpatrick, Abdulkareem Alsudais, Gregg Barrett, Tianhao Li, Kirsten Morehouse, Shivaram Venkataraman, Rutwik Jain, Kartik Mathur, Victor Lu, Tejinder Singh, Khojasteh Z. Mirza, Kongtao Chen, Sasidhar Kunapuli, Gavin Farrell, Renato Umeton, Geoffrey C. Fox
This paper introduces CORL, a method for reinforcement learning of policies that solve Mixed-Integer Linear Programs (MILPs) using branch and bound algorithms. It addresses the challenges of suboptima...
By: Akhil S Anand, Elias Aarekol, Martin Mziray Dalseg, Magnus Stalhane, Sebastien Gros
This research utilizes reinforcement learning to investigate the role of feedback in skill acquisition in a physical system. It demonstrates that learning a high-performance skill may require richer i...
This paper introduces the Prismatic World Model (PRISM-WM), a structured architecture to decompose complex hybrid dynamics into composable primitives for robust planning in robotic domains. By accurat...
This research investigates the use of Large Language Models (LLMs), specifically Llama-3.1 8B, for automated source code vulnerability detection (CVD). It explores various fine-tuning and prompt engin...
Auto-BenchmarkCard is a workflow designed to generate validated descriptions of AI benchmarks. It addresses the common issues of incomplete or inconsistent benchmark documentation by combining multi-a...
By: Aris Hofmann, Inge Vejsbjerg, Jiatong Shi, Junwon Lee
This paper presents an algorithm for computing evolutionarily stable strategies (ESSs) in symmetric perfect-recall extensive-form games of imperfect information. The algorithm, applicable to two-playe...
Autoregressive decoding in Large Language Models (LLMs) is inherently sequential, creating a latency bottleneck that scales linearly with output length. While "Decomposition-and-Fill" methods like S...
Circuit discovery aims to identify minimal subnetworks that are responsible for specific behaviors in large language models (LLMs). Existing approaches primarily rely on iterative edge pruning, which ...
By: Muhammad Umair Haider, Hammad Rizwan, Hassan Sajjad, A.B. Siddique
Sensor-based human activity recognition (HAR) mines activity patterns from time-series sensor data. In realistic scenarios, variations across individuals, devices, environments, and time introduc...
From content moderation to content curation, applications requiring vision classifiers for visual concepts are rapidly expanding. Existing human-in-the-loop approaches typically assume users begin wit...
We establish a precise correspondence between decision-making agents in partially observable Markov decision processes (POMDPs) and one-input process functions, the classical limit of higher-order qua...
This paper introduces Value-Guided Offline Control Barrier Functions (V-OCBF), a framework for learning neural Control Barrier Functions (CBFs) entirely from offline demonstrations. It provides rigoro...
This paper proposes ReMe, a dynamic procedural memory framework for experience-driven agent evolution. It addresses the limitations of static memory in LLM agents by introducing multi-faceted distilla...
By: Zouying Cao, Jiaji Deng, Li Yu, Weikang Zhou, Zhaoyang Liu, Bolin Ding, Hai Zhao
This paper addresses the urgent need to unify research in AI safety and ethics. While AI development rapidly scales capabilities, the work on producing harmless, "aligned" systems is equally critical....
This paper explores how large language models (LLMs) can enhance the proposal selection process at large user facilities. It offers a scalable, consistent, and cost-effective alternative to traditiona...
By: Lijie Ding, Janell Thomson, Jon Taylor, Changwoo Do
This research explores enhancing radiology report generation and visual grounding in medical imaging by applying reinforcement learning (RL) to vision-language models (VLMs). It investigates how RL, c...
By: Benjamin Gundersen, Nicolas Deperrois, Samuel Ruiperez-Campillo, Thomas M. Sutter, Julia E. Vogt, Michael Moor
This paper introduces CA-GPT, a RAG-enhanced AI-OCT system, demonstrating superior decision support for Percutaneous Coronary Intervention (PCI). It significantly outperforms general-purpose large lan...
The proposed framework advances computational methods for belief-driven discourse analysis and offers applications for stance detection, political communication studies, and content moderation policy.
ExaCraft is an AI system that generates personalized educational examples by dynamically adapting to a learner's context, including their struggles, mastery, and preferences. This promises a more effe...
This qualitative study investigates how users calibrate their trust when interacting with Large Language Models (LLMs) that exhibit hallucinations. Understanding this dynamic is crucial for developing...
This paper presents a comprehensive evaluation of AI agents against human cybersecurity professionals in live enterprise penetration testing. It highlights the capabilities of AI in discovering vulner...
By: Justin W. Lin, Eliot Krzysztof Jones, Donovan Julian Jasper, Ethan Jun-shen Ho, Anna Wu, Arnold Tianyi Yang, Neil Perry, Andy Zou, Matt Fredrikson, J. Zico Kolter, Percy Liang, Dan Boneh, Daniel E. Ho
This work presents a robust autonomous navigation system for robotic platforms operating in highly unstructured and hazardous disaster environments. Our proposed system integrates advanced sensor fusi...
By: Dr. Robert Smith, Dr. Laura Kim, Dr. Daniel Lee, Dr. Sophia Chang, Dr. William Johnson
The energy consumption of deep learning models is a growing concern. This paper presents a novel hardware accelerator for spiking neural networks, a key component of neuromorphic computing, enabling u...
By: Dr. Satoshi Tanaka, Dr. Maria Rossi, Dr. John Doe, Dr. Jane Smith, Dr. Wei Zhang
Traditional methods for identifying software vulnerabilities are often labor-intensive and prone to human error. This paper explores the effectiveness of fine-tuned large language models (LLMs) in aut...
By: Alex Johnson, Benjamin Lee, Catherine Davis, Daniel White, Elizabeth Green
Deploying powerful generative AI models on resource-constrained edge devices remains a significant challenge. This paper introduces a novel distillation-based framework that effectively compresses lar...
By: Sarah Jones, Michael Brown, Emily White, James Taylor, Olivia Davis
The discovery of new materials with desired properties is crucial for technological advancement but traditionally relies on costly and time-consuming experimental trials. We introduce an AI-driven pla...
By: Dr. Priya Sharma, Dr. Hiroshi Sato, Dr. Liam Murphy, Dr. Isabella Costa, Dr. Noah Brown, Dr. Mia Wilson, Dr. Ethan Hall
Autonomous robots operating in unstructured and dynamic environments face significant challenges due to unpredictable conditions and complex interactions. This paper proposes a novel robust reinforcem...
By: Dr. Alex Miller, Dr. Lena Becker, Prof. Robert Johnson, Dr. Sophie Dubois
The rapid advancement of Artificial Intelligence (AI) necessitates a robust legal and ethical framework to ensure its responsible development and deployment. This paper proposes a comprehensive framew...
By: Sophia Chen, David Lee, Elena Petrova, Markus Schmidt
This paper introduces OmniView, a novel diffusion model capable of generating high-quality 3D and 4D view syntheses from limited input. By leveraging advanced architectural designs and training strate...
Federated learning (FL) offers a promising paradigm for privacy-preserving machine learning by enabling collaborative model training without centralizing raw data. This paper introduces an adaptive cl...
By: Jia Li, Kevin Zhang, Maria Garcia, Ahmed Hassan, Oliver Brown
This paper addresses the critical issue of Multimodal Large Language Models (MLLMs) producing inconsistent or different answers when presented with the same information through various input modalitie...
By: Angela van Sprang, Laurens Samson, Ana Lucic, Erman Acar, Sennay Ghebreab, Yuki M. Asano
This paper introduces EcomBench, a benchmark designed for the holistic evaluation of foundation agents in e-commerce, addressing the need for comprehensive assessment of AI's performance in this criti...
By: Rui Min, Zile Qiao, Ze Xu, Jiawen Zhai, Wenyu Gao, Xuanzhong Chen, Haozhen Sun, Zhen Zhang, Xinyu Wang, Hong Zhou, Wenbiao Yin, Xuan Zhou, Yong Jiang, Haicheng Liu, Liang Ding, Ling Zou, Yi R. (May) Fung, Yalong Li, Pengjun Xie
DAComp provides a comprehensive, research-grade benchmark for evaluating data agents across the entire data intelligence lifecycle, encompassing data engineering and open-ended data analysis, which is...
By: Fangyu Lei, Jinxiang Meng, Yiming Huang, Junjie Zhao, Yitong Zhang, Jianwen Luo, Xin Zou, Ruiyi Yang, Wenbo Shi, Yan Gao, Shizhu He, Zuo Wang, Qian Liu, Yang Wang, Ke Wang, Jun Zhao, Kang Liu
This research presents CARLoS, a method for efficient retrieval utilizing Concise Assessment Representation of LoRAs (Low-Rank Adaptations) at scale, offering significant potential for optimizing the ...
By: Shahar Sarfaty, Adi Haviv, Uri Hacohen, Niva Elkin-Koren, Roi Livni, Amit H. Bermano
This paper presents ReasonBENCH, a new benchmark designed to evaluate and quantify the stability and consistency of reasoning capabilities in Large Language Models. The findings are vital for understa...
This research proposes RL-MTJail, a reinforcement learning approach for automated black-box multi-turn jailbreaking of Large Language Models. The study offers crucial insights for enhancing LLM securi...
This research introduces DEMOCRITUS, a novel system for constructing large causal models by leveraging Large Language Models to extract and structure textual knowledge across diverse domains. It pione...
This paper addresses the challenge of efficient memory utilization in Large Language Models through a novel dynamic memory management system. It aims to optimize resource allocation, reduce computatio...
This research introduces a data-driven model predictive control strategy, enhanced by Gaussian Process Regression, tailored for complex cyber-physical systems. The approach offers improved robustness ...
This paper investigates methods for auditing strategic behavior, specifically "sandbagging," in game-theoretic settings. It aims to develop robust mechanisms for detecting and preventing deceptive pla...
By: Jordan Taylor, Sid Black, Dillon Bowen, Thomas Read, Satvik Golechha, Alex Zelenka-Martin, Oliver Makins, Connor Kissane, Kola Ayonrinde, Jacob Merizian, Samuel Marks, Chris Cundy, Joseph Bloom
This study provides a comprehensive performance analysis of Data-Oriented Design (DOD) versus traditional Object-Oriented Design (OOD), focusing on cache utilization and efficiency in multi-threaded e...
By: Gabriel M. Arantes, Richard F. Pinto, Bruno L. Dalmazo, Eduardo N. Borges, Giancarlo Lucca, Viviane L. D. de Mattos, Fabian C. Cardoso, Rafael A. Berri
This paper proposes a new perspective on human-robot interaction by leveraging extended reality (XR) and virtual robots powered by large foundation models. It argues that these XR-native agents can ac...
This paper introduces SusVibes, a benchmark with 200 real-world software engineering tasks, to evaluate the safety and vulnerabilities of code generated by large language model agents in "vibe coding"...
This novel framework, IM HERE, models engagement in human-human, human-robot, and robot-robot interactions by using an effort-based description of relationships. It aims to automate the analysis and d...
This paper introduces DRIFT (Dissatisfaction-Refined Iterative preFerence Training), a novel approach for preference learning in real-world large language model deployments. It leverages abundant impl...
This research explores the application of generative pre-trained diffusion paradigms, drawing parallels with successful large language models and vision models, for zero-shot time series forecasting. ...
This paper introduces a new benchmark and baseline to develop robust Vision-Language Models (VLMs) specifically for autonomous driving, addressing critical safety and performance challenges in real-wo...
Introduces conversational LLMs to streamline the documentation of business processes for Small and Medium-sized Enterprises (SMEs), transforming tacit knowledge into formal BPMN diagrams to enhance op...
This project develops an AI system offering an end-to-end solution for aiding doctors with diagnosis and treatment planning for Glioblastoma Multiforme (GBM), one of the deadliest human cancers. It uses multi...
Incomplete data is a pervasive challenge in real-world applications. This paper introduces Impugan, a conditional Generative Adversarial Network (cGAN) designed for robustly imputing missing values an...
This study addresses the crucial problem of hallucinations in Multimodal Large Language Models (MLLMs), which generate factually inconsistent descriptions despite coherent linguistic output. HalluShif...
By: Sujoy Nath, Arkaprabha Basu, Sharanya Dasgupta, Swagatam Das
Conformal prediction is a framework for quantifying uncertainty in machine learning predictions, crucial for reliable real-world applications. This paper introduces an online conformal prediction meth...
By: Dongjian Hu, Junxi Wu, Shu-Tao Xia, Changliang Zou
Public-use microdata samples often risk re-identification, especially for firm-level data where anonymity is difficult. This paper describes a machine learning model to construct synthetic public-use ...
By: Jorge Cisneros Paz, Timothy Wojan, Matthew Williams, Jennifer Ozawa, Robert Chew, Kimberly Janda, Timothy Navarro, Michael Floyd, Christine Task, Damon Streat
Heart failure (HF) is a leading cause of rehospitalization. This paper proposes ClinNoteAgents, an LLM multi-agent system to predict and interpret heart failure 30-day readmission from clinical notes,...
By: Rongjia Zhou, Chengzhuo Li, Carl Yang, Jiaying Lu
Researchers at Physical Intelligence developed a method for real-time robot control that shifts action chunk conditioning from inference-time to training-time, achieving lower latency and improved rob...
Researchers from Shenzhen Sunline Tech Co., Ltd. addressed the LLM repetition problem in production financial batch code interpretation by evaluating multiple solutions. Their study found that Beam Se...
Huawei Inc. researchers developed EMMA, a unified multimodal architecture for understanding, generation, and editing, utilizing 32x visual token compression and channel-wise feature fusion to enhance ...
Researchers from Google, NYU, ETH Zurich, and Stanford present a theoretical framework to formalize how large language models perform complex, iterative reasoning. The framework characterizes reasonin...
By: David Lee, Maria Garcia, Alexandre Dubois, Sophia Müller
Researchers from Zhejiang University and ByteDance introduced CodeVision, a 'code-as-tool' framework that equips Multimodal Large Language Models (MLLMs) to programmatically interact with images. The ...
This research empirically validates that deep neural networks consistently converge to shared, low-dimensional parametric subspaces, leading to substantial memory efficiency and parameter-efficient ad...
This paper systematically quantifies errors in published AI papers using large language model analysis, providing valuable insights for improving the reliability and integrity of AI research.
By: Federico Bianchi, Yongchan Kwon, Zachary Izzo, Linjun Zhang, James Zou
TRACE provides a framework to analyze and improve the stepwise reasoning capabilities of Vision-Language Models, crucial for developing more interpretable and robust multimodal AI systems.
SIMA 2 is a generalist embodied AI agent developed by Google DeepMind that can understand and act in diverse 3D virtual worlds, significantly improving task success rates and demonstrating autonomous ...
This paper presents a large-scale empirical analysis of real-life code generated by ChatGPT, evaluating its correctness and security, and highlighting users' lack of security awareness for LLM-generat...
This paper introduces an agentic AI pipeline that autonomously clusters prediction markets and identifies relationships between them, achieving high accuracy and profitable trading strategies.
This paper presents a model-based framework combining Bayesian optimization with Monte Carlo Tree Search to achieve new state-of-the-art upper bounds in sphere packing, demonstrating AI's ability to a...
By: Rasul Tutunov, Alexandre Maraval, Antoine Grosnit, Xihan Li, Jun Wang, Haitham Bou-Ammar
This study investigates human perception and evaluation of AI-generated responses modified by a mitigator model to reduce harm, focusing on mitigation performance, transparency, and metrics to bridge ...
By: Heloisa Candello, Muneeza Azmat, Uma Sushmitha Gunturi, Raya Horesh, Rogerio Abreu de Paula, Heloisa Pimentel, Marcelo Carpinette Grave, Aminat Adebiyi, Tiago Machado, Maysa Malfiza Garcia de Macedo
This paper presents fMRI2GES, a novel AI system that reconstructs co-speech gestures from fMRI signals using dual brain decoding alignment, showing potential for brain-computer interfaces.
The AI Consumer Index (ACE) is introduced as a comprehensive benchmark to evaluate the gap between advanced AI models and the practical needs of consumers, revealing significant limitations in current...
This paper introduces a novel method to model and estimate the energy consumption of different execution configurations in data-sharing pipelines, also identifying reuse potential to reduce energy in ...
DeepSeek-V3.2 introduces DeepSeek Sparse Attention and a scalable reinforcement learning framework, achieving superior reasoning and agent performance comparable to top proprietary models, and excelli...
Deep Forcing is a training-free method that enhances real-time long video generation by addressing temporal repetition and motion issues through Deep Sink and Participative Compression, yielding high-...
By: Wooseok Jang, Paul Hyunbin Cho, Jisu Nam, Heeji Yoon, Seungryong Kim
In 2024, France was shaken by the far-right National Rally's victory in the European elections. In response to this unprecedented result, French President Emmanuel Macron dissolved the National Assemb...
By: Caroline Violot, Vera Sosnovik, Mathias Humbert
This paper investigates the electron-phonon contribution to total energy, an often-approximated factor in first-principles calculations. It clarifies the nature of this contribution and demonstrates i...
By: Samuel Poncé, Xavier Gonze
This paper computes valley splittings in Si/SiGe superlattices using ab initio density functional theory (DFT), which provides an excellent description of interfaces, strains, and atomistic disorder. ...
By: Lukas Cvitkovich, Tancredi Salamone, Christoph Wilhelmer, Biel Martinez, Tibor Grasser, Yann-Michel Niquet
This paper investigates the dissipative Yao-Lee Spin-Orbital Model. It focuses on the exact solvability of this model and the conditions under which its $\mathcal{PT}$ symmetry breaks.
By: Zihao Qi, Yuan Xue
This study validates computational tools for simulating tokamak environments, which is crucial for the safe and efficient production of medical isotopes.
By: Christopher Ehrich, Christian Bachmann, Pavel Pereslavtsev, Christian Reiter
This work presents a novel operator for 3D phase field modeling that ensures consistency across physical, energetic, and numerical aspects, enabling more accurate simulations of material phenomena.
This paper explores stochastic density functional theory using the multilevel Monte Carlo method, offering a promising approach to enhance the efficiency and accuracy of quantum mechanical simulations...
By: Xue Quan, Huajie Chen
This paper introduces a portable and efficient framework for Lattice Boltzmann Method and Discrete Element Method simulations on GPUs, accelerating complex multi-physics problems with potential for in...
By: Raphael Maggio-Aprile, Maxime Rambosson, Christophe Coreixas, Jonas Latt
This research proposes an energy-efficient design leveraging engineered magnetic microstructures to emulate biological neuron functions, promising advancements in spintronic neuromorphic architectures...
This paper introduces a novel generative AI system that creates dynamic, multimodal content (textures, objects, soundscapes) in real-time, enabling unprecedented levels of immersion and interactivity ...
This paper presents a novel multi-agent reinforcement learning framework that significantly improves the efficiency and stability of decentralized energy grid management by optimizing renewable energy...
We propose an innovative federated learning architecture that not only ensures robust privacy for patient data but also provides interpretable insights for medical practitioners, fostering trust in AI...
This work demonstrates an integrated system where large language models propose hypotheses and design experiments, which are then autonomously executed by robotic platforms, leading to accelerated sci...
This paper presents Semantic Soft Bootstrapping, a novel method enabling long context reasoning in Large Language Models without reliance on reinforcement learning, representing a potential breakthrou...
This paper explores the potential of multi-Large Language Model (LLM) collaboration to enhance the accuracy and utility of medication recommendation systems, offering a practical real-world applicatio...
By: Huascar Sanchez, Briland Hitaj, Jules Bergmann, Linda Briesemeister
This paper focuses on the crucial challenge of detecting perspective shifts within multi-agent AI systems, which is essential for developing more cooperative and understandable AI interactions.
This paper investigates the surprising efficacy of small models combined with agentic AI in achieving significant results within hardware design, suggesting a breakthrough in efficient AI application.
This paper critiques common patterns in machine ethics for Reinforcement Learning and advocates for a virtue-focused alternative, addressing the limitations of rule-based and single-objective reward a...
This paper explores the integration of Speech AI with Relational Graph Transformers to enable continuous neurocognitive monitoring for individuals with rare neurological diseases, offering significant...
STELLA proposes a method to guide Large Language Models for improved time series forecasting by employing semantic abstractions, potentially leading to more accurate and interpretable predictions in v...
This paper presents a framework for Executable Governance for AI, demonstrating how Large Language Models can translate policies into actionable rules, thereby bridging the gap between AI ethics and p...
This paper introduces a benchmark to evaluate the epidemiology of Large Language Models, specifically focusing on their observational distribution knowledge, which is crucial for understanding and imp...
This work introduces a dual-reasoning training framework that integrates affirmative generation with structured counterfactual denial, leading to more robust, interpretable, and human-reasoning-aligne...
GovBench introduces a benchmark for evaluating Large Language Model agents in real-world data governance workflows, which is crucial for the deployment of trustworthy AI in regulated environments.
Chameleon introduces adaptive adversarial agents to address scaling-based visual prompt injection in multimodal AI systems, enhancing the robustness and security of these complex models.