Generative AI & LLMs 2025-12-12

Speculative generation has emerged as a promising technique to accelerate inference in large language models (LLMs) by leveraging parallelism to verify multiple draft tokens simultaneously. However, t...

Sergey Pankratov, Dan Alistarh
Generative AI & LLMs 2025-12-12

The rapid deployment of Large Language Models (LLMs) has created an urgent need for enhanced security and privacy measures in Machine Learning (ML). LLMs are increasingly being used to process untrust...

Andrew Adiletta, Kathryn Adiletta, Kemal Derya et al.
Generative AI & LLMs 2025-12-12

Ensuring the safety of AI-enabled systems, particularly in high-stakes domains such as autonomous driving and healthcare, has become increasingly critical. Traditional formal verification tools fall s...

Ernesto Casablanca, Oliver Schön, Paolo Zuliani et al.
Agentic AI 2025-12-11

Fairness and action smoothness are two crucial considerations in many online optimization problems, but they have yet to be addressed simultaneously. In this paper, we study a new and challenging sett...

Pengfei Li, Yuelin Han, Adam Wierman et al.
Generative AI & LLMs 2025-12-05

Real-time chunking (RTC) enables vision-language-action models (VLAs) to generate smooth, reactive robot trajectories by asynchronously predicting action chunks and conditioning on previously committe...

Kevin Black, Allen Z. Ren, Michael Equi et al.
Generative AI & LLMs 2025-12-05

We consider the fundamental problem of balanced $k$-means clustering. In particular, we introduce an optimal transport approach to alternating minimization called BalLOT, and we show that it delivers ...

Wenyan Luo, Dustin G. Mixon
Generative AI & LLMs 2025-12-05

In the era of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) architectures are gaining significant attention for their ability to ground language generation in reliable knowledge s...

Francesco Granata, Francesco Poggi, Misael Mongiovì
Computer Vision & MultiModal AI 2025-12-05

Accurate real-time waypoint estimation for UAV-based online terrain following during wildfire patrol missions is critical to ensuring flight safety and enabling wildfire detection. However, exist...

Xiaobo Wu, Youmin Zhang
Computer Vision & MultiModal AI 2025-12-05

Weakly supervised semantic segmentation (WSSS) in histopathology reduces pixel-level labeling by learning from image-level labels, but it is hindered by inter-class homogeneity, intra-class heterogene...

Khang Le, Anh Mai Vu, Thi Kim Trang Vo et al.
Computer Vision & MultiModal AI 2025-12-05

In this work we investigate the relationship between kernel regularity and algorithmic performance in the bandit optimization of RKHS functions. While reproducing kernel Hilbert space (RKHS) methods t...

Madison Lee, Tara Javidi
AI in healthcare 2025-12-05

Users frequently edit camera images post-capture to achieve their preferred photofinishing style. While editing in the RAW domain provides greater accuracy and flexibility, most edits are performed on...

Abhijith Punnappurath, Luxi Zhao, Ke Zhao et al.
Explainable & Ethical AI 2025-12-05

We introduce a simple post-training method that makes transformer attention sparse without sacrificing performance. Applying a flexible sparsity regularisation under a constrained-loss objective, we s...

Florent Draye, Anson Lei, Ingmar Posner et al.
Agentic AI 2025-12-05

Modern extensible compiler frameworks, such as MLIR, enable rapid creation of domain-specific language dialects. This flexibility, however, makes correctness harder to ensure as the same extensibility t...

Sairam Vaidya, Marcel Böhme, Loris D'Antoni
Agentic AI 2025-12-05

Deep residual networks (ResNets) have demonstrated outstanding success in computer vision tasks, attributed to their ability to maintain gradient flow through deep architectures. Simultaneously, contr...

Marius F. R. Juston, Ramavarapu S. Sreenivas, Dustin Nottage et al.
Agentic AI 2025-12-05

Spiking neural networks (SNNs), central to computational neuroscience and neuromorphic machine learning (ML), require efficient simulation and gradient-based training. While AI accelerators offer prom...

Lennart P. L. Landsmeer, Amirreza Movahedin, Said Hamdioui et al.
AI in healthcare 2025-12-04

In context-specific applications such as robotics, telecommunications, and healthcare, artificial intelligence systems often face the challenge of limited training data. This scarcity introduces epist...

Osvaldo Simeone, Yaniv Romano
AI in healthcare 2025-12-04

While three-dimensional (3D) shape and pose estimation is a highly researched area that has yielded significant advances, the resulting methods, despite performing well for the adult population, gener...

Georgios Chatzichristodoulou, Niki Efthymiou, Panagiotis Filntisis et al.
Explainable & Ethical AI 2025-12-04

As concerns around data privacy in machine learning grow, the ability to unlearn, or remove, specific data points from trained models becomes increasingly important. While state-of-the-art unlearning ...

Anat Kleiman, Robert Fisher, Ben Deaner et al.
Explainable & Ethical AI 2025-12-04

Developers are widely using AI code-generation models, aiming to increase productivity and efficiency. However, there are also quality concerns regarding the AI-generated code. The generated code is p...

Ruofan Gao, Amjed Tahir, Peng Liang et al.
Agentic AI 2025-11-28

An associative memory (AM) enables cue-response recall, and it has recently been recognized as a key mechanism underlying modern neural architectures such as Transformers. In this work, we introduce t...

Bowen Wang, Matteo Zecchin, Osvaldo Simeone
Agentic AI 2025-11-28

Flow-based generative models have recently demonstrated strong performance, yet sampling typically relies on expensive numerical integration of ordinary differential equations (ODEs). Rectified Flow e...

Xinxi Zhang, Shiwei Tan, Quang Nguyen et al.
Agentic AI 2025-11-28

Achieving fully autonomous driving systems requires learning rational decisions in a wide span of scenarios, including safety-critical and out-of-distribution ones. However, such cases are underrepres...

Haochen Tian, Tianyu Li, Haochen Liu et al.
Generative AI & LLMs 2025-11-28

Diffusion-based image editing has made semantic-level image manipulation easy for general users, but it also enables realistic local forgeries that are hard to localize. Existing benchmarks mainly foc...

Rui Zhang, Hongxia Wang, Hangqing Liu et al.
Generative AI & LLMs 2025-11-28

Machine learning models perform well across domains such as diagnostics, weather forecasting, NLP, and autonomous driving, but their limited uncertainty handling restricts use in safety-critical setti...

Bernhard Klein, Falk Selker, Hendrik Borras et al.
AI in healthcare 2025-11-21

Face verification is a significant component of identity authentication in various applications including online banking and secure access to personal devices. The majority of the existing face image ...

Georgia Baltsou, Ioannis Sarridis, Christos Koutlis et al.
Generative AI & LLMs 2025-11-21

Neural network subgrid stress models often have a priori performance that is far better than the a posteriori performance, leading to neural network models that look very promising a priori completely...

Andy Wu, Sanjiva K. Lele
Generative AI & LLMs 2025-11-21

Foundation Models (FMs) are increasingly used in remote sensing (RS) for tasks such as environmental monitoring, disaster assessment, and land-use mapping. These models include unimodal vision encoder...

Binger Chen, Tacettin Emre Bök, Behnood Rasti et al.
Generative AI & LLMs 2025-11-21

Recent video generation approaches increasingly rely on planning intermediate control signals such as object trajectories to improve temporal coherence and motion fidelity. However, these methods most...

Yidong Huang, Zun Wang, Han Lin et al.
Computer Vision & MultiModal AI 2025-11-21

When performing robot/vehicle localization using ground penetrating radar (GPR) to handle adverse weather and environmental conditions, existing techniques often struggle to accurately estimate distan...

Huaichao Wang, Xuanxin Fan, Ji Liu et al.
Computer Vision & MultiModal AI 2025-11-21

We present a patient-centric architecture for electronic health record (EHR) sharing that separates content storage from authorization and audit. Encrypted FHIR resources are stored off-chain; a publi...

Tanzim Hossain Romel, Kawshik Kumar Paul, Tanberul Islam Ruhan et al.
Generative AI & LLMs 2025-11-21

Self-supervised learning (SSL) has recently advanced through non-contrastive methods that couple an invariance term with variance, covariance, or redundancy-reduction penalties. While such objectives ...

Benyamin Ghojogh, M. Hadi Sepanj, Paul Fieguth
Generative AI & LLMs 2025-11-21

Deep learning models are prone to learning shortcut solutions to problems using spuriously correlated yet irrelevant features of their training data. In high-risk applications such as medical image an...

Christopher Boland, Sotirios Tsaftaris, Sonia Dahdouh
Generative AI & LLMs 2025-11-21

Traditional evaluation metrics for textual and visual question answering, like ROUGE, METEOR, and Exact Match (EM), focus heavily on n-gram based lexical similarity, often missing the deeper semantic ...

Shrikant Kendre, Austin Xu, Honglu Zhou et al.
Generative AI & LLMs 2025-11-21

We present a differentiable extension of the VEROS ocean model, enabling automatic differentiation through its dynamical core. We describe the key modifications required to make the model fully compat...

Etienne Meunier, Said Ouala, Hugo Frezat et al.
AI in healthcare 2025-11-20

The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real s...

George Cazenavette, Antonio Torralba, Vincent Sitzmann
AI in healthcare 2025-11-20

Vision Language Models (VLMs) have achieved impressive performance on spatial reasoning benchmarks, yet these evaluations mask critical weaknesses in understanding object interactions. Current benchma...

Vineet Bhat, Sungsu Kim, Valts Blukis et al.
Explainable & Ethical AI 2025-11-20

We introduce the Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates (BITS for GAPS) framework to emulate latent components in hybrid physical systems. BITS for GAPS s...

Kyla D. Jones, Alexander W. Dowling
Explainable & Ethical AI 2025-11-20

This paper introduces Generative Augmented Reality (GAR) as a next-generation paradigm that reframes augmentation as a process of world re-synthesis rather than world composition by a conventional AR ...

Chen Liang, Jiawen Zheng, Yufeng Zeng et al.
Explainable & Ethical AI 2025-11-20

Controllable image generation has attracted increasing attention in recent years, enabling users to manipulate visual content such as identity and style. However, achieving simultaneous control over t...

Zhenyuan Qin, Xincheng Shuai, Henghui Ding
Explainable & Ethical AI 2025-11-14

We study randomized experiments in bipartite systems where only a subset of treatment-side units are eligible for assignment while all units continue to interact, generating interference. We formalize...

Albert Tan, Mohsen Bayati, James Nordlund et al.
Explainable & Ethical AI 2025-11-14

Agile methods are characterised by iterative and incremental processes with a strong focus on flexibility and accommodating changing requirements based on either technical, regulatory, or stakeholder ...

J. Antonio Dantas Macedo, Hugo Fernandes, J. Eduardo Ferreira Ribeiro
Generative AI & LLMs 2025-11-14

We present ModularSubsetSelection (MSS), a new algorithm for locally differentially private (LDP) frequency estimation. Given a universe of size $k$ and $n$ users, our $\varepsilon$-LDP mecha...

Héber H. Arcolezi
Generative AI & LLMs 2025-11-14

Recent text-to-image (T2I) models have made remarkable progress in generating visually realistic and semantically coherent images. However, they still suffer from randomness and inconsistency with the...

Kaishen Wang, Ruibo Chen, Tong Zheng et al.
Generative AI & LLMs 2025-11-14

We introduce proactive hearing assistants that automatically identify and separate the wearer's conversation partners, without requiring explicit prompts. Our system operates on egocentric binaural au...

Guilin Hu, Malek Itani, Tuochao Chen et al.
Agentic AI 2025-11-14

Comprehending long visual documents, where information is distributed across extensive pages of text and visual elements, is a critical but challenging task for modern Vision-Language Models (VLMs). E...

Dawei Zhu, Rui Meng, Jiefeng Chen et al.
Computer Vision & MultiModal AI 2025-11-14

Recent advances in generative modeling have substantially enhanced 3D urban generation, enabling applications in digital twins, virtual cities, and large-scale simulations. However, existing methods f...

Yijie Kang, Xinliang Wang, Zhenyu Wu et al.
Generative AI & LLMs 2025-11-14

The evolution of Visual Large Language Models (VLLMs) has revolutionized the automatic understanding of Visually Rich Documents (VRDs), which contain both textual and visual elements. Although VLLMs e...

Davide Napolitano, Luca Cagliero, Fabrizio Battiloro
Computer Vision & MultiModal AI 2025-11-14

The "Vision Zero" policy, introduced by the Swedish Parliament in 1997, aims to eliminate fatalities and serious injuries resulting from traffic accidents. To achieve this goal, the use of self-drivin...

Mustafa Erdem Kırmızıgül, Hasan Feyzi Doğruyol, Haluk Bayram
Generative AI & LLMs 2025-11-14

Recently, several instances of non-Euclidean SGD, including SignSGD, Lion, and Muon, have attracted significant interest from the optimization community due to their practical success in training deep...

Dmitry Kovalev, Ekaterina Borodich
Generative AI & LLMs 2025-11-13

Data attribution for text-to-image models aims to identify the training images that most significantly influenced a generated output. Existing attribution methods involve considerable computational re...

Sheng-Yu Wang, Aaron Hertzmann, Alexei A Efros et al.
Agentic AI 2025-11-13

The improving multi-armed bandits problem is a formal model for allocating effort under uncertainty, motivated by scenarios such as investing research effort into new technologies, performing clinical...

Avrim Blum, Marten Garicano, Kavya Ravichandran et al.
Computer Vision & MultiModal AI 2025-11-13

Although Sentinel-2 based land use and land cover (LULC) classification is critical for various environmental monitoring applications, it is a very difficult task due to some key data challenges (e.g....

Zack Dewis, Yimin Zhu, Zhengsen Xu et al.
Generative AI & LLMs 2025-11-13

Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that refle...

Rebecca Dorn, Christina Chance, Casandra Rusti et al.
Explainable & Ethical AI 2025-11-13

As social robots increasingly enter public environments, their acceptance depends not only on technical reliability but also on ethical integrity, accessibility, and user trust. This paper reports on ...

Samson Oruma, Ricardo Colomo-Palacios, Vasileios Gkioulos
Agentic AI 2025-11-06

Large language model (LLM)-based agents struggle to generalize to novel and complex environments, such as unseen websites or new sets of functions, due to a fundamental mismatch between their pre-trai...

Arthur Chen, Zuxin Liu, Jianguo Zhang et al.
Generative AI & LLMs 2025-10-31

To serve global users safely and productively, LLMs need culture-specific knowledge that might not be learned during pre-training. How do we find such knowledge that is (1) salient to in-group users, ...

Caleb Ziems, William Held, Jane Yu et al.
Computer Vision & MultiModal AI 2025-10-31

Semantic segmentation of blood vessels is an important task in medical image analysis, but its progress is often hindered by the scarcity of large annotated datasets and the poor generalization of mod...

Cesar H. Comin, Wesley N. Galvão
Generative AI & LLMs 2025-10-30

We present Contamination Detection via Context (CoDeC), a practical and accurate method to detect and quantify training data contamination in large language models. CoDeC distinguishes between data me...

Michał Zawalski, Meriem Boubdir, Klaudia Bałazy et al.
Generative AI & LLMs 2025-10-30

Previous work has argued that recursive numeral systems optimise the trade-off between lexicon size and average morphosyntactic complexity (Denić and Szymanik, 2024). However, showing that only natur...

Ponrawee Prasertsom, Andrea Silvi, Jennifer Culbertson et al.
Explainable & Ethical AI 2025-10-30

Hallucination, defined here as generating statements unsupported or contradicted by available evidence or conversational context, remains a major obstacle to deploying conversational AI systems in set...

Ashley Lewis, Andrew Perrault, Eric Fosler-Lussier et al.
Computer Vision & MultiModal AI 2025-10-24

Video generation models have progressed tremendously through large latent diffusion transformers trained with rectified flow techniques. Yet these models still struggle with geometric inconsistencies,...

Orest Kupyn, Fabian Manhardt, Federico Tombari et al.
Agentic AI 2025-10-24

AI agents hold the potential to revolutionize scientific productivity by automating literature reviews, replicating experiments, analyzing data, and even proposing new directions of inquiry; indeed, t...

Jonathan Bragg, Mike D'Arcy, Nishant Balepur et al.
Generative AI & LLMs 2025-10-24

Geometric data and purpose-built generative models on them have become ubiquitous in high-impact deep learning application domains, ranging from protein backbone generation and computational chemistry...

Oscar Davis, Michael S. Albergo, Nicholas M. Boffi et al.
Generative AI & LLMs 2025-10-24

We introduce a highly expressive yet distinctly tractable family for black-box variational inference (BBVI). Each member of this family is a weighted product of experts (PoE), and each weighted expert...

Diana Cai, Robert M. Gower, David M. Blei et al.
AI in healthcare 2025-10-24

Constructing comprehensive knowledge graphs requires the use of multiple ontologies in order to fully contextualize data into a domain. Ontology matching finds equivalences between concepts interconne...

Marta Contreiras Silva, Daniel Faria, Catia Pesquita
Agentic AI 2025-10-24

Achieving safe, reliable real-world robotic manipulation requires agents to evolve beyond vision and incorporate tactile sensing to overcome sensory deficits and reliance on idealised state informatio...

Elle Miller, Trevor McInroe, David Abel et al.
Agentic AI 2025-10-24

Post-disaster road assessment (PDRA) is essential for emergency response, enabling rapid evaluation of infrastructure conditions and efficient allocation of resources. Although drones provide a flexib...

Huatian Gong, Jiuh-Biing Sheu, Zheng Wang et al.
Explainable & Ethical AI 2025-10-24

Long-horizon reasoning in LLM-based agents often fails not from generative weakness but from insufficient verification of intermediate reasoning. Co-Sight addresses this challenge by turning reasoning...

Hongwei Zhang, Ji Lu, Shiqing Jiang et al.
Generative AI & LLMs 2025-10-24

Retrieval-Augmented Generation (RAG) integrates external knowledge to mitigate hallucinations, yet models often generate outputs inconsistent with retrieved content. Accurate hallucination detection r...

Likun Tan, Kuan-Wei Huang, Joy Shi et al.
Computer Vision & MultiModal AI 2025-10-24

With the widespread adoption of wearable devices in our daily lives, the demand and appeal for remote patient monitoring have significantly increased. Most research in this field has concentrated on c...

Thanh Cong Ho, Farah Kharrat, Abderrazek Abid et al.
AI in healthcare 2025-10-24

Neural networks excel at processing unstructured data but often fail to generalise out-of-distribution, whereas classical algorithms guarantee correctness but lack flexibility. We explore whether pret...

Jason Wu, Petar Veličković
AI in healthcare 2025-10-17

The clinical adoption of biomedical vision-language models is hindered by prompt optimization techniques that produce either uninterpretable latent vectors or single textual prompts. This lack of tran...

Kaushitha Silva, Mansitha Eashwara, Sanduni Ubayasiri et al.
Computer Vision & MultiModal AI 2025-10-17

Open-Vocabulary Semantic Segmentation (OVSS) assigns pixel-level labels from an open set of categories, requiring generalization to unseen and unlabelled objects. Using vision-language models (VLMs) t...

Jiayi Lin, Jiabo Huang, Shaogang Gong
Agentic AI 2025-10-17

Tool-augmented large language models (LLMs) are emerging as deep research agents, systems that decompose complex queries, retrieve external evidence, and synthesize grounded responses. Yet current age...

Yi Wan, Jiuqi Wang, Liam Li et al.
Agentic AI 2025-10-17

So-called 'wicked problems', those involving complex multi-dimensional settings, non-verifiable outcomes, heterogeneous impacts and a lack of single objectively correct answers, have plagued humans th...

Richard M. Bailey
Generative AI & LLMs 2025-10-17

Accurate forecasting is critical for reliable power grid operations, particularly as the share of renewable generation, such as wind and solar, continues to grow. Given the inherent uncertainty and va...

Alireza Moradi, Mathieu Tanneau, Reza Zandehshahvar et al.
Explainable & Ethical AI 2025-10-17

Concept Bottleneck Models (CBMs) enhance interpretability by predicting human-understandable concepts as intermediate representations. However, existing CBMs often suffer from input-to-concept mapping...

Gaoxiang Huang, Songning Lai, Yutao Yue
Computer Vision & MultiModal AI 2025-10-17

Despite rapid advances in text-to-video synthesis, generated video quality remains critically dependent on precise user prompts. Existing test-time optimization methods, successful in other domains, s...

Do Xuan Long, Xingchen Wan, Hootan Nakhost et al.
Computer Vision & MultiModal AI 2025-10-17

Light field images capture multi-view scene information and play a crucial role in 3D scene reconstruction. However, their high-dimensional nature results in enormous data volumes, posing a significan...

Gai Zhang, Xinfeng Zhang, Lv Tang et al.
Agentic AI 2025-10-17

We study conformal inference in non-exchangeable environments through the lens of Blackwell's theory of approachability. We first recast adaptive conformal inference (ACI, Gibbs and Candès, 2021) as...

Guillaume Principato, Gilles Stoltz
AI in healthcare 2025-10-16

Hip fractures are a major cause of disability, mortality, and healthcare burden in older adults, underscoring the need for early risk assessment. However, commonly used tools such as the DXA T-score a...

Shuo Sun, Meiling Zhou, Chen Zhao et al.
Agentic AI 2025-10-16

We argue that progress toward AGI is theory limited rather than data or scale limited. Building on the critical rationalism of Popper and Deutsch, we challenge the Platonic Representation Hypothesis. ...

Marcus A. Thomas
Agentic AI 2025-10-16

Identifying the effects of mechanical ventilation strategies and protocols in critical care requires analyzing data from heterogeneous patient-ventilator systems within the context of the clinical dec...

David J. Albers, Tell D. Bennett, Jana de Wiljes et al.
Agentic AI 2025-10-16

Reasoning language models such as OpenAI-o1, DeepSeek-R1, and Qwen achieve strong performance via extended chains of thought but often generate unnecessarily long outputs. Maximizing intelligence per ...

Shih-Yang Liu, Xin Dong, Ximing Lu et al.
Generative AI & LLMs 2025-10-16

Advanced Persistent Threats (APTs) are stealthy cyberattacks that often evade detection in system-level audit logs. Provenance graphs model these logs as connected entities and events, revealing relat...

Ahmed Aly, Essam Mansour, Amr Youssef
Generative AI & LLMs 2025-10-16

Widespread LLM adoption has introduced characteristic repetitive phraseology, termed "slop," which degrades output quality and makes AI-generated text immediately recognizable. We present Antislop, ...

Samuel Paech, Allen Roush, Judah Goldfeder et al.
Generative AI & LLMs 2025-10-03

Product quantisation (PQ) is a classical method for scalable vector encoding, yet it has seen limited usage for latent representations in high-fidelity image generation. In this work, we introduce PQG...

Denis Zavadski, Nikita Philip Tatsch, Carsten Rother
Agentic AI 2025-10-03

Partial observability is a notorious challenge in reinforcement learning (RL), due to the need to learn complex, history-dependent policies. Recent empirical successes have used privileged expert dist...

Yuda Song, Dhruv Rohatgi, Aarti Singh et al.
Generative AI & LLMs 2025-10-03

Learning rate warm-up (increasing the learning rate at the beginning of training) has become a ubiquitous heuristic in modern deep learning, yet its theoretical foundations remain poorly understood....

Foivos Alimisis, Rustem Islamov, Aurelien Lucchi
Generative AI & LLMs 2025-09-26

Most existing approaches to referring segmentation achieve strong performance only through fine-tuning or by composing multiple pre-trained models, often at the cost of additional training and archite...

Anna Kukleva, Enis Simsar, Alessio Tonioni et al.
Generative AI & LLMs 2025-09-26

N-gram novelty is widely used to evaluate language models' ability to generate text outside of their training data. More recently, it has also been adopted as a metric for measuring textual creativity...

Arkadiy Saakyan, Najoung Kim, Smaranda Muresan et al.
Agentic AI 2025-09-26

Recent reinforcement learning (RL) methods have substantially enhanced the planning capabilities of Large Language Models (LLMs), yet the theoretical basis for their effectiveness remains elusive. In ...

Siwei Wang, Yifei Shen, Haoran Sun et al.
AI in healthcare 2025-09-26

Asynchronous patient-clinician messaging via EHR portals is a growing source of clinician workload, prompting interest in large language models (LLMs) to assist with draft responses. However, LLM outp...

Wenyuan Chen, Fateme Nateghi Haredasht, Kameron C. Black et al.
Generative AI & LLMs 2025-09-26

We propose a reduced-space formulation for optimizing over trained neural networks where the network's outputs and derivatives are evaluated on a GPU. To do this, we treat the neural network as a "gra...

Robert Parker, Oscar Dowson, Nicole LoGiudice et al.
Generative AI & LLMs 2025-09-26

Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in aligning visual inputs with natural language outputs. Yet, the extent to which generated tokens depend on visual m...

Ruoyu Chen, Xiaoqing Guo, Kangwei Liu et al.
Agentic AI 2025-09-26

Heuristic functions are central to the performance of search algorithms such as A-star, where admissibility (the property of never overestimating the true shortest-path cost) guarantees solution opt...

Ehsan Futuhi, Nathan R. Sturtevant
AI in healthcare 2025-09-26

Deep neural networks and brains both learn and share superficial similarities: processing nodes are likened to neurons and adjustable weights are likened to modifiable synapses. But can a unified theo...

Arsham Ghavasieh, Meritxell Vila-Minana, Akanksha Khurd et al.
Agentic AI 2025-09-26

In this paper, we present two effective policy learning algorithms for the multi-agent online coordination (MA-OC) problem. The first one, MA-SPL, not only can achieve the optimal $(1-\frac{c}{e})...

Qixin Zhang, Yan Sun, Can Jin et al.
Generative AI & LLMs 2025-09-26

Quantum generative models based on instantaneous quantum polynomial (IQP) circuits show great promise in learning complex distributions while maintaining classical trainability. However, current imple...

Xiaocheng Zou, Shijin Duan, Charles Fleming et al.
Generative AI & LLMs 2025-09-26

Spiking neural networks (SNNs) have recently been attracting significant attention for their biological plausibility and energy efficiency, but semi-supervised learning (SSL) methods for SNN-based mod...

Jini Yang, Beomseok Oh, Seungryong Kim et al.
Computer Vision & MultiModal AI 2025-09-26

Efficiently detecting human intent to interact with ubiquitous robots is crucial for effective human-robot interaction (HRI) and collaboration. Over the past decade, deep learning has gained traction ...

Farida Mohsen, Ali Safa
Agentic AI 2025-09-26

Reinforcement learning (RL) is the dominant paradigm for sharpening strategic tool use capabilities of LLMs on long-horizon, sparsely-rewarded agent tasks, yet it faces a fundamental challenge of expl...

Yulei Qin, Xiaoyu Tan, Zhengbao He et al.
Generative AI & LLMs 2025-09-26

We prove that all standard subregular language classes are linearly separable when represented by their deciding predicates. This establishes finite observability and guarantees learnability with simp...

Katsuhiko Hayashi, Hidetaka Kamigaito
Generative AI & LLMs 2025-09-26

Flow-matching generative models have emerged as a powerful paradigm for continuous data generation, achieving state-of-the-art results across domains such as images, 3D shapes, and point clouds. Despi...

Elaheh Akbari, Ping He, Ahmadreza Moradipari et al.
Agentic AI 2025-09-19

This paper investigates artificial intelligence (AI) methodologies for the synthesis and transpilation of permutation circuits across generic topologies. Our approach uses Reinforcement Learning (RL) ...

Victor Villar, Juan Cruz-Benito, Ismael Faro et al.
Agentic AI 2025-09-19

When do machine learning systems fail to generalize, and what mechanisms could improve their generalization? Here, we draw inspiration from cognitive science to argue that one weakness of machine lear...

Andrew Kyle Lampinen, Martin Engelcke, Yuxuan Li et al.
Generative AI & LLMs 2025-09-19

Code translation transforms source code from one programming language (PL) to another. Validating the functional equivalence of translation and repairing, if necessary, are critical steps in code tran...

Ali Reza Ibrahimzada, Brandon Paulsen, Reyhaneh Jabbarvand et al.
Agentic AI 2025-09-19

Designing effective reward functions remains a major challenge in reinforcement learning (RL), often requiring considerable human expertise and iterative refinement. Recent advances leverage Large Lan...

Changwei Yao, Xinzi Liu, Chen Li et al.
Computer Vision & MultiModal AI 2025-09-19

In this work, we present Blind-Spot Guided Diffusion, a novel self-supervised framework for real-world image denoising. Our approach addresses two major challenges: the limitations of blind-spot netwo...

Shen Cheng, Haipeng Li, Haibin Huang et al.
Agentic AI 2025-09-19

Atomic data determined by analysis of observed atomic spectra are essential for plasma diagnostics. For each low-ionisation open d- and f-subshell atomic species, around $10^3$ fine structure level en...

M. Ding, V. -A. Darvariu, A. N. Ryabtsev et al.
Explainable & Ethical AI 2025-09-19

Evaluating long-form answers in high-stakes domains such as law or medicine remains a fundamental challenge. Standard metrics like BLEU and ROUGE fail to capture semantic correctness, and current LLM-...

Fangyi Yu, Nabeel Seedat, Dasha Herrmannova et al.
Explainable & Ethical AI 2025-09-19

We propose an algorithm with improved query-complexity for the problem of hypothesis selection under local differential privacy constraints. Given a set of $k$ probability distributions $Q$, we descri...

Gautam Kamath, Alireza F. Pour, Matthew Regehr et al.
Generative AI & LLMs 2025-09-19

A well-known pitfall of molecular generative models is that they are not guaranteed to generate synthesizable molecules. There have been considerable attempts to address this problem, but given the ex...

Seul Lee, Karsten Kreis, Srimukh Prasad Veccham et al.
Generative AI & LLMs 2025-09-19

Neural audio codecs are a fundamental component of modern generative audio pipelines. Although recent codecs achieve strong low-bitrate reconstruction and provide powerful representations for downstre...

Luca Della Libera, Cem Subakan, Mirco Ravanelli
Generative AI & LLMs 2025-09-18

With the increasing popularity of large language models (LLMs) and LLM-based agents, reliable and effective code evaluation metrics (CEMs) have become crucial for progress across several software engi...

Simantika Bhattacharjee Dristi, Matthew B. Dwyer
Explainable & Ethical AI 2025-09-18

AI co-writing systems challenge long-held ideals about agency and ownership in the creative process, thereby hindering widespread adoption. To address this, we investigate conceptions of agen...

Dashiel Carrera, Jeb Thomas-Mitchell, Daniel Wigdor
Generative AI & LLMs 2025-09-18

Deep learning models are notoriously opaque. Existing explanation methods often focus on localized visual explanations for individual images. Concept-based explanations, while offering global insights...

Kunal Rathore, Prasad Tadepalli
Computer Vision & MultiModal AI 2025-09-18

Plug-and-Play Priors (PnP) is a popular framework for solving imaging inverse problems by integrating learned priors in the form of denoisers trained to remove Gaussian noise from images. In standard ...

Edward P. Chandler, Shirin Shoushtari, Brendt Wohlberg et al.
Generative AI & LLMs 2025-09-18

Generative adversarial networks (GANs) have demonstrated significant progress in unpaired image-to-image translation in recent years for several applications. CycleGAN was the first to lead the way, a...

Mst Tasnim Pervin, George Bebis, Fang Jiang et al.
Computer Vision & MultiModal AI 2025-09-12

Decomposing an image into its intrinsic photometric factors (shading and reflectance) is a long-standing challenge due to the lack of extensive ground-truth data for real-world scenes. Recent methods ...

Zeqing Leo Yuan, Mani Ramanagopal, Aswin C. Sankaranarayanan et al.
Generative AI & LLMs 2025-09-12

We present Multipole Semantic Attention (MuSe), an efficient approximation of softmax attention that combines semantic clustering with multipole expansions from computational physics. Our method addre...

Rupert Mitchell, Kristian Kersting
Generative AI & LLMs 2025-09-12

Linear systems arise in generating samples and in calculating observables in lattice quantum chromodynamics (QCD). Solving the Hermitian positive definite systems, which are sparse but ill-conditioned...

Yixuan Sun, Srinivas Eswar, Yin Lin et al.
AI in healthcare 2025-09-12

Optical Coherence Tomography (OCT) is a vital imaging modality for diagnosing and monitoring retinal diseases. However, OCT images are inherently degraded by speckle noise, which obscures fine details...

Botond Fazekas, Thomas Pinetz, Guilherme Aresta et al.
Computer Vision & MultiModal AI 2025-09-12

Mammography screening is an essential tool for early detection of breast cancer. The speed and accuracy of mammography interpretation have the potential to be improved with deep learning methods. Howe...

Yuexi Du, Lihui Chen, Nicha C. Dvornek
Generative AI & LLMs 2025-09-12

Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational chal...

Clémentine Chazal, Heishiro Kanagawa, Zheyang Shen et al.
Agentic AI 2025-09-12

Ensuring the resilience of computer-based railways is increasingly crucial to account for uncertainties and changes due to the growing complexity and criticality of those systems. Although their softw...

Francesco Vitale, Tommaso Zoppi, Francesco Flammini et al.
Explainable & Ethical AI 2025-09-12

Recent advances in text-based image editing have enabled fine-grained manipulation of visual content guided by natural language. However, such methods are susceptible to adversarial attacks. In this w...

Matteo Trippodo, Federico Becattini, Lorenzo Seidenari
Generative AI & LLMs 2025-09-12

Given a dataset of finitely many elements $\mathcal{T} = \{\mathbf{x}_i\}_{i = 1}^N$, the goal of dataset condensation (DC) is to construct a synthetic dataset $\mathcal{S} = \{\tilde{\mathbf{x}}_j\}_...

Tong Chen, Raghavendra Selvan
Agentic AI 2025-09-12

Reinforcement Learning (RL) agents deployed in real-world environments face degradation from sensor faults, actuator wear, and environmental shifts, yet lack intrinsic mechanisms to detect and diagnos...

Cameron Reid, Wael Hafez, Amirhossein Nazeri
AI in healthcare 2025-09-12

Contrastive learning is a widely adopted self-supervised pretraining strategy, yet its dependence on cohort composition remains underexplored. We present Contrasting by Patient Augmented Electrocardio...

Gul Rukh Khattak, Konstantinos Patlatzoglou, Joseph Barker et al.
Generative AI & LLMs 2025-09-11

Inference-time scaling has emerged as a powerful way to improve large language model (LLM) performance by generating multiple candidate responses and selecting among them. However, existing work on dy...

Jenny Y. Huang, Mehul Damani, Yousef El-Kurdi et al.
Agentic AI 2025-09-11

Social robots are increasingly being trialed in public and assistive settings, but their accessibility for Deaf users remains underexplored. Italian Sign Language (LIS) is a fully-fledged natural...

Giulia Botta, Marco Botta, Cristina Gena et al.
Computer Vision & MultiModal AI 2025-09-11

In this paper we analyze the Gradient-Step Denoiser and its usage in Plug-and-Play algorithms. The Plug-and-Play paradigm of optimization algorithms uses off-the-shelf denoisers to replace a proximity...

Vincent Herfeld, Baudouin Denis de Senneville, Arthur Leclaire et al.
Computer Vision & MultiModal AI 2025-09-11

Objective: Deep learning-based deformable image registration has achieved strong accuracy, but remains sensitive to variations in input image characteristics such as artifacts, field-of-view mismatch,...

Yihao Liu, Junyu Chen, Lianrui Zuo et al.
Generative AI & LLMs 2025-09-05

In-context operator networks (ICON) are a class of operator learning methods based on the novel architectures of foundation models. Trained on a diverse set of datasets of initial and boundary conditi...

Benjamin J. Zhang, Siting Liu, Stanley J. Osher et al.
Generative AI & LLMs 2025-09-05

Chain-of-thought reasoning, while powerful, can produce unnecessarily verbose output for simpler problems. We present a framework for difficulty-aware reasoning that teaches models to dynamically adju...

Abdul Waheed, Chancharik Mitra, Laurie Z. Wang et al.
Generative AI & LLMs 2025-09-05

Editing complex real-world sound scenes is difficult because individual sound sources overlap in time. Generative models can fill in missing or corrupted details based on their strong prior understand...

Daniel P. W. Ellis, Eduardo Fonseca, Ron J. Weiss et al.
Explainable & Ethical AI 2025-09-05

Pre-trained language models have achieved remarkable success across diverse applications but remain susceptible to spurious, concept-driven correlations that impair robustness and fairness. In this wo...

Aysenur Kocak, Shuo Yang, Bardh Prenkaj et al.
Agentic AI 2025-09-05

This paper presents a robust model predictive control (MPC) framework that explicitly addresses the non-Gaussian noise inherent in deep learning-based perception modules used for state estimation. Rec...

Nariman Niknejad, Gokul S. Sankar, Bahare Kiumarsi et al.
AI in healthcare 2025-09-05

Federated learning has the potential to unlock siloed data and distributed resources by enabling collaborative model training without sharing private data. As more complex foundational models gain wid...

Cosmin-Andrei Hatfaludi, Alex Serban
Computer Vision & MultiModal AI 2025-09-04

Technological advances have spurred an increase in data complexity and dimensionality. We are now in an era in which data sets containing thousands of features are commonplace. To digest and analyze s...

Justin Lin, Julia Fukuyama
Explainable & Ethical AI 2025-09-04

Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such "hallucinations" per...

Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala et al.
Explainable & Ethical AI 2025-09-04

Recent AI work trends towards incorporating human-centric objectives, with the explicit goal of aligning AI models to personal preferences and societal values. Using standard preference elicitation me...

Cyrus Cousins, Vijay Keswani, Vincent Conitzer et al.
Agentic AI 2025-09-04

Building reliable LLM agents requires decisions at two levels: the graph (which modules exist and how information flows) and the configuration of each node (models, prompts, tools, control knobs). Mos...

Wenxiao Wang, Priyatham Kattakinda, Soheil Feizi
Agentic AI 2025-09-04

We present an imitation learning approach for spacecraft guidance, navigation, and control (GNC) that achieves high performance from limited data. Using only 100 expert demonstrations, equivalent to 6,...

Alejandro Posadas-Nava, Andrea Scorsoglio, Luca Ghilardi et al.
Generative AI & LLMs 2025-09-04

Multimodal foundation models can process several modalities. However, since the space of possible modalities is large and evolving over time, training a model from scratch to encompass all modalities ...

Osman Batur İnce, André F. T. Martins, Oisin Mac Aodha et al.
Computer Vision & MultiModal AI 2025-09-03

Vision-language models (VLMs) like CLIP have shown impressive zero-shot and few-shot learning capabilities across diverse applications. However, adapting these models to new fine-grained domains remai...

Taha Koleilat, Hassan Rivaz, Yiming Xiao
Computer Vision & MultiModal AI 2025-09-03

Nonnegative matrix factorization (NMF) is a known unsupervised data-reduction method. The principle of the common cause (PCC) is a basic methodological approach in probabilistic causality, which seeks...

E. Khalafyan, A. E. Allahverdyan, A. Hovhannisyan
Generative AI & LLMs 2025-09-03

Estimating scene lighting from a single image or video remains a longstanding challenge in computer vision and graphics. Learning-based approaches are constrained by the scarcity of ground-truth HDR e...

Ruofan Liang, Kai He, Zan Gojcic et al.
Explainable & Ethical AI 2025-08-29

Additive two-tower models are popular learning-to-rank methods for handling biased user feedback in industry settings. Recent studies, however, report a concerning phenomenon: training two-tower model...

Philipp Hager, Onno Zoeter, Maarten de Rijke
Explainable & Ethical AI 2025-08-29

Misleading visualizations are a potent driver of misinformation on social media and the web. By violating chart design principles, they distort data and lead readers to draw inaccurate conclusions. Pr...

Jonathan Tonglet, Jan Zimny, Tinne Tuytelaars et al.
AI in healthcare 2025-08-29

Bias in medical artificial intelligence is conventionally viewed as a defect requiring elimination. However, human reasoning inherently incorporates biases shaped by education, culture, and experience...

Farhad Abtahi, Mehdi Astaraki, Fernando Seoane
Generative AI & LLMs 2025-08-29

Best-of-n sampling improves the accuracy of large language models (LLMs) and large reasoning models (LRMs) by generating multiple candidate solutions and selecting the one with the highest reward. The...

Joshua Ong Jun Leang, Zheng Zhao, Aryo Pradipta Gema et al.
Agentic AI 2025-08-29

Planning with pretrained diffusion models has emerged as a promising approach for solving test-time guided control problems. However, standard gradient guidance typically performs optimally under conv...

Hyeonseong Jeon, Cheolhong Min, Jaesik Park
Computer Vision & MultiModal AI 2025-08-29

We propose a realistic scenario for unsupervised video learning where neither task boundaries nor labels are provided when learning a succession of tasks. We also provide a non-parametric learning...

Nattapong Kurpukdee, Adrian G. Bors
Computer Vision & MultiModal AI 2025-08-29

Healthcare systems generate diverse multimodal data, including Electronic Health Records (EHR), clinical notes, and medical images. Effectively leveraging this data for clinical prediction is challeng...

Xiaoyang Wang, Christopher C. Yang
Generative AI & LLMs 2025-08-29

Supervised fine-tuning (SFT) is a pivotal approach to adapting large language models (LLMs) for downstream tasks; however, performance often suffers from the "seesaw phenomenon", where indiscriminat...

Yao Wang, Di Liang, Minlong Peng
Explainable & Ethical AI 2025-08-29

Artificial intelligence (AI)-based computer perception (CP) technologies use mobile sensors to collect behavioral and physiological data for clinical decision-making. These tools can reshape how clini...

Maya Guhan, Meghan E. Hurley, Eric A. Storch et al.
AI in healthcare 2025-08-29

Accurate interpretation of clinical narratives is critical for patient care, but the complexity of these notes makes automation challenging. While Large Language Models (LLMs) show promise, single-mod...

Yeawon Lee, Xiaoyang Wang, Christopher C. Yang
Explainable & Ethical AI 2025-08-29

As privacy regulations such as the GDPR and HIPAA and responsibility frameworks for artificial intelligence such as the AI Act gain traction, the ethical and responsible use of real-world data faces i...

Tobias Hyrup, Emmanouil Panagiotou, Arjun Roy et al.
AI in healthcare 2025-08-21

Electrocardiogram (ECG) analysis is foundational for cardiovascular disease diagnosis, yet the performance of deep learning models is often constrained by limited access to annotated data. Self-superv...

Yi Yuan, Joseph Van Duyn, Runze Yan et al.
AI in healthcare 2025-08-21

Understanding the nuanced performance of machine learning models is essential for responsible deployment, especially in high-stakes domains like healthcare and finance. This paper introduces a novel f...

Xin Du, Sikun Yang, Wouter Duivesteijn et al.
AI in healthcare 2025-08-21

Diabetic Retinopathy (DR) is a major cause of global blindness, necessitating early and accurate diagnosis. While deep learning models have shown promise in DR detection, their black-box nature often ...

Masato Ito, Kaito Tanaka, Keisuke Matsuda et al.
Generative AI & LLMs 2025-08-21

Extracting meaning from uncertain, noisy data is a fundamental problem across time series analysis, pattern recognition, and language modeling. This survey presents a unified mathematical framework th...

Mohammed Elmusrati
Computer Vision & MultiModal AI 2025-08-21

Generative video modeling has made significant strides, yet ensuring structural and temporal consistency over long sequences remains a challenge. Current methods predominantly rely on RGB signals, lea...

Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.
Computer Vision & MultiModal AI 2025-08-21

E-commerce platforms are rich in multimodal data, featuring a variety of images that depict product details. However, this raises an important question: do these images always enhance product understa...

Xinyi Ling, Hanwen Du, Zhihui Zhu et al.
Explainable & Ethical AI 2025-08-21

This paper argues that a techno-philosophical reading of the EU AI Act provides insight into the long-term dynamics of data in AI systems, specifically, how the lifecycle from ingestion to deployment ...

Mark Cote, Susana Aires
Explainable & Ethical AI 2025-08-21

Modeling feature interactions in tabular data remains a key challenge in predictive modeling, for example, as used for insurance pricing. This paper proposes the Tree-like Pairwise Interaction Network...

Ronald Richman, Salvatore Scognamiglio, Mario V. Wüthrich
Computer Vision & MultiModal AI 2025-08-21

Effective modeling of heterogeneous subpopulations presents a significant challenge due to variations in individual characteristics and behaviors. This paper proposes a novel approach to address this ...

Elif Konyar, Mostafa Reisi Gahrooei, Kamran Paynabar
Generative AI & LLMs 2025-08-21

Artificial intelligence (AI)-based models are revolutionizing weather forecasting and have surpassed leading numerical weather prediction systems on various benchmark tasks. However, their ability to ...

Zhongwei Zhang, Erich Fischer, Jakob Zscheischler et al.
Computer Vision & MultiModal AI 2025-08-21

Visual diffusion models achieve remarkable progress, yet they are typically trained at limited resolutions due to the lack of high-resolution data and constrained computation resources, hampering thei...

Haonan Qiu, Ning Yu, Ziqi Huang et al.
Agentic AI 2025-08-21

Accurate and efficient simulation of modern robots remains challenging due to their high degrees of freedom and intricate mechanisms. Neural simulators have emerged as a promising alternative to tradi...

Jie Xu, Eric Heiden, Iretiayo Akinola et al.
Agentic AI 2025-08-21

We present several advances to the physics and equality constrained artificial neural networks (PECANN) framework that substantially improve its capability to learn solutions of canonical partial diff...

Qifeng Hu, Shamsulhaq Basir, Inanc Senocak
Generative AI & LLMs 2025-08-21

Modest statistical differences between the sampling performances of the D-Wave quantum annealer (QA) and the classical Markov Chain Monte Carlo (MCMC), when applied to Restricted Boltzmann Machines (R...

Abdelmoula El-Yazizi, Yaroslav Koshka
Agentic AI 2025-08-21

We present NiceWebRL, a research tool that enables researchers to use machine reinforcement learning (RL) environments for online human subject experiments. NiceWebRL is a Python library that allows a...

Wilka Carvalho, Vikram Goddla, Ishaan Sinha et al.
Generative AI & LLMs 2025-08-18

Commonsense validation evaluates whether a sentence aligns with everyday human understanding, a critical capability for developing robust natural language understanding systems. While substantial prog...

Kareem Elozeiri, Mervat Abassy, Preslav Nakov et al.
Generative AI & LLMs 2025-08-18

Watermarking has recently emerged as an effective strategy for detecting the generations of large language models (LLMs). The strength of a watermark typically depends strongly on the entropy afforded...

Dara Bahri, John Wieting
Computer Vision & MultiModal AI 2025-08-18

Reconstructing complete and interactive 3D scenes remains a fundamental challenge in computer vision and robotics, particularly due to persistent object occlusions and limited sensor coverage. Multivi...

Wenhao Hu, Zesheng Li, Haonan Zhou et al.
AI in healthcare 2025-08-18

Anatomic tracer studies are critical for validating and improving diffusion MRI (dMRI) tractography. However, large-scale analysis of data from such studies is hampered by the labor-intensive process ...

Kyriaki-Margarita Bintsi, Yaël Balbastre, Jingjing Wu et al.
Computer Vision & MultiModal AI 2025-08-18

Programmable structures are systems whose undeformed geometries and material property distributions are deliberately designed to achieve prescribed deformed configurations under specific loading condi...

Sara Karimi, Nikolaos N. Vlassis
AI in healthcare 2025-08-18

We propose a two-stage multimodal framework that enhances disease classification and region-aware radiology report generation from chest X-rays, leveraging the MIMIC-Eye dataset. In the first stage, w...

Tanjim Islam Riju, Shuchismita Anwar, Saman Sarker Joy et al.
Generative AI & LLMs 2025-08-18

Developing large language models is expensive and involves making decisions with small experiments, typically by evaluating on large, multi-task evaluation suites. In this work, we analyze specific pr...

David Heineman, Valentin Hofmann, Ian Magnusson et al.
Computer Vision & MultiModal AI 2025-08-18

We present 4DNeX, the first feed-forward framework for generating 4D (i.e., dynamic 3D) scene representations from a single image. In contrast to existing methods that rely on computationally intensiv...

Zhaoxi Chen, Tianqi Liu, Long Zhuo et al.
Generative AI & LLMs 2025-08-18

Thinking LLMs solve complex tasks at the expense of increased compute and overthinking on simpler problems, while non-thinking LLMs are faster and cheaper but underthink on harder reasoning problems. ...

Pranjal Aggarwal, Seungone Kim, Jack Lanchantin et al.
Explainable & Ethical AI 2025-08-18

Calibration requires that predictions are conditionally unbiased and, therefore, reliably interpretable as probabilities. Calibration measures quantify how far a predictor is from perfect calibration....

Jason Hartline, Lunjia Hu, Yifan Wu
Generative AI & LLMs 2025-08-18

Foundational modelling of multi-dimensional time-series data in industrial systems presents a central trade-off: channel-dependent (CD) models capture specific cross-variable dynamics but lack robustn...

Michael Mayr, Georgios C. Chasparis
Agentic AI 2025-08-18

In classical AI, perception relies on learning state-based representations, while planning, which can be thought of as temporal reasoning over action sequences, is typically achieved through search. W...

Alicja Ziarko, Michal Bortkiewicz, Michal Zawalski et al.
Agentic AI 2025-08-18

Mobile manipulation in dynamic environments is challenging due to movable obstacles blocking the robot's path. Traditional methods, which treat navigation and manipulation as separate tasks, often fai...

Yuying Zhang, Joni Pajarinen
Agentic AI 2025-08-18

This work introduces an automated testing approach that employs agents controlling game characters to detect potential bugs within a game level. Harnessing the power of Bayesian Optimization (BO) to e...

Carlos Celemin
Generative AI & LLMs 2025-08-18

Diffusion language models, as a promising alternative to traditional autoregressive (AR) models, enable faster generation and richer conditioning on bidirectional context. However, they suffer from a ...

Haoyu He, Katrin Renz, Yong Cao et al.
Generative AI & LLMs 2025-08-18

Multi-modal models have achieved remarkable progress in recent years. Nevertheless, they continue to exhibit notable limitations in spatial understanding and reasoning, which are fundamental capabilit...

Zhongang Cai, Yubo Wang, Qingping Sun et al.
Computer Vision & MultiModal AI 2025-08-18

Cone-beam computed tomography (CBCT) has become an invaluable imaging modality in dentistry, enabling 3D visualization of teeth and surrounding structures for diagnosis and treatment planning. Automat...

Dominic LaBella, Keshav Jha, Jared Robbins et al.
Computer Vision & MultiModal AI 2025-08-18

This work studies the challenge of transferring animations between characters whose skeletal topologies differ substantially. While many techniques have advanced retargeting in recent decades, transfe...

Ling-Hao Chen, Yuhong Zhang, Zixin Yin et al.
Generative AI & LLMs 2025-08-08

Being able to effectively read scientific plots, or chart understanding, is a central step toward building effective agents for science. However, existing multimodal large language models (MLLMs), esp...

Yuwei Yang, Zeyu Zhang, Yunzhong Hou et al.
AI in healthcare 2025-08-08

We introduce the multivariate fields of experts, a new framework for the learning of image priors. Our model generalizes existing fields of experts methods by incorporating multivariate potential func...

Stanislas Ducotterd, Michael Unser
Generative AI & LLMs 2025-08-08

Humans communicate with increasing efficiency in multi-turn interactions, by adapting their language and forming ad-hoc conventions. In contrast, prior work shows that LLMs do not naturally show this ...

Yilun Hua, Evan Wang, Yoav Artzi
Computer Vision & MultiModal AI 2025-08-08

Recent approaches for 3D relighting have shown promise in integrating 2D image relighting generative priors to alter the appearance of a 3D representation while preserving the underlying structure. Ne...

Yehonathan Litman, Fernando De la Torre, Shubham Tulsiani
Generative AI & LLMs 2025-08-08

Urbanization, climate change, and agricultural stress are increasing the demand for precise and timely environmental monitoring. Land Surface Temperature (LST) is a key variable in this context and is...

Sofiane Bouaziz, Adel Hafiane, Raphael Canals et al.
AI in healthcare 2025-08-08

Segmentation of lesions on CT enables automatic measurement for clinical assessment of chronic diseases (e.g., lymphoma). Integrating large language models (LLMs) into the lesion segmentation workflow...

Ruida Cheng, Tejas Sudharshan Mathai, Pritam Mukherjee et al.
Explainable & Ethical AI 2025-08-08

Blockchain-enabled federated learning (BCFL) addresses fundamental challenges of trust, privacy, and coordination in collaborative AI systems. This chapter provides comprehensive architectural analysi...

Murtaza Rangwala, Venugopal K R, Rajkumar Buyya
AI in healthcare 2025-08-08

Achieving precise control over a molecule's biological activity (encompassing targeted activation/inhibition, cooperative multi-target modulation, and off-target toxicity mitigation) remains a critical ...

Renyi Zhou, Huimin Zhu, Jing Tang et al.
Computer Vision & MultiModal AI 2025-08-08

Haptic captioning is the task of generating natural language descriptions from haptic signals, such as vibrations, for use in virtual reality, accessibility, and rehabilitation applications. While pre...

Guimin Hu, Daniel Hershcovich, Hasti Seifi
Agentic AI 2025-08-08

We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinkin...

GLM-4.5 Team, Aohan Zeng et al.