AITopics | reconstruction

reconstruction

Physen-Noise2Noise: Physics-Guided Self-Supervised Defocus Deblurring with Bias Correction under Low-Light Conditions

Huang, Ziyan, Wu, Lang, Wang, Hongji, Liu, Yifei, Tang, Dongliang, Wang, Hongqiao

arXiv.org Machine Learning, May-26-2026

Low-light, long-exposure defocus deblurring remains a challenging problem due to the simultaneous presence of severe blur and complex biased noise. Existing methods typically rely on simplified noise assumptions, which limits their effectiveness under realistic imaging conditions. In this work, we propose Physen-Noise2Noise, a self-supervised deblurring framework guided by the physical model of defocus imaging, which leverages noisy multi-frame observations without requiring clean reference images. Unlike conventional Noise2Noise-based approaches that assume zero-mean noise, we derive a frequency-domain constraint inherent to the defocus imaging process and incorporate it into the learning framework via a learnable noise bias parameter. In addition, a multi-frame noisy initialization strategy is introduced to suppress complex biased noise prior to deblurring, providing a more stable starting point for reconstruction. This formulation explicitly models biased noise and enables joint bias correction and high-frequency detail recovery during training. Furthermore, we develop a pretrain-finetune variant to enhance robustness and generalization under challenging noise conditions. Extensive experiments on both simulation and real-world datasets demonstrate that the proposed method consistently outperforms state-of-the-art self-supervised approaches for defocus deblurring in the presence of complex biased noise.

artificial intelligence, machine learning, noise, (17 more...)

arXiv.org Machine Learning

2605.2459

Genre: Research Report > New Finding (0.46)

Industry: Media > Photography (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

StrTransformer: Source-Wise Structured Transformers for Unsupervised Blind Source Recovery

Wei, Yuan-Hao

arXiv.org Machine Learning, May-26-2026

This paper proposes StrTransformer, a source-wise structured Transformer framework for blind source recovery and branch-wise latent modeling. Instead of using an encoder to infer latent variables, StrTransformer directly optimizes the latent source matrix together with an observation-space mixer and source-wise structural Transformer branches. The mixer enforces reconstruction consistency, while each Transformer branch imposes a differentiable structural constraint on one latent source trajectory. Specifically, each source is converted into multi-scale patch tokens, randomly masked, processed by a locality-biased Transformer, and evaluated through a masked patch reconstruction energy. This energy acts as an implicit source-wise structural prior. To encourage different latent branches to specialize into different temporal regimes, StrTransformer further introduces an ordered multi-scale controller that learns branch-specific patch-scale weights, ordered scale centers, and locality attention slopes. The resulting objective combines observation reconstruction, source-wise structural regularization, and modular auxiliary penalties for separation and scale specialization. We analyze the decoupling and coupling structure of the objective, the regularized exact-reconstruction fiber, and the reduction of permutation symmetry induced by ordered branch descriptors. A controlled case study shows that the learned branches converge to distinct temporal-scale structures and recover source-aligned latent trajectories under post-hoc evaluation.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2605.25648

Genre: Research Report > Experimental Study (0.34)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Data Science (0.68)

Improved Baselines with Representation Autoencoders

Singh, Jaskirat, Zheng, Boyang, Wu, Zongze, Zhang, Richard, Shechtman, Eli, Xie, Saining

arXiv.org Machine Learning, May-19-2026

Representation Autoencoders (RAE) replace the traditional VAE with pretrained vision encoders. In this paper, we systematically investigate several design choices and find three insights which simplify and improve RAE. First, we study a generalized formulation where the representation is defined as the sum of the last k encoder layers rather than solely the final layer. This simple change greatly improves reconstruction without encoder finetuning or specialized data (e.g., text, faces). Second, we study the prevalent assumption that RAE (using a pretrained representation as the encoder) replaces representation alignment (REPA), which distills the same representation to intermediate layers instead. Through large-scale empirical analysis, we uncover a surprising finding: RAE and REPA exhibit complementary working mechanisms, allowing the same representation to be used as both encoder and target for intermediate diffusion layers. Finally, the original RAE struggles with classifier-free guidance (CFG) and requires training a second, weaker diffusion model for AutoGuidance (AG). We show that REPA itself can be viewed as x-prediction in RAE latent space. By simply re-parameterizing the output of the DiT model, it can provide guidance for "free". Overall, RAEv2 leads to more than 10x faster convergence over the original RAE, achieving a state-of-the-art gFID of 1.06 in just 80 epochs on ImageNet-256. On FDr^k, RAEv2 achieves a state-of-the-art 2.17 at just 80 epochs compared to the previous best 3.26 (800 epochs) without any post-training. This motivates EP_FID@k (epochs to reach unguided gFID <= k) as a measure of training efficiency. RAEv2 attains an EP_FID@2 of 35 epochs, versus 177 for the original RAE. We also validate our approach across diverse settings for text-to-image generation and navigation world models, showing consistent improvements. Code is available at https://raev2.github.io.
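The abstract defines EP_FID@k as the number of epochs needed to reach an unguided gFID of at most k. A minimal sketch of how such a metric could be computed from a training log follows. The function name `epochs_to_reach_fid`, the log format, and the numbers in `toy_history` are assumptions made for illustration; they are not taken from the paper or its code release.

```python
from typing import Optional, Sequence, Tuple


def epochs_to_reach_fid(
    history: Sequence[Tuple[int, float]], k: float
) -> Optional[int]:
    """Return the earliest epoch whose unguided gFID is <= k.

    `history` is a sequence of (epoch, gFID) pairs in any order.
    Returns None if the threshold is never reached.
    """
    for epoch, gfid in sorted(history):
        if gfid <= k:
            return epoch
    return None


# Hypothetical training log: (epoch, unguided gFID). Illustrative values only.
toy_history = [(10, 6.0), (20, 3.1), (30, 2.2), (40, 1.9), (50, 1.7)]

print(epochs_to_reach_fid(toy_history, 2.0))  # 40
print(epochs_to_reach_fid(toy_history, 1.0))  # None
```

Because the result is the first logged epoch at or below the threshold, its resolution depends on how often gFID is evaluated during training.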

artificial intelligence, encoder, machine learning, (16 more...)

arXiv.org Machine Learning

2605.18324

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry: Media > Photography (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Testing properties of trees in graphical models with covariance queries

Burova, Sofiya, Calvillo, Francisco, Lugosi, Gábor, Zwiernik, Piotr

arXiv.org Machine Learning, May-18-2026

We consider the problem of testing properties of graphs underlying high-dimensional graphical models. We adopt the model of covariance queries introduced by Lugosi, Truszkowski, Velona, and Zwiernik (2021). We study the case when the underlying graph is a tree. The main results of the paper show that, while reconstructing the entire tree may be costly, certain global structural properties can be tested efficiently. In particular, we design randomized tests for global structural properties that use a sub-quadratic number of queries. We develop testing procedures for several fundamental properties, including the number of leaves, the maximum degree, the typical distance, and the diameter of the tree. For each property, we obtain explicit query complexity bounds that depend on the target threshold and tolerance parameters.

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

2605.15996

Country:

North America > United States (0.46)
Europe > Spain (0.29)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Systems & Languages (0.63)

Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment

Kumar, Sayantan, Noroozizadeh, Shahriar, Kim, Juyong, Weiss, Jeremy C.

arXiv.org Machine Learning, May-15-2026

Reconstructing precise clinical timelines is essential for modeling patient trajectories and forecasting risk in complex, heterogeneous conditions like sepsis. While unstructured clinical narratives offer semantically rich and contextually complete descriptions of a patient's course, they often lack temporal precision and contain ambiguous event timing. Conversely, structured electronic health record (EHR) data provides precise temporal anchors but misses a substantial portion of clinically meaningful events. We introduce a retrieval-augmented multimodal alignment framework that bridges this gap to improve the temporal precision of absolute clinical timelines extracted from text. Our approach formulates timeline reconstruction as a graph-based multistep process: it first extracts central anchor events from narratives to build an initial temporal scaffold, places non-central events relative to this backbone, and then calibrates the timeline using retrieved structured EHR rows as external temporal evidence. Evaluated using instruction-tuned large language models on the i2m4 benchmark spanning MIMIC-III and MIMIC-IV, our multimodal pipeline consistently improves absolute timestamp accuracy (AULTC) and improves temporal concordance across nearly all evaluated models over unimodal text-only reconstruction, without compromising event match rates. Furthermore, our empirical gap analysis reveals that 34.8% of text-derived events are entirely absent from tabular records, demonstrating that aligning these modalities can produce a more temporally faithful and clinically informative reconstruction of patient trajectories than either source alone.

large language model, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2605.15168

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods

Iagaru, David, Gottschling, Nina M., Hansen, Anders C., Garnier, Josselin

arXiv.org Machine Learning, May-14-2026

While deep learning has revolutionised inverse problems, its safe deployment is hindered by three primary reliability concerns: hallucinations, instabilities, and performance volatility [48]. Hallucinations manifest as high-fidelity features that are factually false; instabilities reflect heightened sensitivity to measurement noise; and performance volatility refers to significant fluctuations in reconstruction quality across the data, yielding high-fidelity results for some samples while failing on seemingly similar images. In many applications, the risk of generating realistic but unfaithful content can impede the safe deployment of AI methods for inverse problems. The choice of "hallucinate" as the Cambridge Dictionary's word of the year in 2023 illustrates this open problem [53]. The problem of AI hallucinations persists, as the Financial Times [44] highlighted that, "AI hallucinations haunt users more than job losses." A first step toward training AI methods that do not suffer from hallucinations is the assessment and identification of hallucinated outputs. Consider the inverse problem of recovering $x$ from noisy measurements $y = F(x, e), \quad x \in \mathcal{M}_1 \subset \mathcal{X}, \quad e \in \mathcal{E} \subset \mathcal{Y}$, (1.1)

artificial intelligence, machine learning, reconstruction, (18 more...)

arXiv.org Machine Learning

2605.13146

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Simultaneous Monitoring of Shape and Surface Color via 4D Point Clouds: A Registration-free Approach

Patalano, Mariafrancesca, Capizzi, Giovanna, Paynabar, Kamran

arXiv.org Machine Learning, May-12-2026

Advanced manufacturing technologies allow for the production of intricate parts featuring high shape complexity and spatially-varying material composition. Data fusion of point clouds with chromatic attributes provides 4D point clouds, a compact and informative representation that encodes both shape and material information. In this paper, we present a registration-free framework for Simultaneous Monitoring of shApe and Color (SMAC) via 4D point clouds. The proposed framework leverages Laplace-Beltrami operator spectral properties to capture and monitor geometric features and the relationship between shape and surface color. A combined monitoring scheme is proposed to effectively detect shape deformations and color anomalies, along with a spatially-aware post-signal diagnostic procedure to determine the source of change and localize color anomalies. Importantly, neither component relies on registration or mesh reconstruction, eliminating error-prone and computationally expensive preprocessing steps. A Monte Carlo simulation study and a case study on functionally graded materials demonstrate that SMAC achieves effective detection performance, particularly for subtle defects, while providing diagnostic capabilities to identify the source and location of anomalies.

artificial intelligence, machine learning, point cloud, (16 more...)

arXiv.org Machine Learning

2605.08753

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.49)

Feature Starvation as Geometric Instability in Sparse Autoencoders

Chaudhry, Faris, Yano, Keisuke, Monod, Anthea

arXiv.org Machine Learning, May-8-2026

Sparse autoencoders (SAEs) are used to disentangle the dense, polysemantic internal representations of large language models (LLMs) into interpretable, monosemantic concepts. However, standard $\ell_1$-regularized SAEs suffer from feature starvation (dead neurons) and shrinkage bias, often requiring computationally expensive heuristic resampling and nondifferentiable hard-masking methods to bypass these challenges. We argue that feature starvation is not merely an empirical artifact of poor data diversity, but a fundamental optimization-geometric pathology of overcomplete dictionaries: the $\ell_1$-induced sparse coding map is unstable and fundamentally misaligned with shallow, amortized encoders. To address this structural instability, we introduce adaptive elastic net SAEs (AEN-SAEs), a fully differentiable architecture grounded in classical sparse regression. AEN-SAEs combine an $\ell_2$ structural term that enforces strong convexity and Lipschitz stability with adaptive $\ell_1$ reweighting that eliminates shrinkage bias and suppresses spurious features, thereby jointly controlling the curvature and interaction structure of the induced polyhedral geometry. Theoretically, we show that AEN-SAEs yield a Lipschitz-continuous sparse coding map and recover the global feature support under mild assumptions. Empirically, across synthetic settings and LLMs (Pythia 70M, Llama 3.1 8B), AEN-SAEs mitigate feature starvation without auxiliary heuristics while maintaining competitive reconstruction abilities.
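The combination of an $\ell_2$ term with a reweighted $\ell_1$ penalty described above has a classical closed form in sparse regression, sketched below as a per-coordinate proximal step. This is not the AEN-SAE architecture, whose encoder and reweighting rule the abstract does not specify. The function name and the parameters `weights`, `lam1`, and `lam2` are hypothetical names chosen for this illustration.

```python
from typing import List, Sequence


def adaptive_elastic_net_prox(
    v: Sequence[float],
    weights: Sequence[float],
    lam1: float,
    lam2: float,
) -> List[float]:
    """Solve, for each coordinate i,
        min_z 0.5*(z - v_i)**2 + lam1*weights_i*|z| + 0.5*lam2*z**2.

    The minimizer is a soft-threshold at lam1*weights_i followed by
    a shrink by 1/(1 + lam2).
    """
    out = []
    for v_i, w_i in zip(v, weights):
        magnitude = max(abs(v_i) - lam1 * w_i, 0.0)
        if magnitude == 0.0:
            out.append(0.0)  # coordinate falls below its threshold
        else:
            sign = 1.0 if v_i > 0 else -1.0
            out.append(sign * magnitude / (1.0 + lam2))
    return out


# Hypothetical pre-activations; uniform weights versus smaller weights
# on the coordinates assumed to be "trusted".
pre_acts = [3.0, -0.5, 1.0]
print(adaptive_elastic_net_prox(pre_acts, [1.0, 1.0, 1.0], lam1=1.0, lam2=1.0))
# [1.0, 0.0, 0.0]
print(adaptive_elastic_net_prox(pre_acts, [0.5, 1.0, 0.5], lam1=1.0, lam2=1.0))
# [1.25, 0.0, 0.25]
```

In the second call, the lower weights shrink the first and third coordinates less, which is the sense in which adaptive reweighting reduces shrinkage bias. The `lam2` term keeps the per-coordinate objective strongly convex.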

feature starvation, large language model, machine learning, (18 more...)

arXiv.org Machine Learning

2605.05341

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

A Bayesian Approach for Task-Specific Next-Best-View Selection with Uncertain Geometry

Zhu, Jingsen, Sellán, Silvia, Terenin, Alexander

arXiv.org Machine Learning, May-7-2026

We develop a framework for task-specific active next-best-view selection in 3D reconstruction from point clouds, by casting the problem in the language of Bayesian decision theory. Our framework works by (a) placing a prior distribution over the space of implicit surfaces, (b) using recently-developed stochastic surface reconstruction methods to calculate the resulting posterior distribution, then (c) using the posterior distribution to carefully reason about which view to scan next. This enables us to perform camera selection in a manner that is directly optimized for the intended use of the reconstructed data - meaning, we reduce uncertainty only in those regions that make a difference in the task at hand, as opposed to prior approaches that reduce it uniformly across space. We evaluate our method across three distinct downstream tasks: semantic classification, segmentation, and PDE-guided physics simulation. Experimental results demonstrate that our framework achieves superior task performance with fewer views compared to commonly used baselines and prior general uncertainty-reduction techniques.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2605.05095

Country: North America > United States > California > Los Angeles County > Los Angeles (0.16)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

f8928b073ccbec15d35f2a9d39430bfd-Supplemental-Conference.pdf

Neural Information Processing Systems, May-1-2026, 05:07:49 GMT

Our experiments in Section 3 and Section 4 were conducted with an adversary who has side information about the target point. Here, we reduce the amount of background knowledge the adversary has about the target, and measure how this affects the reconstruction upper bound and attack success. We do this in the following set-up: Given a target $z$, we initialize our reconstruction from uniform noise and optimize with the gradient-based reconstruction attack introduced in Section 2 to produce $\hat{z}$.

artificial intelligence, machine learning, reconstruction, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
