698d51a19d8a121ce581499d7b701668-Supplemental.pdf

Neural Information Processing Systems

Section 2, Section 3 and Section 4 provide more visualization results on a number of 3D modeling tasks, including shape reconstruction, generation and interpolation. Note that all the hierarchical aggregators {Ei} share the same network parameters. The hierarchical decoder D includes an implicit octant decoder and several hierarchical local decoders {Di}, which map the decoded shape code and a sample point (x, y, z) to the local geometry and the local latent feature, respectively. IN denotes the instance normalization 3D operator [9]. We provide two tables (Table 1 and Table 2) to detail the network structures of the 3D voxel encoder and implicit decoder, respectively.
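The core interface of such an implicit decoder can be sketched in a few lines: a latent shape code and a 3D query point go in, a local occupancy value comes out. This is a minimal illustrative sketch, not the paper's architecture; the layer sizes, the simple concatenation of code and point, and the random weights (standing in for trained parameters) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 32   # assumed size of the shape code
HIDDEN_DIM = 64   # assumed hidden width

# Randomly initialized weights stand in for trained parameters.
W1 = rng.standard_normal((LATENT_DIM + 3, HIDDEN_DIM)) * 0.1
b1 = np.zeros(HIDDEN_DIM)
W2 = rng.standard_normal((HIDDEN_DIM, 1)) * 0.1
b2 = np.zeros(1)

def decode(shape_code, point):
    """Map (latent shape code, 3D query point) -> occupancy in (0, 1)."""
    h = np.concatenate([shape_code, point])   # condition on both inputs
    h = np.maximum(h @ W1 + b1, 0.0)          # ReLU hidden layer
    logit = h @ W2 + b2                       # scalar occupancy logit
    return 1.0 / (1.0 + np.exp(-logit))       # sigmoid

code = rng.standard_normal(LATENT_DIM)
occ = decode(code, np.array([0.1, -0.2, 0.3]))
```

Querying `decode` over a dense grid of points and thresholding the occupancies is the usual way such a decoder is turned back into an explicit shape.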





Diff-Ensembler: Learning to Ensemble 2D Diffusion Models for Volume-to-Volume Medical Image Translation

Zhu, Xiyue, Kwark, Dou Hoon, Zhu, Ruike, Hong, Kaiwen, Tao, Yiqi, Luo, Shirui, Li, Yudu, Liang, Zhi-Pei, Kindratenko, Volodymyr

arXiv.org Artificial Intelligence

Despite success in volume-to-volume translations in medical images, most existing models struggle to effectively capture the inherent volumetric distribution using 3D representations. The current state-of-the-art approach combines multiple 2D-based networks through weighted averaging, thereby neglecting the 3D spatial structures. Directly training 3D models in medical imaging presents significant challenges due to high computational demands and the need for large-scale datasets. To address these challenges, we introduce Diff-Ensembler, a novel hybrid 2D-3D model for efficient and effective volumetric translations by ensembling perpendicularly trained 2D diffusion models with a 3D network in each diffusion step. Moreover, our model can naturally be used to ensemble diffusion models conditioned on different modalities, allowing flexible and accurate fusion of input conditions. Extensive experiments demonstrate that Diff-Ensembler attains superior accuracy and volumetric realism in 3D medical image super-resolution and modality translation. We further demonstrate the strength of our model's volumetric realism using tumor segmentation as a downstream task.
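The perpendicular-ensembling idea can be sketched numerically: run a 2D model slice-wise along each of the three volume axes, then fuse the three predictions. Everything here is a stand-in: `slice_model` is a placeholder for a trained 2D diffusion denoiser, and the plain average replaces the learned 3D fusion network applied at each diffusion step.

```python
import numpy as np

def slice_model(img2d):
    """Stand-in for a trained 2D model applied to one slice."""
    return img2d * 0.5  # placeholder operation

def apply_along_axis(vol, axis):
    """Run the 2D model on every slice perpendicular to `axis`."""
    moved = np.moveaxis(vol, axis, 0)
    out = np.stack([slice_model(s) for s in moved])
    return np.moveaxis(out, 0, axis)

def ensemble(vol):
    """Fuse axial, coronal, and sagittal 2D predictions.

    A learned 3D network would replace this mean in the actual method.
    """
    preds = [apply_along_axis(vol, ax) for ax in range(3)]
    return np.mean(preds, axis=0)

vol = np.ones((4, 4, 4))
fused = ensemble(vol)
```

The point of fusing in 3D rather than averaging per-slice outputs independently is that the fusion step can see, and correct, inconsistencies between the three stacks of slices.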


Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis

Ren, Mengwei, Dey, Neel, Styner, Martin A., Botteron, Kelly, Gerig, Guido

arXiv.org Artificial Intelligence

Recent self-supervised advances in medical computer vision exploit global and local anatomical self-similarity for pretraining prior to downstream tasks such as segmentation. However, current methods assume i.i.d. image acquisition, which is invalid in clinical study designs where follow-up longitudinal scans track subject-specific temporal changes. Further, existing self-supervised methods for medically-relevant image-to-image architectures exploit only spatial or temporal self-similarity and only do so via a loss applied at a single image-scale, with naive multi-scale spatiotemporal extensions collapsing to degenerate solutions. To these ends, this paper makes two contributions: (1) It presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images. It exploits the spatiotemporal self-similarity of learned multi-scale intra-subject features for pretraining and develops several feature-wise regularizations that avoid collapsed identity representations; (2) During finetuning, it proposes a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation. Benchmarked in the one-shot segmentation setting, the proposed framework outperforms both well-tuned randomly-initialized baselines and current self-supervised techniques designed for both i.i.d. and longitudinal datasets. These improvements are demonstrated across both longitudinal neurodegenerative adult MRI and developing infant brain MRI and yield both higher performance and longitudinal consistency.
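One ingredient of such an objective, aligning spatially corresponding features of two intra-subject timepoints, can be written as a cosine-similarity loss over feature maps. This is only a sketch of that one ingredient under assumed (C, H, W) feature shapes; the paper's multi-scale losses and anti-collapse regularizations are not reproduced here.

```python
import numpy as np

def cosine_sim_loss(feat_t0, feat_t1, eps=1e-8):
    """1 - mean cosine similarity over spatial locations.

    feat_*: (C, H, W) feature maps from two scans of the same subject.
    """
    num = np.sum(feat_t0 * feat_t1, axis=0)
    den = (np.linalg.norm(feat_t0, axis=0) *
           np.linalg.norm(feat_t1, axis=0) + eps)
    return float(1.0 - np.mean(num / den))

rng = np.random.default_rng(1)
f = rng.standard_normal((8, 16, 16))
loss_same = cosine_sim_loss(f, f)    # identical features -> near 0
loss_opp = cosine_sim_loss(f, -f)    # opposite features  -> near 2
```

A loss like this is exactly the kind that admits a collapsed solution (constant features score perfectly), which is why the paper pairs it with feature-wise regularizations.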


FourierNets enable the design of highly non-local optical encoders for computational imaging

Deb, Diptodip, Jiao, Zhenfei, Sims, Ruth, Chen, Alex B., Broxton, Michael, Ahrens, Misha B., Podgorski, Kaspar, Turaga, Srinivas C.

arXiv.org Artificial Intelligence

Differentiable simulations of optical systems can be combined with deep learning-based reconstruction networks to enable high performance computational imaging via end-to-end (E2E) optimization of both the optical encoder and the deep decoder. This has enabled imaging applications such as 3D localization microscopy, depth estimation, and lensless photography via the optimization of local optical encoders. More challenging computational imaging applications, such as 3D snapshot microscopy which compresses 3D volumes into single 2D images, require a highly non-local optical encoder. We show that existing deep network decoders have a locality bias which prevents the optimization of such highly non-local optical encoders. We address this with a decoder based on a shallow neural network architecture using global kernel Fourier convolutional neural networks (FourierNets). We show that FourierNets surpass existing deep network based decoders at reconstructing photographs captured by the highly non-local DiffuserCam optical encoder. Further, we show that FourierNets enable E2E optimization of highly non-local optical encoders for 3D snapshot microscopy. By combining FourierNets with a large-scale multi-GPU differentiable optical simulation, we are able to optimize non-local optical encoders 170$\times$ to 7372$\times$ larger than prior state of the art, and demonstrate the potential for ROI-type specific optical encoding with a programmable microscope.
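The global-kernel idea at the heart of a Fourier convolutional layer can be sketched directly: a convolution whose kernel spans the entire image, computed as a pointwise product in the Fourier domain. The random kernel below stands in for a learned parameter; the network's nonlinearities and layer stacking are omitted.

```python
import numpy as np

def global_fourier_conv(image, kernel):
    """Circular convolution with a full-image-size kernel via FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
ker = rng.standard_normal((32, 32))  # stand-in for a learned global kernel
out = global_fourier_conv(img, ker)
```

Because the kernel is as large as the input, every output pixel depends on every input pixel in a single layer, which is the non-local receptive field that small spatial kernels in conventional decoders lack.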


From Patterson Maps to Atomic Coordinates: Training a Deep Neural Network to Solve the Phase Problem for a Simplified Case

Hurwitz, David

arXiv.org Machine Learning

This work demonstrates that, for a simple case of 10 randomly positioned atoms, a neural network can be trained to infer atomic coordinates from Patterson maps. The network was trained entirely on synthetic data. For the training set, the network outputs were 3D maps of randomly positioned atoms. From each output map, a Patterson map was generated and used as input to the network. The network generalized to cases not in the training set, inferring atom positions from Patterson maps. A key finding in this work is that the Patterson maps presented at the network input during training must uniquely describe the atomic coordinates they are paired with at the network output, or the network will not train and it will not generalize. The network cannot train on conflicting data. Avoiding conflicts is handled in three ways: 1. Patterson maps are invariant to translation. To remove this degree of freedom, output maps are centered on the average of their atom positions. 2. Patterson maps are invariant to centrosymmetric inversion. This conflict is removed by presenting the network output with both the atoms used to make the Patterson map and their centrosymmetry-related counterparts simultaneously. 3. The Patterson map does not uniquely describe a set of coordinates because the origin for each vector in the Patterson map is ambiguous. By adding empty space around the atoms in the output map, this ambiguity is removed. Forcing output atoms to be closer than half the output box edge dimension means the origin of each peak in the Patterson map must be the origin to which it is closest.
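The synthetic training pair described above is easy to reproduce, using the standard identity that the Patterson map is the autocorrelation of the density (the inverse FFT of the squared structure-factor moduli). The grid size, atom count, and point-atom rendering below are illustrative assumptions, not the paper's exact data pipeline; the snippet also demonstrates the centroid-centering step and the centrosymmetric-inversion invariance mentioned in the abstract.

```python
import numpy as np

N = 16  # assumed grid edge length

def place_atoms(coords, n=N):
    """Render point atoms on a grid, centered on their mean position
    (removing the translation degree of freedom noted above)."""
    coords = coords - coords.mean(axis=0)   # center on centroid
    grid = np.zeros((n, n, n))
    for c in np.round(coords).astype(int) % n:
        grid[tuple(c)] += 1.0
    return grid

def patterson(density):
    """Patterson map: autocorrelation of the density via FFT."""
    F = np.fft.fftn(density)
    return np.real(np.fft.ifftn(np.abs(F) ** 2))

rng = np.random.default_rng(0)
atoms = rng.integers(4, 12, size=(10, 3)).astype(float)
density = place_atoms(atoms)
pmap = patterson(density)
# The Patterson map is unchanged by centrosymmetric inversion of the atoms:
pmap_inv = patterson(place_atoms(-atoms))
```

The equality of `pmap` and `pmap_inv` is exactly the ambiguity that motivates presenting both the atoms and their inverted counterparts at the network output.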


See and Think: Disentangling Semantic Scene Completion

Liu, Shice, Hu, Yu, Zeng, Yiming, Tang, Qiankun, Jin, Beibei, Han, Yinhe, Li, Xiaowei

Neural Information Processing Systems

Semantic scene completion predicts volumetric occupancy and object category of a 3D scene, which helps intelligent agents to understand and interact with their surroundings. In this work, we propose a disentangled framework that sequentially carries out 2D semantic segmentation, 2D-3D reprojection, and 3D semantic scene completion. This three-stage framework has three advantages: (1) explicit semantic segmentation significantly boosts performance; (2) flexible fusion of sensor data brings good extensibility; (3) progress in any subtask promotes the holistic performance. Experimental results show that whether the input is a single depth map or RGB-D, our framework generates high-quality semantic scene completion and outperforms state-of-the-art approaches on both synthetic and real datasets.
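The middle stage, 2D-3D reprojection, can be sketched with a pinhole camera model: each labeled pixel is back-projected using its depth and written into the voxel it lands in. The intrinsics (`fx`, `fy`, `cx`, `cy`), voxel size, and grid resolution below are made-up illustrative values; the paper's actual projection and fusion details are not reproduced here.

```python
import numpy as np

fx = fy = 50.0   # focal lengths in pixels (assumed)
cx = cy = 16.0   # principal point (assumed)
VOXEL = 0.1      # voxel edge length in meters (assumed)
GRID = (32, 32, 32)

def reproject(depth, labels):
    """Write each labeled pixel's semantic class into the voxel that its
    back-projected 3D point falls into."""
    vol = np.zeros(GRID, dtype=np.int64)      # 0 = empty
    H, W = depth.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    x = (us - cx) * z / fx                    # pinhole back-projection
    y = (vs - cy) * z / fy
    idx = np.stack([x, y, z], axis=-1) / VOXEL
    idx = np.round(idx).astype(int) + np.array(GRID) // 2  # shift into grid
    valid = np.all((idx >= 0) & (idx < np.array(GRID)), axis=-1)
    valid &= depth > 0
    for (i, j, k), lab in zip(idx[valid], labels[valid]):
        vol[i, j, k] = lab
    return vol

depth = np.full((32, 32), 1.0)   # a flat surface one meter away
labels = np.full((32, 32), 3)    # every pixel labeled class 3
vol = reproject(depth, labels)
```

The resulting sparse, semantically labeled volume is what the third stage (3D semantic scene completion) densifies and refines.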