AITopics | resolution image

3DILG: Irregular Latent Grids for 3D Generative Modeling

Neural Information Processing Systems | Dec-24-2025, 17:31:38 GMT

We propose a new representation for encoding 3D shapes as neural fields. The representation is designed to be compatible with the transformer architecture and to benefit both shape reconstruction and shape generation. Existing works on neural fields use grid-based representations, with latents defined on a regular grid. In contrast, we define latents on irregular grids, which allows our representation to be sparse and adaptive. In the context of shape reconstruction from point clouds, our shape representation built on irregular grids improves upon grid-based methods in terms of reconstruction accuracy.

irregular latent grid, name change, representation, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

4491777b1aa8b5b32c2e8666dbe1a495-Paper.pdf

Neural Information Processing Systems | Sep-24-2025, 21:28:33 GMT

artificial intelligence, machine learning, wavelet flow, (16 more...)

Country:

North America > Canada (0.14)
North America > United States (0.14)

Industry: Energy > Oil & Gas (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

666dd0d92a64396e753c691db93493d4-Paper-Conference.pdf

Neural Information Processing Systems | Aug-15-2025, 11:11:57 GMT

artificial intelligence, equivariant model, machine learning, (20 more...)

Country: Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

4491777b1aa8b5b32c2e8666dbe1a495-Paper.pdf

Neural Information Processing Systems | Aug-14-2025, 03:49:18 GMT

artificial intelligence, machine learning, wavelet flow, (16 more...)

Country:

North America > Canada (0.28)
North America > United States (0.28)

Industry: Energy > Oil & Gas (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)

Chest X-ray Classification using Deep Convolution Models on Low-resolution images with Uncertain Labels

Agarwal, Snigdha, Sinha, Neelam

arXiv.org Artificial Intelligence | Apr-15-2025

Deep Convolutional Neural Networks have consistently achieved state-of-the-art results on many imaging tasks over the past years, the majority of which involve high-quality data. However, it is important to work on low-resolution images, since they could be a cheaper alternative for remote healthcare access, where the primary need for automated pathology identification models occurs. Medical diagnosis using low-resolution images is challenging since critical details may not be easily identifiable. In this paper, we report classification results from experiments with different input image sizes of chest X-rays fed to deep CNN models and discuss the feasibility of classification on varying image sizes. We also leverage the noisy labels in the dataset by proposing a Randomized Flipping of labels technique. We use an ensemble of multi-label classification models on frontal and lateral studies. Our models are trained on 5 out of the 14 chest pathologies of the publicly available CheXpert dataset. We incorporate techniques such as augmentation and regularization for model improvement and use class activation maps to visualize the neural network's decision-making. We present a comparison with the classification results reported in the original CheXpert paper, obtained on the corresponding high-resolution images, on data from 200 subjects. For the pathologies Cardiomegaly, Consolidation and Edema, we obtain 3% higher accuracy with our model architecture.

artificial intelligence, machine learning, pathology, (16 more...)

2504.09033

Country: Asia > India (0.15)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research

Sarmadi, Hamid, Hall, Ola, Rögnvaldsson, Thorsteinn, Ohlsson, Mattias

arXiv.org Artificial Intelligence | Jan-24-2025

This paper investigates the novel application of Large Language Models (LLMs) with vision capabilities to analyze satellite imagery for village-level poverty prediction. Although LLMs were originally designed for natural language understanding, their adaptability to multimodal tasks, including geospatial analysis, has opened new frontiers in data-driven research. By leveraging advancements in vision-enabled LLMs, we assess their ability to provide interpretable, scalable, and reliable insights into human poverty from satellite images. Using a pairwise comparison approach, we demonstrate that ChatGPT can rank satellite images based on poverty levels with accuracy comparable to domain experts. These findings highlight both the promise and the limitations of LLMs in socioeconomic research, providing a foundation for their integration into poverty assessment workflows. This study contributes to the ongoing exploration of unconventional data sources for welfare analysis and opens pathways for cost-effective, large-scale poverty monitoring.
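
The pairwise comparison approach mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the individual comparisons are aggregated into a ranking, so counting wins per item is an assumption made here for illustration. The function name `rank_by_pairwise_wins`, the `prefers_first` callback, and the numeric scores are all hypothetical; in practice the callback would wrap a query to a vision-enabled model.

```python
from itertools import combinations


def rank_by_pairwise_wins(items, prefers_first):
    """Rank items by how many pairwise comparisons each one wins.

    prefers_first(a, b) is a hypothetical judge that returns True
    when a should be ranked above b.
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        winner = a if prefers_first(a, b) else b
        wins[winner] += 1
    # Items with the most wins come first.
    return sorted(items, key=lambda item: wins[item], reverse=True)


# Stand-in judge: made-up scores replace the model's visual judgement.
scores = {"image_a": 0.2, "image_b": 0.9, "image_c": 0.5}
ranking = rank_by_pairwise_wins(list(scores), lambda a, b: scores[a] > scores[b])
```

Comparing every pair needs n(n-1)/2 judgements for n images, which is a cost to keep in mind when each judgement is a model call.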

large language model, machine learning, natural language, (19 more...)

2501.14546

Country:

Africa > Tanzania (0.05)
Europe > Sweden > Skåne County > Lund (0.04)
Europe > Sweden > Halland County > Halmstad (0.04)
Africa > Sub-Saharan Africa (0.04)

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Review for NeurIPS paper: Wavelet Flow: Fast Training of High Resolution Normalizing Flows

Neural Information Processing Systems | Jan-23-2025, 21:42:23 GMT

Summary and Contributions: This paper introduces a hierarchical structure for normalizing flows for density estimation and data generation based on wavelet transforms, allowing for a natural factorization of the data distribution based on different resolutions of the data. For density estimation, each image is fed into a sequence of wavelet transforms. Each wavelet transform takes an image and outputs a lower resolution image (obtained by a low-pass filter) and a tensor of detail coefficients (obtained by a high-pass filter). Repeatedly applying wavelet transforms to the output images leads to a set of detail coefficient tensors for each scale and a final 1x1x3 "image" representing the average intensity per channel. The original representation can be recovered from this representation with a sequence of inverse wavelet transforms.
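
The repeated decomposition described in the summary can be sketched for a single image channel as follows. This is a minimal illustration, not the paper's implementation: it assumes a Haar-style transform with averaging normalization (so the final 1x1 value is the mean intensity) and a square input whose side length is a power of two. All function names are hypothetical.

```python
def haar_step(img):
    """Split a square 2D list into a half-resolution image and detail bands."""
    n = len(img) // 2
    low = [[0.0] * n for _ in range(n)]
    bands = [[[0.0] * n for _ in range(n)] for _ in range(3)]
    for i in range(n):
        for j in range(n):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            low[i][j] = (a + b + c + d) / 4       # low-pass: block average
            bands[0][i][j] = (a - b + c - d) / 4  # left vs right columns
            bands[1][i][j] = (a + b - c - d) / 4  # top vs bottom rows
            bands[2][i][j] = (a - b - c + d) / 4  # diagonal difference
    return low, bands


def inverse_haar_step(low, bands):
    """Rebuild the double-resolution image from one level of coefficients."""
    n = len(low)
    img = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            l, h, v, g = low[i][j], bands[0][i][j], bands[1][i][j], bands[2][i][j]
            img[2 * i][2 * j] = l + h + v + g
            img[2 * i][2 * j + 1] = l - h + v - g
            img[2 * i + 1][2 * j] = l + h - v - g
            img[2 * i + 1][2 * j + 1] = l - h - v + g
    return img


def decompose(img):
    """Apply haar_step until a 1x1 image remains; return (mean, detail list)."""
    details = []
    while len(img) > 1:
        img, bands = haar_step(img)
        details.append(bands)
    return img[0][0], details


def reconstruct(mean, details):
    """Invert decompose by applying the inverse steps from coarse to fine."""
    img = [[mean]]
    for bands in reversed(details):
        img = inverse_haar_step(img, bands)
    return img


channel = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mean, details = decompose(channel)
restored = reconstruct(mean, details)
```

Running `decompose` once per colour channel would give the three average intensities that make up the final 1x1x3 "image", and `reconstruct` shows that the original is recoverable from the coefficients alone.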

high resolution normalizing flow, representation, wavelet transform, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

3DILG: Irregular Latent Grids for 3D Generative Modeling

Neural Information Processing Systems | Jan-17-2025, 12:53:10 GMT

We propose a new representation for encoding 3D shapes as neural fields. The representation is designed to be compatible with the transformer architecture and to benefit both shape reconstruction and shape generation. Existing works on neural fields use grid-based representations, with latents defined on a regular grid. In contrast, we define latents on irregular grids, which allows our representation to be sparse and adaptive. In the context of shape reconstruction from point clouds, our shape representation built on irregular grids improves upon grid-based methods in terms of reconstruction accuracy.

irregular latent grid, representation, shape reconstruction, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.43)
Information Technology > Artificial Intelligence > Machine Learning (0.43)

Reviews: Pose Guided Person Image Generation

Neural Information Processing Systems | Oct-7-2024, 17:41:13 GMT

The paper proposes a human image generator conditioned on appearance and human pose. The proposed generation is based on an adversarial training architecture in which a two-step generative network produces a high-resolution image that is fed into a discriminator. In the generator, the first network produces a coarse image using a U-shaped network given the appearance and pose map; the second network then takes the coarse image together with the original appearance and predicts a residual to refine the coarse image. The paper proposes a few important ideas. Conditioned on appearance and pose information, the proposed generator stacks two networks to adopt a coarse-to-fine strategy.

architecture, generator, pose guided person image generation, (9 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.56)
Information Technology > Artificial Intelligence > Vision (0.54)

Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge

Violos, John, Papadopoulos, Symeon, Kompatsiaris, Ioannis

arXiv.org Artificial Intelligence | Jun-25-2024

This paper discusses four facets of the Knowledge Distillation (KD) process for Convolutional Neural Networks (CNNs) and Vision Transformer (ViT) architectures, particularly when executed on edge devices with constrained processing capabilities. First, we conduct a comparative analysis of the KD process between CNNs and ViT architectures, aiming to elucidate the feasibility and efficacy of employing different architectural configurations for the teacher and student, while assessing their performance and efficiency. Second, we explore the impact of varying the size of the student model on accuracy and inference speed, while maintaining a constant KD duration. Third, we examine the effects of employing higher resolution images on the accuracy, memory footprint and computational workload. Last, we examine the performance improvements obtained by fine-tuning the student model after KD to specific downstream tasks. Through empirical evaluations and analyses, this research provides AI practitioners with insights into optimal strategies for maximizing the effectiveness of the KD process on edge devices.

architecture, student model, transformer, (12 more...)

2407.12808

Country:

Europe > Greece > Central Macedonia > Thessaloniki (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.95)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
