silhouette
SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images
Dense 3D object reconstruction from a single image has recently witnessed remarkable advances, but supervising neural networks with ground-truth 3D shapes is impractical due to the laborious process of creating paired image-shape datasets. Recent efforts have turned to learning 3D reconstruction without 3D supervision from RGB images with annotated 2D silhouettes, dramatically reducing the cost and effort of annotation. These techniques, however, remain impractical as they still require multi-view annotations of the same object instance during training. As a result, most experimental efforts to date have been limited to synthetic datasets. In this paper, we address this issue and propose SDF-SRN, an approach that requires only a single view of objects at training time, offering greater utility for real-world scenarios. SDF-SRN learns implicit 3D shape representations to handle arbitrary shape topologies that may exist in the datasets. To this end, we derive a novel differentiable rendering formulation for learning signed distance functions (SDF) from 2D silhouettes. Our method outperforms the state of the art under challenging single-view supervision settings on both synthetic and real-world datasets.
ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images
Zhang, Yunfei, He, Yizhuo, Shao, Yuanxun, Yao, Zhengtao, Xu, Haoyan, Dong, Junhao, Yao, Zhen, Dong, Zhikang
Vision-Language Models (VLMs) have advanced multimodal understanding, yet still struggle when targets are embedded in cluttered backgrounds requiring figure-ground segregation. To address this, we introduce ChromouVQA, a large-scale, multi-task benchmark based on Ishihara-style chromatic camouflaged images. We extend classic dot plates with multiple fill geometries and vary chromatic separation, density, size, occlusion, and rotation, recording full metadata for reproducibility. The benchmark covers nine vision-question-answering tasks, including recognition, counting, comparison, and spatial reasoning. Evaluations of humans and VLMs reveal large gaps, especially under subtle chromatic contrast or disruptive geometric fills. We also propose a model-agnostic contrastive recipe aligning silhouettes with their camouflaged renderings, improving recovery of global shapes. ChromouVQA provides a compact, controlled benchmark for reproducible evaluation and extension. Code and dataset are available at https://github.com/Chromou-VQA-Benchmark/Chromou-VQA.
- North America > United States > California (0.14)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Mixed Data Clustering Survey and Challenges
Guerard, Guillaume, Djebali, Sonia
This paradigm challenges traditional data management and analysis techniques by demanding innovative solutions capable of processing, analyzing, and deriving insights from vast and diverse datasets. In particular, the inclusion of mixed data types, such as numerical and categorical variables, poses significant challenges to conventional methodologies, necessitating the development of novel approaches to effectively leverage the wealth of information available [2]. Traditionally, data handling methods were designed around homogeneous datasets, typically consisting of numerical values. However, the big data paradigm introduces a multitude of data types, including structured, unstructured, and semi-structured data, which demand a departure from traditional approaches. Moreover, the three primary characteristics of big data--volume, velocity, and variety--amplify the complexity of data analysis, requiring scalable and adaptable solutions capable of processing large volumes of data at high speeds while accommodating diverse data formats and structures. These methods for handling mixed data often involve separate analyses of categorical and numerical variables, treating them as distinct entities rather than integrating their interdependencies.
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)