iso
Conformal novelty detection with false discovery rate control at the boundary
Gao, Zijun, Roquain, Etienne, Xiang, Daniel
Conformal novelty detection is a classical machine learning task for which uncertainty quantification is essential for providing reliable results. Recent work has shown that the BH procedure applied to conformal p-values controls the false discovery rate (FDR). Unfortunately, the BH procedure can lead to over-optimistic assessments near the rejection threshold, with an increase of false discoveries at the margin as pointed out by Soloff et al. (2024). This issue is solved therein by the support line (SL) correction, which is proven to control the boundary false discovery rate (bFDR) in the independent, non-conformal setting. The present work extends the SL method to the conformal setting: first, we show that the SL procedure can violate the bFDR control in this specific setting. Second, we propose several alternatives that provably control the bFDR in the conformal setting. Finally, numerical experiments with both synthetic and real data support our theoretical findings and show the relevance of the new proposed procedures.
- North America > United States > California (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation (Supplementary Material)
We compute the limb length ratios of upper to lower arm and leg (both for the left and right sides) as well as torso, for geometric distribution analysis. The joints and body parts of interest are defined in Fig. S1. All the results are reported under unscaled protocol. How does the choice of self-supervised learning technique impact accuracy? We can observe Adv ( Joint, V anilla and Online settings) improves accuracy upon Baseline by a large margin.
- North America > Canada (0.05)
- Asia > Singapore (0.05)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
- North America > Canada (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Vision > Video Understanding (0.57)
- Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.54)
Table R1
ISO can perform model adaptation with a batch of instances, if available. Comparison based on faster version of ISO will be added in the revision. During inference, it is updated using Eqn. We will add more details in the revision. ISO makes marginal improvement over Joint (from 42.6 to 41.8 in MPJPE) since training and testing However, even though there is no significant distribution shift, ISO still makes positive effect.
Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation (Supplementary Material)
We compute the limb length ratios of upper to lower arm and leg (both for the left and right sides) as well as torso, for geometric distribution analysis. The joints and body parts of interest are defined in Fig. S1. All the results are reported under unscaled protocol. How does the choice of self-supervised learning technique impact accuracy? We can observe Adv ( Joint, V anilla and Online settings) improves accuracy upon Baseline by a large margin.
- North America > Canada (0.05)
- Asia > Singapore (0.05)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography
Fang, I-Sheng, Chen, Jun-Cheng
Large language models (LLMs) and multimodal large language models (MLLMs) have significantly advanced artificial intelligence. However, visual reasoning, reasoning involving both visual and textual inputs, remains underexplored. Recent advancements, including the reasoning models like OpenAI o1 and Gemini 2.0 Flash Thinking, which incorporate image inputs, have opened this capability. In this ongoing work, we focus specifically on photography-related tasks because a photo is a visual snapshot of the physical world where the underlying physics (i.e., illumination, blur extent, etc.) interplay with the camera parameters. Successfully reasoning from the visual information of a photo to identify these numerical camera settings requires the MLLMs to have a deeper understanding of the underlying physics for precise visual comprehension, representing a challenging and intelligent capability essential for practical applications like photography assistant agents. We aim to evaluate MLLMs on their ability to distinguish visual differences related to numerical camera settings, extending a methodology previously proposed for vision-language models (VLMs). Our preliminary results demonstrate the importance of visual reasoning in photography-related tasks. Moreover, these results show that no single MLLM consistently dominates across all evaluation tasks, demonstrating ongoing challenges and opportunities in developing MLLMs with better visual reasoning.
Machine Learning Methods for Automated Interstellar Object Classification with LSST
Cloete, Richard, Vereš, Peter, Loeb, Abraham
The Legacy Survey of Space and Time, to be conducted with the Vera C. Rubin Observatory, is poised to revolutionize our understanding of the Solar System by providing an unprecedented wealth of data on various objects, including the elusive interstellar objects (ISOs). Detecting and classifying ISOs is crucial for studying the composition and diversity of materials from other planetary systems. However, the rarity and brief observation windows of ISOs, coupled with the vast quantities of data to be generated by LSST, create significant challenges for their identification and classification. This study aims to address these challenges by exploring the application of machine learning algorithms to the automated classification of ISO tracklets in simulated LSST data. We employed various machine learning algorithms, including random forests (RFs), stochastic gradient descent (SGD), gradient boosting machines (GBMs), and neural networks (NNs), to classify ISO tracklets in simulated LSST data. We demonstrate that GBM and RF algorithms outperform SGD and NN algorithms in accurately distinguishing ISOs from other Solar System objects. RF analysis shows that many derived Digest2 values are more important than direct observables in classifying ISOs from the LSST tracklets. The GBM model achieves the highest precision, recall, and F1 score, with values of 0.9987, 0.9986, and 0.9987, respectively. These findings lay the foundation for the development of an efficient and robust automated system for ISO discovery using LSST data, paving the way for a deeper understanding of the materials and processes that shape planetary systems beyond our own. The integration of our proposed machine learning approach into the LSST data processing pipeline will optimize the survey's potential for identifying these rare and valuable objects, enabling timely follow-up observations and further characterization.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > France (0.04)
Mixture of Experts in Image Classification: What's the Sweet Spot?
Videau, Mathurin, Leite, Alessandro, Schoenauer, Marc, Teytaud, Olivier
Mixture-of-Experts (MoE) models have shown promising potential for parameter-efficient scaling across various domains. However, the implementation in computer vision remains limited, and often requires large-scale datasets comprising billions of samples. In this study, we investigate the integration of MoE within computer vision models and explore various MoE configurations on open datasets. When introducing MoE layers in image classification, the best results are obtained for models with a moderate number of activated parameters per sample. However, such improvements gradually vanish when the number of parameters per sample increases.