atypicality
Affect Models Have Weak Generalizability to Atypical Speech
Narain, Jaya, Romana, Amrit, Mitra, Vikramjit, Lea, Colin, Ren, Shirley
Speech and voice conditions can alter the acoustic properties of speech, which could impact the performance of paralinguistic models for affect for people with atypical speech. We evaluate publicly available models for recognizing categorical and dimensional affect from speech on a dataset of atypical speech, comparing results to datasets of typical speech. We investigate three dimensions of speech atypicality: intelligibility, which is related to pronunciation; monopitch, which is related to prosody; and harshness, which is related to voice quality. We examine (1) distributional trends of categorical affect predictions within the dataset, (2) distributional comparisons of categorical affect predictions to similar datasets of typical speech, and (3) correlation strengths between text and speech predictions of valence and arousal for spontaneous speech. We find that the output of affect models is significantly impacted by the presence and degree of speech atypicalities. For instance, the percentage of speech predicted as sad is significantly higher for all types and grades of atypical speech when compared to similar typical speech datasets. In a preliminary investigation of improving robustness for atypical speech, we find that fine-tuning models on pseudo-labeled atypical speech data improves performance on atypical speech without impacting performance on typical speech. Our results emphasize the need for broader training and evaluation datasets for speech emotion models, and for modeling approaches that are robust to voice and speech differences.
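The distributional comparison this abstract describes can be sketched as follows. This is a hedged illustration, not the paper's pipeline: the label set, counts, and the `label_shares` helper are invented toy assumptions, standing in for real model predictions on atypical- and typical-speech datasets.

```python
# Toy sketch: compare the share of each categorical affect label between an
# atypical-speech dataset and a typical-speech reference. All numbers here
# are invented for illustration, not the paper's data.
from collections import Counter

def label_shares(predictions):
    """Fraction of each predicted affect label in a dataset."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: n / total for label, n in counts.items()}

# Hypothetical model outputs on the two datasets.
atypical = ["sad"] * 40 + ["neutral"] * 50 + ["happy"] * 10
typical = ["sad"] * 15 + ["neutral"] * 60 + ["happy"] * 25

shares_a = label_shares(atypical)
shares_t = label_shares(typical)
# Mirrors the abstract's finding: "sad" is over-represented for atypical speech.
assert shares_a["sad"] > shares_t["sad"]
```

A real comparison would add a significance test over these shares, but the per-label fractions are the quantity being compared.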
- North America > United States > California > Santa Clara County > Cupertino (0.05)
- Asia > China > Hong Kong (0.04)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
- Health & Medicine > Consumer Health (0.68)
Leveraging Large Models for Evaluating Novel Content: A Case Study on Advertisement Creativity
Hou, Zhaoyi Joey, Kovashka, Adriana, Li, Xiang Lorraine
Evaluating creativity is challenging, even for humans, not only because of its subjectivity but also because it involves complex cognitive processes. Inspired by work in marketing, we attempt to break down visual advertisement creativity into atypicality and originality. With fine-grained human annotations on these dimensions, we propose a suite of tasks tailored to this subjective problem. We also evaluate the alignment between state-of-the-art (SoTA) vision-language models (VLMs) and humans on our proposed benchmark, demonstrating both the promise and the challenges of using VLMs for automatic creativity assessment.
- North America > United States (0.14)
- North America > Mexico > Mexico City (0.14)
- North America > Canada (0.14)
- Asia > Middle East > UAE (0.14)
- Marketing (1.00)
- Consumer Products & Services (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Health & Medicine > Consumer Health (0.46)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Artificial Intelligence > Vision (0.88)
Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration
Qin, Jeremy, Liu, Bang, Nguyen, Quoc Dinh
Black-box large language models (LLMs) are increasingly deployed in various environments, making it essential for these models to effectively convey their confidence and uncertainty, especially in high-stakes settings. However, these models often exhibit overconfidence, leading to potential risks and misjudgments. Existing techniques for eliciting and calibrating LLM confidence have primarily focused on general reasoning datasets, yielding only modest improvements. Accurate calibration is crucial for informed decision-making and preventing adverse outcomes but remains challenging due to the complexity and variability of tasks these models perform. In this work, we investigate the miscalibration behavior of black-box LLMs within the healthcare setting. We propose a novel method, Atypical Presentations Recalibration, which leverages atypical presentations to adjust the model's confidence estimates. Our approach significantly improves calibration, reducing calibration errors by approximately 60% on three medical question answering datasets and outperforming existing methods such as vanilla verbalized confidence, CoT verbalized confidence, and others.  Additionally, we provide an in-depth analysis of the role of atypicality within the recalibration framework.
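The "calibration error" this abstract reports reducing is conventionally measured as expected calibration error (ECE). The sketch below shows the standard metric under common assumptions (equal-width bins, 10 bins); the binning choices and function name are mine, not details from the paper.

```python
# Sketch of expected calibration error (ECE): bin predictions by confidence,
# then take the weighted mean gap between average confidence and accuracy
# within each bin. Bin count and equal-width binning are conventional choices.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# A model that verbalizes 0.9 confidence but is right half the time is
# badly calibrated; one that says 0.5 and is right half the time is not.
overconfident = expected_calibration_error([0.9] * 10, [1, 0] * 5)
calibrated = expected_calibration_error([0.5] * 10, [1, 0] * 5)
assert overconfident > calibrated
```

Recalibration methods like the one proposed here are judged by how much they shrink this quantity on held-out questions.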
- North America > United States (0.04)
- North America > Canada > Quebec > Montreal (0.04)
The complementary contributions of academia and industry to AI research
Liang, Lizhen, Zhuang, Han, Zou, James, Acuna, Daniel E.
Artificial intelligence (AI) has seen tremendous development in industry and academia. However, striking recent advances by industry have stunned the world, inviting a fresh perspective on the role of academic research in this field. Here, we characterize the impact and type of AI produced by both environments over the last 25 years and establish several patterns. We find that articles published by teams consisting exclusively of industry researchers tend to get greater attention, with a higher chance of being highly cited and citation-disruptive, and several times more likely to produce state-of-the-art models. In contrast, we find that exclusively academic teams publish the bulk of AI research and tend to produce higher-novelty work, with individual papers several times more likely to be unconventional and atypical. The respective impact-novelty advantages of industry and academia are robust to controls for subfield, team size, seniority, and prestige. We find that academic-industry collaborations struggle to replicate the novelty of academic teams and tend to look similar to industry teams. Together, our findings identify the unique and nearly irreplaceable contributions that both academia and industry make toward the healthy progress of AI.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Colorado (0.04)
- North America > United States > New York (0.04)
- (6 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
- Government > Regional Government (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Beyond Confidence: Reliable Models Should Also Consider Atypicality
Yuksekgonul, Mert, Zhang, Linjun, Zou, James, Guestrin, Carlos
While most machine learning models can provide confidence in their predictions, confidence is insufficient to understand a prediction's reliability. For instance, the model may have a low confidence prediction if the input is not well-represented in the training dataset or if the input is inherently ambiguous. In this work, we investigate the relationship between how atypical (rare) a sample or a class is and the reliability of a model's predictions. We first demonstrate that atypicality is strongly related to miscalibration and accuracy. In particular, we empirically show that predictions for atypical inputs or atypical classes are more overconfident and have lower accuracy. Using these insights, we show incorporating atypicality improves uncertainty quantification and model performance for discriminative neural networks and large language models. In a case study, we show that using atypicality improves the performance of a skin lesion classifier across different skin tone groups without having access to the group attributes. Overall, we propose that models should use not only confidence but also atypicality to improve uncertainty quantification and performance. Our results demonstrate that simple post-hoc atypicality estimators can provide significant value.
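The "simple post-hoc atypicality estimator" this abstract mentions can be illustrated with a density-based score over feature embeddings. This is a hedged sketch under my own assumptions (a diagonal Gaussian fit and the `fit_gaussian`/`atypicality` names), not the paper's exact estimator: the key idea is only that lower density under the training distribution means a rarer, more atypical input.

```python
# Sketch: score input atypicality as negative log density under a diagonal
# Gaussian fit to training embeddings. Higher score = rarer input, whose
# predictions the abstract argues tend to be overconfident.
import numpy as np

def fit_gaussian(train_embeddings):
    """Fit a diagonal Gaussian to training embeddings (post hoc, no retraining)."""
    mean = train_embeddings.mean(axis=0)
    var = train_embeddings.var(axis=0) + 1e-6  # guard against zero variance
    return mean, var

def atypicality(x, mean, var):
    """Negative log density of x: higher means more atypical (rarer)."""
    return 0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var))

# Toy usage: a point near the training mean scores far less atypical
# than one far out in the tail.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(1000, 8))
mean, var = fit_gaussian(train)
typical_score = atypicality(np.zeros(8), mean, var)
rare_score = atypicality(np.full(8, 5.0), mean, var)
assert rare_score > typical_score
```

An atypicality-aware recalibration would then, for example, temperature-scale confidences separately within atypicality quantiles rather than globally.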
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (0.87)
When Not to Classify: Detection of Reverse Engineering Attacks on DNN Image Classifiers
Wang, Yujia, Miller, David J., Kesidis, George
This paper addresses detection of reverse engineering (RE) attacks targeting a deep neural network (DNN) image classifier: by querying the classifier, an RE attack aims to discover its decision rule. RE can enable test-time evasion attacks, which require knowledge of the classifier. Recently, we proposed a quite effective approach (ADA) to detect test-time evasion attacks. In this paper, we extend ADA to detect RE attacks (ADA-RE). We demonstrate that our method succeeds in detecting "stealthy" RE attacks before they learn enough to launch effective test-time evasion attacks.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Pennsylvania > Centre County > University Park (0.04)
- Asia (0.04)
Toward a Taxonomy and Computational Models of Abnormalities in Images
Saleh, Babak (Rutgers University), Elgammal, Ahmed (Rutgers University), Feldman, Jacob (Rutgers University), Farhadi, Ali (University of Washington)
The human visual system can spot an abnormal image and reason about what makes it strange. This task has not received enough attention in computer vision. In this paper we study various types of atypicalities in images more comprehensively than has been done before. We propose a new dataset of abnormal images showing a wide range of atypicalities. We design human subject experiments to discover a coarse taxonomy of the reasons for abnormality. Our experiments reveal three major categories of abnormality: object-centric, scene-centric, and contextual. Based on this taxonomy, we propose a comprehensive computational model that can predict all these types of abnormality in images and outperforms prior art in abnormality recognition.
- North America > United States > New Jersey (0.04)
- North America > United States > Indiana > Lake County > Griffith (0.04)
- Europe > Norway > Norwegian Sea (0.04)