AITopics | descriptiveness

fa1cfe4e956d85e016b1f8f49b189a0b-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 02:00:00 GMT

large language model, machine learning, natural language, (19 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(16 more...)

Genre: Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences

Hirota, Yusuke, Li, Boyi, Hachiuma, Ryo, Wu, Yueh-Hua, Ivanovic, Boris, Nakashima, Yuta, Pavone, Marco, Choi, Yejin, Wang, Yu-Chiang Frank, Yang, Chao-Han Huck

arXiv.org Artificial Intelligence, Dec-2-2025

Large Vision-Language Models (LVLMs) have transformed image captioning, shifting from concise captions to detailed descriptions. We introduce LOTUS, a leaderboard for evaluating detailed captions, addressing three main gaps in existing evaluations: lack of standardized criteria, bias-aware assessments, and user preference considerations. LOTUS comprehensively evaluates various aspects, including caption quality (e.g., alignment, descriptiveness), risks (e.g., hallucination), and societal biases (e.g., gender bias) while enabling preference-oriented evaluations by tailoring criteria to diverse user preferences. Our analysis of recent LVLMs reveals no single model excels across all criteria, while correlations emerge between caption detail and bias risks. Preference-oriented evaluations demonstrate that optimal model selection depends on user priorities.
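
The idea that the best model depends on user priorities can be illustrated with a small sketch. The abstract does not say how LOTUS aggregates its criteria, so the weighted average below, the criterion names, the model names, and all scores are assumptions made up for illustration.

```python
from typing import Dict

def preference_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores (assumed in [0, 1], higher is better)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[c] * scores[c] for c in weights) / total

# Hypothetical scores for two hypothetical models (not real leaderboard data).
models = {
    "model_a": {"alignment": 0.80, "descriptiveness": 0.90,
                "hallucination_free": 0.60, "gender_fairness": 0.70},
    "model_b": {"alignment": 0.85, "descriptiveness": 0.60,
                "hallucination_free": 0.90, "gender_fairness": 0.90},
}

# Two hypothetical user profiles expressed as criterion weights.
detail_first = {"alignment": 1.0, "descriptiveness": 3.0,
                "hallucination_free": 0.5, "gender_fairness": 0.5}
risk_averse = {"alignment": 1.0, "descriptiveness": 0.5,
               "hallucination_free": 3.0, "gender_fairness": 2.0}

def best_model(weights: Dict[str, float]) -> str:
    """Return the model name with the highest preference-weighted score."""
    return max(models, key=lambda name: preference_score(models[name], weights))

print(best_model(detail_first), best_model(risk_averse))
```

With these made-up numbers the two profiles select different models, which is the kind of outcome the abstract describes; a real leaderboard may normalize or combine criteria differently.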

caption, machine learning, natural language, (19 more...)

doi: 10.18653/v1/2025.acl-industry.22

2507.19362

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.82)

Geopolitical Parallax: Beyond Walter Lippmann Just After Large Language Models

Yavuz, Mehmet Can, Kabir, Humza Gohar, Özkan, Aylin

arXiv.org Artificial Intelligence, Aug-28-2025

Objectivity in journalism has long been contested, oscillating between ideals of neutral, fact-based reporting and the inevitability of subjective framing. With the advent of large language models (LLMs), these tensions are now mediated by algorithmic systems whose training data and design choices may themselves embed cultural or ideological biases. This study investigates geopolitical parallax (systematic divergence in news quality and subjectivity assessments) by comparing article-level embeddings from Chinese-origin (Qwen, BGE, Jina) and Western-origin (Snowflake, Granite) model families. We evaluate both on a human-annotated news quality benchmark spanning fifteen stylistic, informational, and affective dimensions, and on parallel corpora covering politically sensitive topics, including Palestine and reciprocal China-United States coverage. Using logistic regression probes and matched-topic evaluation, we quantify per-metric differences in predicted positive-class probabilities between model families. Our findings reveal consistent, non-random divergences aligned with model origin. In Palestine-related coverage, Western models assign higher subjectivity and positive emotion scores, while Chinese models emphasize novelty and descriptiveness. Cross-topic analysis shows asymmetries in structural quality metrics, with Chinese-on-US scoring notably lower in fluency, conciseness, technicality, and overall quality, contrasted by higher negative emotion scores. These patterns align with media bias theory and our distinction between semantic, emotional, and relational subjectivity, and extend LLM bias literature by showing that geopolitical framing effects persist in downstream quality assessment tasks. We conclude that LLM-based media evaluation pipelines require cultural calibration to avoid conflating content differences with model-induced bias.
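
The probe-based comparison can be sketched roughly as follows. This assumes each model family already has a fitted logistic regression probe per metric (the weights here are placeholders, not learned values) and that articles are matched one-to-one across families; the function names and numbers are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence

def probe_probability(embedding: Sequence[float],
                      weights: Sequence[float], bias: float) -> float:
    """Positive-class probability from a logistic regression probe."""
    z = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def per_metric_gap(probs_a: Dict[str, List[float]],
                   probs_b: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean paired difference (family A minus family B) per metric.

    Each list holds probabilities for the same matched articles, in the same order.
    """
    return {
        metric: mean(a - b for a, b in zip(probs_a[metric], probs_b[metric]))
        for metric in probs_a
    }

# Placeholder probabilities for two matched articles (illustrative only).
family_a = {"subjectivity": [0.9, 0.7]}
family_b = {"subjectivity": [0.6, 0.6]}
print(per_metric_gap(family_a, family_b))
```

A positive gap means family A's probe assigns a higher positive-class probability on average for that metric; the study's actual statistical treatment is not described in the abstract and may differ.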

dimension, large language model, machine learning, (21 more...)

2508.19492

Country:

North America > United States (1.00)
Asia > Middle East > Palestine (0.47)

Genre: Research Report > New Finding (1.00)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions

Yanuka, Moran, Kish, Assaf Ben, Bitton, Yonatan, Szpektor, Idan, Giryes, Raja

arXiv.org Artificial Intelligence, Nov-13-2024

Recent research increasingly focuses on training vision-language models (VLMs) with long, detailed image captions. However, small-scale VLMs often struggle to balance the richness of these captions with the risk of hallucinating content during fine-tuning. In this paper, we explore how well VLMs adapt to such captions. To quantify caption quality, we propose Decomposed NLI (DNLI), an evaluation framework that breaks down generated captions into individual propositions, assessing each in isolation. This fine-grained analysis reveals a critical balance between capturing descriptive details and preventing hallucinations. Our findings show that simply reducing caption complexity or employing standard data curation techniques does not effectively resolve this issue. To tackle this challenge, we introduce Knowledge Adapted (KnowAda) fine-tuning, a data-centric approach that automatically adapts training data with the model's existing knowledge and visual understanding. KnowAda minimizes hallucinations while preserving high descriptiveness. We validate this approach across several small-scale VLMs (up to 7B parameters) and dense caption datasets, demonstrating that KnowAda effectively balances hallucination reduction and descriptiveness. Our results show that KnowAda outperforms various baselines in both automatic metrics and human evaluations. We will release our code and models.
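
The decompose-then-judge idea behind DNLI can be sketched as below. This is not the paper's implementation: the sentence-level splitter, the lookup-table judge, and the two reported ratios are stand-ins assumed for illustration, where a real system would use a proposition extractor and an NLI model.

```python
from typing import Callable, Dict, List

def split_into_propositions(caption: str) -> List[str]:
    """Toy stand-in: treat each sentence as one proposition."""
    return [s.strip() for s in caption.split(".") if s.strip()]

def decomposed_scores(caption: str, judge: Callable[[str], str]) -> Dict[str, float]:
    """Judge each proposition in isolation and report label fractions."""
    labels = [judge(p) for p in split_into_propositions(caption)]
    n = len(labels)
    if n == 0:
        return {"entailed": 0.0, "contradicted": 0.0}
    return {
        "entailed": labels.count("entailment") / n,
        "contradicted": labels.count("contradiction") / n,
    }

# Hypothetical judgments standing in for an NLI model's output.
TOY_LABELS = {
    "A dog sits on grass": "entailment",
    "The dog is red": "contradiction",
}

def toy_judge(proposition: str) -> str:
    """Look up a label; anything unknown is treated as neutral."""
    return TOY_LABELS.get(proposition, "neutral")

caption = "A dog sits on grass. The dog is red. A ball is nearby."
print(decomposed_scores(caption, toy_judge))
```

Scoring propositions separately makes the trade-off visible: adding detail raises the number of propositions, and each extra one is another chance to be contradicted.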

caption, information, proposition, (15 more...)

2411.09018

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Interpreting Inflammation Prediction Model via Tag-based Cohort Explanation

Meng, Fanyu, Larke, Jules, Liu, Xin, Kong, Zhaodan, Chen, Xin, Lemay, Danielle, Tagkopoulos, Ilias

arXiv.org Artificial Intelligence, Oct-17-2024

One significant application is in nutrition science, where ML models can provide dietary recommendations, detect food quality and safety issues during production, and surveil public health and epidemiology. However, the complex and often opaque nature of these models presents challenges in understanding and trusting their predictions. To address these issues, explainability techniques have garnered considerable interest, aiming to make ML models more interpretable and transparent. Explainability can be approached from different perspectives, including local explanations that focus on individual predictions and global explanations that provide insights into the overall behavior of the model. However, there is a growing need for intermediate-level explanations that balance these two extremes, offering contextually relevant insights that are both comprehensive and specific (Sokol and Flach, 2020; Arrieta et al., 2020; Adadi and Berrada, 2018). Cohort explainability, also referred to as subgroup explainability, explains model predictions by analyzing groups of instances with shared characteristics and emerges as a promising solution to this challenge.
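
As a rough illustration of cohort-level explanation, the sketch below averages per-instance feature attributions within each cohort. How the paper forms cohorts and computes attributions is not stated in this excerpt; the cohort labels, feature names, and values are invented for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List

def cohort_explanations(attributions: List[Dict[str, float]],
                        tags: List[str]) -> Dict[str, Dict[str, float]]:
    """Average local attributions over all instances sharing a cohort tag."""
    grouped = defaultdict(list)
    for attribution, tag in zip(attributions, tags):
        grouped[tag].append(attribution)
    return {
        tag: {feature: mean(row[feature] for row in rows) for feature in rows[0]}
        for tag, rows in grouped.items()
    }

# Hypothetical local attributions for three instances and two cohorts.
attributions = [
    {"fiber": -0.5, "sugar": 0.25},
    {"fiber": -0.25, "sugar": 0.75},
    {"fiber": 0.0, "sugar": 0.5},
]
tags = ["cohort_a", "cohort_a", "cohort_b"]
print(cohort_explanations(attributions, tags))
```

The result sits between the two extremes named above: it is coarser than one explanation per prediction but more specific than a single global summary.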

data mining, explanation, machine learning, (18 more...)

2410.14082

Country:

North America > United States > California > Yolo County > Davis (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Consumer Health (1.00)
Education > Health & Safety > School Nutrition (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation

Yue, Zihao, Hu, Anwen, Zhang, Liang, Jin, Qin

arXiv.org Artificial Intelligence, Oct-28-2023

Image captioning aims to describe visual content in natural language. As 'a picture is worth a thousand words', there could be various correct descriptions for an image. However, with maximum likelihood estimation as the training objective, the captioning model is penalized whenever its prediction mismatches with the label. For instance, when the model predicts a word expressing richer semantics than the label, it will be penalized and optimized to prefer more concise expressions, referred to as conciseness optimization. In contrast, predictions that are more concise than labels lead to richness optimization. Such conflicting optimization directions could eventually result in the model generating general descriptions. In this work, we introduce Semipermeable MaxImum Likelihood Estimation (SMILE), which allows richness optimization while blocking conciseness optimization, thus encouraging the model to generate longer captions with more details. Extensive experiments on two mainstream image captioning datasets MSCOCO and Flickr30K demonstrate that SMILE significantly enhances the descriptiveness of generated captions. We further provide in-depth investigations to facilitate a better understanding of how SMILE works.
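
One way to obtain such an asymmetry, assumed here rather than taken from the abstract, is to normalize the softmax only over words that occur in the label caption. The toy vocabulary and logits below are invented, and the sketch should not be read as the authors' exact objective.

```python
import math
from typing import Optional, Sequence, Set

def cross_entropy(logits: Sequence[float], vocab: Sequence[str], target: str,
                  allowed: Optional[Set[str]] = None) -> float:
    """Negative log-probability of `target`, normalizing over `allowed` words only.

    With allowed=None this is ordinary full-vocabulary cross-entropy.
    """
    idx = [i for i, w in enumerate(vocab) if allowed is None or w in allowed]
    t = vocab.index(target)
    if t not in idx:
        raise ValueError("target must be among the allowed words")
    m = max(logits[i] for i in idx)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(logits[i] - m) for i in idx))
    return log_z - logits[t]

vocab = ["a", "dog", "brown", "runs"]
label_words = {"a", "dog", "runs"}       # words appearing in the label caption
logits_richer = [0.0, 1.0, 3.0, 0.0]     # model favors "brown", a word not in the label
logits_concise = [0.0, 1.0, 0.0, 3.0]    # model favors "runs", skipping ahead in the label

full_richer = cross_entropy(logits_richer, vocab, "dog")
subset_richer = cross_entropy(logits_richer, vocab, "dog", label_words)
subset_concise = cross_entropy(logits_concise, vocab, "dog", label_words)
print(full_richer, subset_richer, subset_concise)
```

In this toy case the restricted loss barely penalizes the richer out-of-label word, while the skip-ahead prediction is still penalized strongly.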

caption, descriptiveness, optimization, (13 more...)

2306.1346

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(17 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

From Probability to Consilience: How Explanatory Values Implement Bayesian Reasoning

Wojtowicz, Zachary, DeDeo, Simon

arXiv.org Artificial Intelligence, Jun-3-2020

Recent work in cognitive science has uncovered a diversity of explanatory values, or dimensions along which we judge explanations as better or worse. We propose a Bayesian account of how these values fit together to guide explanation. The resulting taxonomy provides a set of predictors for which explanations people prefer and shows how core values from psychology, statistics, and the philosophy of science emerge from a common mathematical framework. In addition to operationalizing the explanatory virtues associated with, for example, scientific argument-making, this framework also enables us to reinterpret the explanatory vices that drive conspiracy theories, delusions, and extremist ideologies. Intuitively, philosophically, and as seen in laboratory experiments, explanations are judged as better or worse on the basis of many different criteria. These explanatory values appear in early childhood [1, 2, 3, 4, 5] and their influence extends to some of the most sophisticated social knowledge formation processes we know [6]. We lack, however, an understanding of the origin of these values or an account of how they fit together to guide belief formation. The multiplicity of values also appears to conflict with Bayesian models of cognition, which speak solely in terms of degrees of beliefs and suggest we judge explanations as better or worse on the basis of a single quantity, the posterior likelihood (see Glossary). In this opinion, we show how to resolve these conflicts by arguing that previously-identified explanatory values capture different components of a full Bayesian calculation and, when considered together and weighed appropriately, implement Bayesian cognition. 
First, this framework shows how key explanatory values identified by laboratory experiments and philosophers of science (co-explanation, descriptiveness, precision, unification, power, and simplicity) emerge naturally from the mathematical structure of probabilistic inference, thereby reconciling them with Bayesian models of cognition [7, 8]. Second, it shows how these values combine to produce preferences for one explanation over another.
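
For orientation, the single quantity a Bayesian model assigns to an explanation splits into separate terms once written in log form. The block below is standard Bayes' rule for an explanation h and data d; the second line assumes the observations d_1, ..., d_n are conditionally independent given h. Which explanatory value corresponds to which term is not specified in this excerpt, so no mapping is asserted here.

```latex
\begin{align}
\log P(h \mid d) &= \log P(h) + \log P(d \mid h) - \log P(d) \\
\log P(d \mid h) &= \sum_{i=1}^{n} \log P(d_i \mid h)
  \quad \text{(assuming conditional independence given } h\text{)}
\end{align}
```

When two explanations of the same data are compared, the evidence term \log P(d) cancels, so the preference depends only on the prior and likelihood terms.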

artificial intelligence, explanation, machine learning, (16 more...)

2006.02359

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Improving latent variable descriptiveness with AutoGen

Mansbridge, Alex, Fierimonte, Roberto, Feige, Ilya, Barber, David

arXiv.org Machine Learning, Jun-12-2018

Powerful generative models, particularly in Natural Language Modelling, are commonly trained by maximizing a variational lower bound on the data log likelihood. These models often suffer from poor use of their latent variable, with ad-hoc annealing factors used to encourage retention of information in the latent variable. We discuss an alternative and general approach to latent variable modelling, based on an objective that combines the data log likelihood as well as the likelihood of a perfect reconstruction through an autoencoder. Tying these together ensures by design that the latent variable captures information about the observations, whilst retaining the ability to generate well. Interestingly, though this approach is a priori unrelated to VAEs, the lower bound attained is identical to the standard VAE bound but with the addition of a simple pre-factor, thus providing a formal interpretation of the commonly used, ad-hoc pre-factors in training VAEs.
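
To make the role of a pre-factor concrete, here is a minimal sketch of a VAE bound with a weight on the KL term, which is the commonly used annealing form. The abstract does not say where AutoGen's pre-factor sits, so the `kl_weight` placement is an assumption, as are the diagonal Gaussian posterior and standard normal prior.

```python
import math
from typing import Sequence

def gaussian_kl(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def weighted_bound(recon_log_lik: float, mu: Sequence[float],
                   log_var: Sequence[float], kl_weight: float = 1.0) -> float:
    """Reconstruction log-likelihood minus a weighted KL term.

    kl_weight = 1.0 gives the standard VAE lower bound; other values
    correspond to the ad-hoc pre-factors used when annealing the KL term.
    """
    return recon_log_lik - kl_weight * gaussian_kl(mu, log_var)

# Illustrative numbers: one latent dimension with mean 1 and unit variance.
print(weighted_bound(-10.0, [1.0], [0.0], kl_weight=1.0))
print(weighted_bound(-10.0, [1.0], [0.0], kl_weight=0.5))
```

Lowering the KL weight reduces the penalty for encoding information in the latent variable, which is why such factors are used to discourage the model from ignoring it.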

artificial intelligence, machine learning, natural language, (18 more...)

1806.0448

Country: Asia > Afghanistan > Herat Province > Herat (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SentiCap: Generating Image Descriptions with Sentiments

Mathews, Alexander Patrick (Australian National University) | Xie, Lexing (Australian National University and National ICT Australia) | He, Xuming (Australian National University and National ICT Australia)

AAAI Conferences, Apr-19-2016

Recent progress in image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is description with emotions, which is commonplace in everyday communication and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.
