AITopics | semantic distribution

Collaborating Authors

semantic distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Label Forensics: Interpreting Hard Labels in Black-Box Text Classifier

Du, Mengyao, Yang, Gang, Fang, Han, Yin, Quanjun, Chang, Ee-chien

arXiv.org Artificial IntelligenceDec-2-2025

The widespread adoption of natural language processing techniques has led to an unprecedented growth of text classifiers across the modern web. Yet many of these models circulate with their internal semantics undocumented or even intentionally withheld. Such opaque classifiers, which may expose only hard-label outputs, can operate in unregulated web environments or be repurposed for unknown intents, raising legitimate forensic and auditing concerns. In this paper, we position ourselves as investigators and work to infer the semantic concept each label encodes in an undocumented black-box classifier. Specifically, we introduce label forensics, a black-box framework that reconstructs a label's semantic meaning. Concretely, we represent a label by a sentence embedding distribution from which any sample reliably reflects the concept the classifier has implicitly learned for that label. We believe this distribution should maintain two key properties: precise, with samples consistently classified into the target label, and general, covering the label's broad semantic space. To realize this, we design a semantic neighborhood sampler and an iterative optimization procedure to select representative seed sentences that jointly maximize label consistency and distributional coverage. The final output, an optimized seed sentence set combined with the sampler, constitutes the empirical distribution representing the label's semantics. Experiments on multiple black-box classifiers achieve an average label consistency of around 92.24 percent, demonstrating that the embedding regions accurately capture each classifier's label semantics. We further validate our framework on an undocumented HuggingFace classifier, enabling fine-grained label interpretation and supporting responsible AI auditing.

classifier, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2512.01514

Country:

Asia (0.94)
North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

On the Utility of Foundation Models for Fast MRI: Vision-Language-Guided Image Reconstruction

Feng, Ruimin, He, Xingxin, Mercer, Ronald, Stewart, Zachary, Liu, Fang

arXiv.org Artificial IntelligenceNov-26-2025

Purpose: To investigate whether a vision-language foundation model can enhance undersampled MRI reconstruction by providing high-level contextual information beyond conventional priors. Methods: We proposed a semantic distribution-guided reconstruction framework that uses a pre-trained vision-language foundation model to encode both the reconstructed image and auxiliary information into high-level semantic features. A contrastive objective aligns the reconstructed representation with the target semantic distribution, ensuring consistency with high-level perceptual cues. The proposed objective works with various deep learning-based reconstruction methods and can flexibly incorporate semantic priors from multimodal sources. To test the effectiveness of these semantic priors, we evaluated reconstruction results guided by priors derived from either image-only or image-language auxiliary information. Results: Experiments on knee and brain datasets demonstrate that semantic priors from images preserve fine anatomical structures and achieve superior perceptual quality, as reflected in lower LPIPS values, higher Tenengrad scores, and improved scores in the reader study, compared with conventional regularization. The image-language information further expands the semantic distribution and enables high-level control over reconstruction attributes. Across all evaluations, the contrastive objective consistently guided the reconstructed features toward the desired semantic distributions while maintaining data fidelity, demonstrating the effectiveness of the proposed optimization framework. Conclusion: The study highlights that vision-language foundation models can improve undersampled MRI reconstruction through semantic-space optimization.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.19641

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Distilling LLM Prior to Flow Model for Generalizable Agent's Imagination in Object Goal Navigation

Li, Badi, Lu, Ren-jie, Zhou, Yu, Meng, Jingke, Zheng, Wei-shi

arXiv.org Artificial IntelligenceOct-22-2025

The Object Goal Navigation (ObjectNav) task challenges agents to locate a specified object in an unseen environment by imagining unobserved regions of the scene. Prior approaches rely on deterministic and discriminative models to complete semantic maps, overlooking the inherent uncertainty in indoor layouts and limiting their ability to generalize to unseen environments. In this work, we propose GOAL, a generative flow-based framework that models the semantic distribution of indoor environments by bridging observed regions with LLM-enriched full-scene semantic maps. During training, spatial priors inferred from large language models (LLMs) are encoded as two-dimensional Gaussian fields and injected into target maps, distilling rich contextual knowledge into the flow model and enabling more generalizable completions. Extensive experiments demonstrate that GOAL achieves state-of-the-art performance on MP3D and Gibson, and shows strong generalization in transfer settings to HM3D. Codes and pretrained models are available at https://github.com/Badi-Li/GOAL.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.09423

Country:

Europe (1.00)
Asia > China (0.67)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Gaussian Semantic Field for One-shot LiDAR Global Localization

Yin, Pengyu, Yuan, Shenghai, Cao, Haozhi, Ji, Xingyu, Bai, Ruofei, Chen, Siyu, Xie, Lihua

arXiv.org Artificial IntelligenceOct-15-2025

Abstract-- We present a one-shot LiDAR global localization algorithm featuring semantic disambiguation ability based on a lightweight tri-layered scene graph. While landmark semantic registration-based methods have shown promising performance improvements in global localization compared with geometric-only methods, landmarks can be repetitive and misleading for correspondence establishment. We propose to mitigate this problem by modeling semantic distributions with continuous functions learned from a population of Gaussian processes. Compared with discrete semantic labels, the continuous functions capture finer-grained geo-semantic information and also provide more detailed metric information for correspondence establishment. We insert this continuous function as the middle layer between the object layer and the metric-semantic layer, forming a tri-layered 3D scene graph, serving as a light-weight yet performant backend for one-shot localization. We term our global localization pipeline Outram-GSF (Gaussian semantic field) and conduct a wide range of experiments on publicly available data sets, validating the superior performance against the current state-of-the-art.

localization, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.12101

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation

Li, Mingzhe, Lin, XieXiong, Chen, Xiuying, Chang, Jinxiong, Zhang, Qishen, Wang, Feng, Wang, Taifeng, Liu, Zhongyi, Chu, Wei, Zhao, Dongyan, Yan, Rui

arXiv.org Artificial IntelligenceMar-11-2025

Contrastive learning has achieved impressive success in generation tasks to militate the "exposure bias" problem and discriminatively exploit the different quality of references. Existing works mostly focus on contrastive learning on the instance-level without discriminating the contribution of each word, while keywords are the gist of the text and dominant the constrained mapping relationships. Hence, in this work, we propose a hierarchical contrastive learning mechanism, which can unify hybrid granularities semantic meaning in the input text. Concretely, we first propose a keyword graph via contrastive correlations of positive-negative pairs to iteratively polish the keyword representations. Then, we construct intra-contrasts within instance-level and keyword-level, where we assume words are sampled nodes from a sentence distribution. Finally, to bridge the gap between independent contrast levels and tackle the common contrast vanishing problem, we propose an inter-contrast mechanism that measures the discrepancy between contrastive keyword nodes respectively to the instance distribution. Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.

contrastive learning, keyword, representation, (16 more...)

arXiv.org Artificial Intelligence

2205.13346

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Austria (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Exploring Semantic Clustering in Deep Reinforcement Learning for Video Games

Zhang, Liang, Lieffers, Justin, Pyarelal, Adarsh

arXiv.org Artificial IntelligenceSep-30-2024

In this paper, we investigate the semantic clustering properties of deep reinforcement learning (DRL) for video games, enriching our understanding of the internal dynamics of DRL and advancing its interpretability. In this context, semantic clustering refers to the inherent capacity of neural networks to internally group video inputs based on semantic similarity. To achieve this, we propose a novel DRL architecture that integrates a semantic clustering module featuring both feature dimensionality reduction and online clustering. This module seamlessly integrates into the DRL training pipeline, addressing instability issues observed in previous t-SNE-based analysis methods and eliminating the necessity for extensive manual annotation of semantic analysis. Through experiments, we validate the effectiveness of the proposed module and the semantic clustering properties in DRL for video games. Additionally, based on these properties, we introduce new analytical methods to help understand the hierarchical structure of policies and the semantic distribution within the feature space.

agent, fdr space, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2409.17411

Country:

North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(14 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback

Gentle-CLIP: Exploring Aligned Semantic In Low-Quality Multimodal Data With Soft Alignment

Song, Zijia, Zang, Zelin, Wang, Yelin, Yang, Guozheng, Zheng, Jiangbin, yu, Kaicheng, Chen, Wanyu, Li, Stan Z.

arXiv.org Artificial IntelligenceJun-9-2024

Multimodal fusion breaks through the barriers between diverse modalities and has already yielded numerous impressive performances. However, in various specialized fields, it is struggling to obtain sufficient alignment data for the training process, which seriously limits the use of previously elegant models. Thus, semi-supervised learning attempts to achieve multimodal alignment with fewer matched pairs but traditional methods like pseudo-labeling are difficult to apply in domains with no label information. To address these problems, we transform semi-supervised multimodal alignment into a manifold matching problem and propose a new method based on CLIP, named Gentle-CLIP. Specifically, we design a novel semantic density distribution loss to explore implicit semantic alignment information from unpaired multimodal data by constraining the latent representation distribution with fine granularity, thus eliminating the need for numerous strictly matched pairs. Meanwhile, we introduce multi-kernel maximum mean discrepancy as well as self-supervised contrastive loss to pull separate modality distributions closer and enhance the stability of the representation distribution. In addition, the contrastive loss used in CLIP is employed on the supervised matched data to prevent negative optimization. Extensive experiments conducted on a range of tasks in various fields, including protein, remote sensing, and the general vision-language field, demonstrate the effectiveness of our proposed Gentle-CLIP.

arxiv preprint arxiv, learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2406.05766

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Merced County > Merced (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.67)

Add feedback

Semantic Mechanical Search with Large Vision and Language Models

Sharma, Satvik, Huang, Huang, Shivakumar, Kaushik, Chen, Lawrence Yunliang, Hoque, Ryan, Ichter, Brian, Goldberg, Ken

arXiv.org Artificial IntelligenceOct-30-2023

Moving objects to find a fully-occluded target object, known as mechanical search, is a challenging problem in robotics. As objects are often organized semantically, we conjecture that semantic information about object relationships can facilitate mechanical search and reduce search time. Large pretrained vision and language models (VLMs and LLMs) have shown promise in generalizing to uncommon objects and previously unseen real-world environments. In this work, we propose a novel framework called Semantic Mechanical Search (SMS). SMS conducts scene understanding and generates a semantic occupancy distribution explicitly using LLMs. Compared to methods that rely on visual similarities offered by CLIP embeddings, SMS leverages the deep reasoning capabilities of LLMs. Unlike prior work that uses VLMs and LLMs as end-to-end planners, which may not integrate well with specialized geometric planners, SMS can serve as a plug-in semantic module for downstream manipulation or navigation policies. For mechanical search in closed-world settings such as shelves, we compare with a geometric-based planner and show that SMS improves mechanical search performance by 24% across the pharmacy, kitchen, and office domains in simulation and 47.1% in physical experiments. For open-world real environments, SMS can produce better semantic distributions compared to CLIP-based methods, with the potential to be integrated with downstream navigation policies to improve object navigation tasks. Code, data, videos, and the appendix are available: https://sites.google.com/view/semantic-mechanical-search

arxiv preprint arxiv, international conference, semantic distribution, (14 more...)

arXiv.org Artificial Intelligence

2302.12915

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Republic of Türkiye > Aksaray Province > Aksaray (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (1.00)

Industry:

Consumer Products & Services > Personal Products > Beauty Care Products (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Meticulously Selecting 1% of the Dataset for Pre-training! Generating Differentially Private Images Data with Semantics Query

Li, Kecen, Gong, Chen, Li, Zhixiang, Zhao, Yuzhong, Hou, Xinwen, Wang, Tianhao

arXiv.org Artificial IntelligenceOct-19-2023

Differential Privacy (DP) image data synthesis, which leverages the DP technique to generate synthetic data to replace the sensitive data, allowing organizations to share and utilize synthetic images without privacy concerns. Previous methods incorporate the advanced techniques of generative models and pre-training on a public dataset to produce exceptional DP image data, but suffer from problems of unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which meticulously selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. Then, this function assists in querying the semantic distribution of the sensitive dataset, facilitating the selection of data from the public dataset with analogous semantics for pre-training. Finally, we pre-train an image generative model using the selected data and then fine-tune this model on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, whereas achieves superior synthetic performance and conserves more computational resources. On average, PRIVIMAGE achieves 30.1% lower FID and 12.6% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.

dataset, public dataset, sensitive dataset, (14 more...)

arXiv.org Artificial Intelligence

2311.1285

Country:

North America > United States > Virginia (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Communications > Web > Semantic Web (0.71)

Add feedback

Measuring and Improving Semantic Diversity of Dialogue Generation

Han, Seungju, Kim, Beomsu, Chang, Buru

arXiv.org Artificial IntelligenceOct-22-2022

Response diversity has become an important criterion for evaluating the quality of open-domain dialogue generation models. However, current evaluation metrics for response diversity often fail to capture the semantic diversity of generated responses, as they mainly consider lexical aspects of the generated responses. In this paper, we introduce a new automatic evaluation metric to measure the semantic diversity of generated responses. Through human evaluation, we demonstrate that our proposed metric captures human judgments on response diversity better than existing lexical-level diversity metrics. Furthermore, motivated by analyzing an existing dialogue dataset, we propose a simple yet effective learning method that improves the semantic diversity of generated responses. Our learning method weights training samples based on the semantic distribution of the training set. We show that our learning method improves response diversity and coherency better than other baseline methods through automatic and human evaluation.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2210.05725

Country:

North America > United States (0.04)
Asia > China (0.04)
Oceania > New Zealand (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback