AITopics | semantic variable

On the Value of Cross-Modal Misalignment in Multimodal Representation Learning

Cai, Yichao, Liu, Yuhang, Gao, Erdun, Jiang, Tianjiao, Zhang, Zhen, Hengel, Anton van den, Shi, Javen Qinfeng

arXiv.org Artificial Intelligence, Sep-29-2025

Multimodal representation learning, exemplified by multimodal contrastive learning (MMCL) using image-text pairs, aims to learn powerful representations by aligning cues across modalities. This approach relies on the core assumption that the exemplar image-text pairs constitute two representations of an identical concept. However, recent research has revealed that real-world datasets often exhibit cross-modal misalignment. There are two distinct viewpoints on how to address this issue: one suggests mitigating the misalignment, and the other leveraging it. We seek here to reconcile these seemingly opposing perspectives, and to provide a practical guide for practitioners. Using latent variable models we thus formalize cross-modal misalignment by introducing two specific mechanisms: Selection bias, where some semantic variables are absent in the text, and perturbation bias, where semantic variables are altered -- both leading to misalignment in data pairs. Our theoretical analysis demonstrates that, under mild assumptions, the representations learned by MMCL capture exactly the information related to the subset of the semantic variables invariant to selection and perturbation biases. This provides a unified perspective for understanding misalignment. Based on this, we further offer actionable insights into how misalignment should inform the design of real-world ML systems. We validate our theoretical findings via extensive empirical studies on both synthetic data and real image-text datasets, shedding light on the nuanced impact of cross-modal misalignment on multimodal representation learning.
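The two mechanisms named in the abstract can be pictured with a toy example. This is only an illustrative sketch, not the authors' latent variable model: the names `make_pair` and `invariant_subset`, the use of NaN to mark absent variables, and the additive Gaussian perturbation are all assumptions made for the illustration.

```python
import numpy as np

def make_pair(s, selected, perturbed, rng):
    """Build one (image, text) pair of latent semantic vectors from s.

    selected:  indices of semantic variables the text keeps; selection bias
               drops the rest (marked here as NaN, an illustrative choice).
    perturbed: indices the text alters (perturbation bias).
    """
    image = s.copy()                      # image side sees every variable
    text = s.copy()
    for i in perturbed:
        text[i] = s[i] + rng.normal()     # altered value in the text view
    keep = np.zeros(len(s), dtype=bool)
    keep[list(selected)] = True
    text[~keep] = np.nan                  # variables absent from the text
    return image, text

def invariant_subset(selected, perturbed):
    """Indices untouched by both selection and perturbation."""
    return sorted(set(selected) - set(perturbed))

rng = np.random.default_rng(0)
s = rng.normal(size=5)                    # five hypothetical semantic variables
image, text = make_pair(s, selected={0, 1, 2, 3}, perturbed={2, 3}, rng=rng)
shared = invariant_subset({0, 1, 2, 3}, {2, 3})
```

In this toy setting only the variables indexed by `shared` agree across the two views, which corresponds to the subset the abstract says the learned representations capture.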

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:2504.10143

Country:

Oceania > Australia (0.28)
Europe > Switzerland (0.27)

Genre: Research Report > New Finding (0.92)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Lin, Chaofan, Han, Zhenhua, Zhang, Chengruidong, Yang, Yuqing, Yang, Fan, Chen, Chen, Qiu, Lili

arXiv.org Artificial Intelligence, May-30-2024

The rise of large language models (LLMs) has enabled LLM-based applications (a.k.a. AI agents or co-pilots), a new software paradigm that combines the strength of LLM and conventional software. Diverse LLM applications from different tenants could design complex workflows using multiple LLM requests to accomplish one task. However, they have to use the over-simplified request-level API provided by today's public LLM services, losing essential application-level information. Public LLM services have to blindly optimize individual LLM requests, leading to sub-optimal end-to-end performance of LLM applications. This paper introduces Parrot, an LLM service system that focuses on the end-to-end experience of LLM-based applications. Parrot proposes Semantic Variable, a unified abstraction to expose application-level knowledge to public LLM services. A Semantic Variable annotates an input/output variable in the prompt of a request, and creates the data pipeline when connecting multiple LLM requests, providing a natural way to program LLM applications. Exposing Semantic Variables to the public LLM service allows it to perform conventional data flow analysis to uncover the correlation across multiple LLM requests. This correlation opens a brand-new optimization space for the end-to-end performance of LLM-based applications. Extensive evaluations demonstrate that Parrot can achieve up to an order-of-magnitude improvement for popular and practical use cases of LLM applications.
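The idea of annotating prompt variables so that a service can run data flow analysis across requests can be sketched as follows. This is a minimal illustration and not Parrot's actual API: the `Request` class, the `{{input:name}}` / `{{output:name}}` placeholder syntax, and the helper functions are hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass
from graphlib import TopologicalSorter

@dataclass
class Request:
    """One LLM request whose prompt marks its variables (hypothetical syntax)."""
    name: str
    prompt: str

    def inputs(self):
        return set(re.findall(r"\{\{input:(\w+)\}\}", self.prompt))

    def outputs(self):
        return set(re.findall(r"\{\{output:(\w+)\}\}", self.prompt))

def dependency_graph(requests):
    """Map each request to the requests that produce its input variables."""
    producer = {v: r.name for r in requests for v in r.outputs()}
    return {
        r.name: {producer[v] for v in r.inputs() if v in producer}
        for r in requests
    }

def execution_order(requests):
    """Order requests so that producers run before their consumers."""
    return list(TopologicalSorter(dependency_graph(requests)).static_order())

requests = [
    Request("write_tests", "Write tests for {{input:code}}: {{output:tests}}"),
    Request("write_code", "Implement {{input:task}}: {{output:code}}"),
    Request("review", "Review {{input:code}} with {{input:tests}}: {{output:review}}"),
]
graph = dependency_graph(requests)
order = execution_order(requests)
```

Here `task` has no producer, so it is treated as an external input, while `code` and `tests` link the three requests into a pipeline that a service could inspect as a whole instead of seeing three unrelated calls.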

application, llm request, parrot, (15 more...)

arXiv:2405.19888

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Georgia > Chatham County > Savannah (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Semantic Communication for Cooperative Multi-Task Processing over Wireless Networks

Razlighi, Ahmad Halimi, Bockelmann, Carsten, Dekorsy, Armin

arXiv.org Artificial Intelligence, May-22-2024

Applications involving machine-to-machine or human-to-machine communications often have to prioritize task execution over the exact reconstruction of transmitted information at the receiver. Unlike the traditional information theory established by Shannon, which emphasizes the accurate transmission and reception of bits, the design of communication systems for these applications takes a distinct approach, drawing attention to task performance rather than fidelity in ...

The same authors in [9] studied distributed relevant information encoding for collaborative feature extraction to fulfill a task, leveraging distributed IB. Moreover, [10] offered a framework for collaborative retrieval of the message using multiple received semantic information and also expanded it using reinforcement learning in [11]. To consider some physical layer communication aspects, [12] contributed to resource allocation in a multi-user system according to the single task accuracy, channel conditions, and computing requests.

communication, information, semantic communication, (14 more...)

arXiv:2404.08483

Country: Europe > Germany > Bremen > Bremen (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Semantic Latent Decomposition with Normalizing Flows for Face Editing

Li, Binglei, Huang, Zhizhong, Shan, Hongming, Zhang, Junping

arXiv.org Artificial Intelligence, Sep-11-2023

Navigating in the latent space of StyleGAN has shown effectiveness for face editing. However, the resulting methods usually encounter challenges in complicated navigation due to the entanglement among different attributes in the latent space. To address this issue, this paper proposes a novel framework, termed SDFlow, with a semantic decomposition in original latent space using continuous conditional normalizing flows. Specifically, SDFlow decomposes the original latent code into different irrelevant variables by jointly optimizing two components: (i) a semantic encoder to estimate semantic variables from input faces and (ii) a flow-based transformation module to map the latent code into a semantic-irrelevant variable in Gaussian distribution, conditioned on the learned semantic variables. To eliminate the entanglement between variables, we employ a disentangled learning strategy under a mutual information framework, thereby providing precise manipulation controls. Experimental results demonstrate that SDFlow outperforms existing state-of-the-art face editing methods both qualitatively and quantitatively. The source code is made available at https://github.com/phil329/SDFlow.

editing, sdflow, semantic variable, (15 more...)

arXiv:2309.05314

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
