AITopics | unified representation

Collaborating Authors

unified representation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

c89f09849eb5af489abb122394ff0f0b-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 01:43:23 GMT

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling

Neural Information Processing SystemsFeb-16-2026, 09:57:59 GMT

Large-scale pre-training has shown great potential to enhance models on downstream tasks in vision and language. Developing similar techniques for scalp electroencephalogram (EEG) is suitable since unlabelled data is plentiful.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

6a4d5952d4c018a1c1af9fa590a10dda-Paper.pdf

Neural Information Processing SystemsFeb-12-2026, 11:46:05 GMT

mapping, modality, representation, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Singapore (0.04)
Asia > China > Guangdong Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

Neural Information Processing SystemsDec-24-2025, 00:27:32 GMT

Vertical federated learning (VFL) is a privacy-preserving machine learning paradigm that can learn models from features distributed on different platforms in a privacy-preserving way. Since in real-world applications the data may contain bias on fairness-sensitive features (e.g., gender), VFL models may inherit bias from training data and become unfair for some user groups. However, existing fair machine learning methods usually rely on the centralized storage of fairness-sensitive features to achieve model fairness, which are usually inapplicable in federated scenarios. In this paper, we propose a fair vertical federated learning framework (FairVFL), which can improve the fairness of VFL models. The core idea of FairVFL is to learn unified and fair representations of samples based on the decentralized feature fields in a privacy-preserving way. Specifically, each platform with fairness-insensitive features first learns local data representations from local features.

fair vertical federated learning framework, representation, unified representation, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

CAMO: Causality-Guided Adversarial Multimodal Domain Generalization for Crisis Classification

Ma, Pingchuan, Zhao, Chengshuai, Jiang, Bohan, Vishnubhatla, Saketh, Jeong, Ujun, Beigi, Alimohammad, Raglin, Adrienne, Liu, Huan

arXiv.org Artificial IntelligenceDec-10-2025

Crisis classification in social media aims to extract actionable disaster-related information from multimodal posts, which is a crucial task for enhancing situational awareness and facilitating timely emergency responses. However, the wide variation in crisis types makes achieving generalizable performance across unseen disasters a persistent challenge. Existing approaches primarily leverage deep learning to fuse textual and visual cues for crisis classification, achieving numerically plausible results under in-domain settings. However, they exhibit poor generalization across unseen crisis types because they 1. do not disentangle spurious and causal features, resulting in performance degradation under domain shift, and 2. fail to align heterogeneous modality representations within a shared space, which hinders the direct adaptation of established single-modality domain generalization (DG) techniques to the multimodal setting. To address these issues, we introduce a causality-guided multimodal domain generalization (MMDG) framework that combines adversarial disentanglement with unified representation learning for crisis classification. The adversarial objective encourages the model to disentangle and focus on domain-invariant causal features, leading to more generalizable classifications grounded in stable causal mechanisms. The unified representation aligns features from different modalities within a shared latent space, enabling single-modality DG strategies to be seamlessly extended to multimodal learning. Experiments on the different datasets demonstrate that our approach achieves the best performance in unseen disaster scenarios.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2512.08071

Country: North America > United States > New York (0.14)

Genre: Research Report (0.64)

Industry: Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Unified representation of tractography and diffusion-weighted MRI data using sparse multidimensional arrays

Neural Information Processing SystemsNov-21-2025, 16:04:06 GMT

Recently, linear formulations and convex optimization methods have been proposed to predict diffusion-weighted Magnetic Resonance Imaging (dMRI) data given estimates of brain connections generated using tractography algorithms. The size of the linear models comprising such methods grows with both dMRI data and connectome resolution, and can become very large when applied to modern data. In this paper, we introduce a method to encode dMRI signals and large connectomes, i.e., those that range from hundreds of thousands to millions of fascicles (bundles of neuronal axons), by using a sparse tensor decomposition. We show that this tensor decomposition accurately approximates the Linear Fascicle Evaluation (LiFE) model, one of the recently developed linear models. We provide a theoretical analysis of the accuracy of the sparse decomposed model, LiFESD, and demonstrate that it can reduce the size of the model significantly. Also, we develop algorithms to implement the optimisation solver using the tensor representation in an efficient way.

sparse multidimensional array, tractography and diffusion-weighted mri data, unified representation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

Yan, Canxiang, Jin, Chunxiang, Huang, Dawei, Yu, Haibing, Peng, Han, Zhan, Hui, Gao, Jie, Peng, Jing, Chen, Jingdong, Zhou, Jun, Ren, Kaimeng, Yang, Ming, Yang, Mingxue, Xu, Qiang, Zhao, Qin, Xiong, Ruijie, Lin, Shaoxiong, Wang, Xuezhi, Yuan, Yi, Wu, Yifei, Lyu, Yongjie, He, Zhengyu, Qiu, Zhihao, Fang, Zhiqiang, Huang, Ziyuan

arXiv.org Artificial IntelligenceNov-11-2025

Existing speech models suffer from competing requirements on token representations by understanding and generation tasks. This discrepancy in representation prevents speech language models from performing instruction-based free-form editing. To solve this challenge, we introduce a novel framework that unifies speech understanding, generation, and editing. The core of our unified model is a unified continuous speech tokenizer MingTok-Audio, the first continuous tokenizer to effectively integrate semantic and acoustic features, which makes it suitable for both understanding and generation tasks. Based on this unified continuous audio tokenizer, we developed the speech language model Ming-UniAudio, which achieved a balance between generation and understanding capabilities. Ming-UniAudio sets new state-of-the-art (SOTA) records on 8 out of 12 metrics on the ContextASR benchmark. Notably, for Chinese voice cloning, it achieves a highly competitive Seed-TTS-WER of 0.95. Leveraging this foundational model, we further trained a dedicated speech editing model Ming-UniAudio-Edit, the first speech language model that enables universal, free-form speech editing guided solely by natural language instructions, handling both semantic and acoustic modifications without timestamp condition. To rigorously assess the editing capability and establish a foundation for future research, we introduce Ming-Freeform-Audio-Edit, the first comprehensive benchmark tailored for instruction-based free-form speech editing, featuring diverse scenarios and evaluation dimensions spanning semantic correctness, acoustic quality, and instruction alignment. We open-sourced the continuous audio tokenizer, the unified foundational model, and the free-form instruction-based editing model to facilitate the development of unified audio understanding, generation, and manipulation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.05516

Genre: Research Report (0.65)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching

Luo, Run, Xia, Xiaobo, Wang, Lu, Chen, Longze, Shan, Renke, Luo, Jing, Yang, Min, Chua, Tat-Seng

arXiv.org Artificial IntelligenceOct-17-2025

Next-generation multimodal foundation models capable of any-to-any cross-modal generation and multi-turn interaction will serve as core components of artificial general intelligence systems, playing a pivotal role in human-machine interaction. However, most existing multimodal models remain constrained by autoregressive architectures, whose inherent limitations prevent a balanced integration of understanding and generation capabilities. Although hybrid and decoupling strategies have been explored to address these tasks within unified frameworks separately, their redundant, non-integrated designs limit their applicability to broader scenarios, such as cross-modal retrieval. In this work, we introduce NExT-OMNI, an open-source omnimodal foundation model that achieves unified modeling through discrete flow paradigms. By leveraging metric-induced probability paths and kinetic optimal velocities, NExT-OMNI natively supports any-to-any understanding and generation with enhanced response efficiency, while enabling broader application scenarios through concise unified representations rather than task-decoupled designs. Trained on large-scale interleaved text, image, video, and audio data, NExT-OMNI delivers competitive performance on multimodal generation and understanding benchmarks, while outperforming prior unified models in multi-turn multimodal interaction and cross-modal retrieval, highlighting its architectural advantages as a next-generation multimodal foundation model. To advance further research, we release training details, data protocols, and open-source both the code and model checkpoints.

arxiv preprint arxiv, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.13721

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

Add feedback

Achieving Cross Modal Generalization with Multimodal Unified Representation Y an Xia 1 Hai Huang

Neural Information Processing SystemsOct-9-2025, 07:16:38 GMT

During pre-training, we investigate various modality combinations, including audio-visual, audio-text, and the tri-modal combination of audio-visual-text.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Filters

Collaborating Authors

unified representation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

c89f09849eb5af489abb122394ff0f0b-Paper-Conference.pdf

Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling

ffdb280e7c7b4c4af30e04daf5a84b98-Paper-Conference.pdf

6a4d5952d4c018a1c1af9fa590a10dda-Paper.pdf

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

CAMO: Causality-Guided Adversarial Multimodal Domain Generalization for Crisis Classification

Unified representation of tractography and diffusion-weighted MRI data using sparse multidimensional arrays

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching

Achieving Cross Modal Generalization with Multimodal Unified Representation Y an Xia 1 Hai Huang