AITopics | contrastive analysis

Collaborating Authors

contrastive analysis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Feature Selection in the Contrastive Analysis Setting

Neural Information Processing SystemsDec-26-2025, 20:15:45 GMT

Contrastive analysis (CA) refers to the exploration of variations uniquely enriched in a dataset as compared to a corresponding dataset generated from sources of variation that are irrelevant to a given task. For example, a biomedical data analyst may wish to find a small set of genes to use as a proxy for variations in genomic data only present among patients with a given disease (target) as opposed to healthy control subjects (background). However, as of yet the problem of feature selection in the CA setting has received little attention from the machine learning community.

contrastive analysis, feature selection, name change, (4 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Contrastive Analysis of Constituent Order Preferences Within Adverbial Roles in English and Chinese News: A Large-Language-Model-Driven Approach

Ma, Yiran Rex

arXiv.org Artificial IntelligenceAug-21-2025

Based on comparable English-Chinese news corpora annotated by Large Language Model (LLM), this paper attempts to explore the differences in constituent order of English-Chinese news from the perspective of functional chunks with adverbial roles, and analyze their typical positional preferences and distribution patterns. It is found that: (1) English news prefers linear narrative of core information first, and functional chunks are mostly post-positioned, while Chinese news prefers overall presentation mode of background first, and functional chunks are often pre-positioned; (2) In SVO structure, both English and Chinese news show differences in the distribution of functional chunks, but the tendency of Chinese pre-positioning is more significant, while that of English post-positioning is relatively mild; (3) When function blocks are co-occurring, both English and Chinese news show high flexibility, and the order adjustment is driven by information and pragmatic purposes. The study reveals that word order has both systematic preference and dynamic adaptability, providing new empirical support for contrastive study of English-Chinese information structure.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2508.14054

Country:

Asia > China (0.93)
North America (0.67)

Genre: Research Report > Experimental Study (0.68)

Industry:

Government (0.93)
Media > News (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Feature Selection in the Contrastive Analysis Setting

Neural Information Processing SystemsJan-19-2025, 22:49:20 GMT

Contrastive analysis (CA) refers to the exploration of variations uniquely enriched in a target dataset as compared to a corresponding background dataset generated from sources of variation that are irrelevant to a given task. For example, a biomedical data analyst may wish to find a small set of genes to use as a proxy for variations in genomic data only present among patients with a given disease (target) as opposed to healthy control subjects (background). However, as of yet the problem of feature selection in the CA setting has received little attention from the machine learning community. We motivate our approach with a novel information-theoretic analysis of representation learning in the CA setting, and we empirically validate CFS on a semi-synthetic dataset and four real-world biomedical datasets. We find that our method consistently outperforms previously proposed state-of-the-art supervised and fully unsupervised feature selection methods not designed for the CA setting.

contrastive analysis, dataset, feature selection, (2 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Double InfoGAN for Contrastive Analysis

Carton, Florence, Louiset, Robin, Gori, Pietro

arXiv.org Artificial IntelligenceJan-31-2024

Contrastive Analysis (CA) deals with the discovery of what is common and what is distinctive of a target domain compared to a background one. This is of great interest in many applications, such as medical imaging. Current state-of-the-art (SOTA) methods are latent variable models based on VAE (CA-VAEs). However, they all either ignore important constraints or they don't enforce fundamental assumptions. This may lead to sub-optimal solutions where distinctive factors are mistaken for common ones (or viceversa). Furthermore, the generated images have a rather poor quality, typical of VAEs, decreasing their interpretability and usefulness. Here, we propose Double InfoGAN, the first GAN based method for CA that leverages the high-quality synthesis of GAN and the separation power of InfoGAN. Experimental results on four visual datasets, from simple synthetic examples to complex medical images, show that the proposed method outperforms SOTA CA-VAEs in terms of latent separation and image quality. Datasets and code are available online.

dataset, double infogan, infogan, (17 more...)

arXiv.org Artificial Intelligence

2401.17776

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Add feedback

Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All

Gueta, Eylon, Shmidman, Avi, Shmidman, Shaltiel, Shmidman, Cheyn Shmuel, Guedalia, Joshua, Koppel, Moshe, Bareket, Dan, Seker, Amit, Tsarfaty, Reut

arXiv.org Artificial IntelligenceMay-15-2023

We present a new pre-trained language model (PLM) for modern Hebrew, termed AlephBERTGimmel, which employs a much larger vocabulary (128K items) than standard Hebrew PLMs before. We perform a contrastive analysis of this model against all previous Hebrew PLMs (mBERT, heBERT, AlephBERT) and assess the effects of larger vocabularies on task performance. Our experiments show that larger vocabularies lead to fewer splits, and that reducing splits is better for model performance, across different tasks. All in all this new model achieves new SOTA on all available Hebrew benchmarks, including Morphological Segmentation, POS Tagging, Full Morphological Analysis, NER, and Sentiment Analysis. Subsequently we advocate for PLMs that are larger not only in terms of number of layers or training data, but also in terms of their vocabulary. We release the new model publicly for unrestricted use.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2211.15199

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

Add feedback

Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement

Li, Bingzhi, Wisniewski, Guillaume, Crabbé, Benoît

arXiv.org Artificial IntelligenceJan-4-2023

The long-distance agreement, evidence for syntactic structure, is increasingly used to assess the syntactic generalization of Neural Language Models. Much work has shown that transformers are capable of high accuracy in varied agreement tasks, but the mechanisms by which the models accomplish this behavior are still not well understood. To better understand transformers' internal working, this work contrasts how they handle two superficially similar but theoretically distinct agreement phenomena: subject-verb and object-past participle agreement in French. Using probing and counterfactual analysis methods, our experiments show that i) the agreement task suffers from several confounders which partially question the conclusions drawn so far and ii) transformers handle subject-verb and object-past participle agreements in a way that is consistent with their modeling in theoretical linguistics.

abstract syntactic representation, artificial intelligence, natural language, (3 more...)

arXiv.org Artificial Intelligence

2212.04523

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)

Add feedback