

Gemini Embedding: Generalizable Embeddings from Gemini

arXiv.org Artificial Intelligence

Embedding models, which transform inputs into dense vector representations, are pivotal for capturing semantic information across various domains and modalities. Text embedding models represent words and sentences as vectors, strategically positioning semantically similar texts in close proximity within the embedding space (Gao et al., 2021; Le and Mikolov, 2014; Reimers and Gurevych, 2019). Recent research has focused on developing general-purpose embedding models capable of excelling in diverse downstream tasks, including information retrieval, clustering, and classification (Cer et al., 2018; Muennighoff et al., 2023). Leveraging their vast pre-training knowledge, large language models (LLMs) have emerged as a promising avenue for constructing such general-purpose embedding models, with the potential to significantly enhance performance across a broad spectrum of applications (Anil et al., 2023a,b; Brown et al., 2020). The integration of LLMs has revolutionized the development of high-quality embedding models through two primary approaches. Firstly, LLMs have been employed to refine training datasets by generating higher quality examples. Techniques such as hard negative mining (Lee et al., 2024) and synthetic data generation (Dai et al., 2022; Wang et al., 2023) enable the distillation of LLM knowledge into smaller, more efficient embedding models, leading to substantial performance gains. Secondly, recognizing that the embedding model parameters are frequently initialized from language models (Devlin et al., 2019; Karpukhin et al., 2020), researchers have explored leveraging LLM parameters directly for initialization (Ni et al., 2021).
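
As a concrete illustration of the proximity property described above (semantically similar texts landing near one another in the embedding space), here is a minimal sketch using an off-the-shelf bi-encoder. The model name "all-MiniLM-L6-v2" is an arbitrary stand-in for illustration, not the model this paper describes.

```python
# Minimal sketch: embed a few texts and compare them by cosine similarity.
# Assumes the sentence-transformers package; the model choice is arbitrary.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "How do I reset my password?",
    "Steps to recover a forgotten password",
    "Best hiking trails near Zurich",
]
emb = model.encode(texts, normalize_embeddings=True)  # unit-length vectors

# With unit-length vectors, the dot product equals cosine similarity.
sim = emb @ emb.T
print(np.round(sim, 2))  # the two password texts should score much higher
                         # against each other than against the hiking query
```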


LLM-Assisted Content Conditional Debiasing for Fair Text Embedding

arXiv.org Artificial Intelligence

Mitigating biases in machine learning models has become an increasing concern in Natural Language Processing (NLP), particularly in developing fair text embeddings, which are crucial yet challenging for real-world applications like search engines. In response, this paper proposes a novel method for learning fair text embeddings. First, we define a novel content-conditional equal distance (CCED) fairness criterion for text embeddings, ensuring content-conditional independence between sensitive attributes and text embeddings. Building on CCED, we introduce a content-conditional debiasing (CCD) loss to ensure that embeddings of texts with different sensitive attributes but identical content maintain the same distance from the embedding of their corresponding neutral text. Additionally, we tackle the issue of insufficient training data by using instruction-following Large Language Models (LLMs) to augment texts fairly across different sensitive groups. Our extensive evaluations show that our approach effectively enhances fairness while maintaining the utility of embeddings. Furthermore, our augmented dataset, combined with the CCED metric, serves as a new benchmark for evaluating fairness.
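
To make the content-conditional equal-distance idea concrete, below is a minimal PyTorch sketch of one plausible penalty: embeddings of group-specific rewrites of the same content are pushed to sit equidistant from the embedding of their neutral counterpart. The function name and exact formulation here are illustrative assumptions; the paper defines the precise CCD loss.

```python
import torch

def ccd_loss_sketch(group_embs: torch.Tensor, neutral_emb: torch.Tensor) -> torch.Tensor:
    """Sketch of a content-conditional debiasing penalty (not the paper's exact loss).

    group_embs: (G, D) embeddings of the same content rewritten for G sensitive
                groups (e.g. gendered variants of one sentence).
    neutral_emb: (D,) embedding of the corresponding neutral text.

    Penalizes unequal distances from each group variant to the neutral text,
    pushing all variants to sit equidistant from it, as CCED requires.
    """
    dists = torch.norm(group_embs - neutral_emb, dim=-1)  # (G,)
    return ((dists - dists.mean()) ** 2).mean()

# Toy usage with random vectors standing in for encoder outputs.
g = torch.randn(3, 768, requires_grad=True)   # e.g. three group variants
n = torch.randn(768)                          # neutral text embedding
loss = ccd_loss_sketch(g, n)
loss.backward()
```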


Gecko: Versatile Text Embeddings Distilled from Large Language Models

arXiv.org Artificial Intelligence

Text embedding models represent natural language as dense vectors, positioning semantically similar text near each other within the embedding space (Gao et al., 2021; Le and Mikolov, 2014; Reimers and Gurevych, 2019). These embeddings are commonly used for a wide range of downstream tasks including document retrieval, sentence similarity, classification, and clustering (Muennighoff et al., 2023). Instead of building separate embedding models for each downstream task, recent efforts seek to create a single embedding model supporting many tasks. The recent development of general-purpose text embedding models presents a challenge: these models require large amounts of training data to comprehensively cover desired domains and skills. Recent embedding efforts have focused on using extensive collections of training examples (Li et al., 2023; Wang et al., 2022).
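
In practice, training a dual-encoder retriever on such collections, including LLM-generated (query, positive passage) pairs, commonly uses an in-batch-negatives contrastive objective. The sketch below shows that generic objective; it is an illustrative assumption, not Gecko's specific recipe.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q: torch.Tensor, p: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """Generic in-batch-negatives loss for dual-encoder retrievers.

    q, p: (B, D) L2-normalized query and passage embeddings; p[i] is the
    positive for q[i], and the other B - 1 passages serve as negatives.
    """
    logits = q @ p.T / temperature     # (B, B) similarity matrix
    labels = torch.arange(q.size(0))   # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

# Toy usage with random unit vectors in place of encoder outputs.
q = F.normalize(torch.randn(8, 256), dim=-1)
p = F.normalize(torch.randn(8, 256), dim=-1)
print(in_batch_contrastive_loss(q, p))
```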


An Investigation of how Label Smoothing Affects Generalization

arXiv.org Machine Learning

It has been hypothesized that label smoothing can reduce overfitting and improve generalization, and current empirical evidence seems to corroborate these effects. However, there is a lack of mathematical understanding of when and why such empirical improvements occur. In this paper, as a step towards understanding why label smoothing is effective, we propose a theoretical framework to show what benefit label smoothing provides in controlling the generalization loss. In particular, we show that this benefit can be precisely formulated and identified in the label noise setting, where the training data are partially mislabeled. Our theory also predicts the existence of an optimal label smoothing point, a single value of the label smoothing hyperparameter that minimizes the generalization loss. Extensive experiments confirm the predictions of our theory. We believe that our findings will help both theoreticians and practitioners understand label smoothing and apply it more effectively to real-world datasets.
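
For reference, the mechanic under analysis is simple: with smoothing strength α and K classes, label smoothing replaces a one-hot target with (1 − α)·one-hot + α/K. A minimal sketch (the variable names are ours):

```python
import torch
import torch.nn.functional as F

def smooth_labels(targets: torch.Tensor, num_classes: int,
                  alpha: float = 0.1) -> torch.Tensor:
    """Replace one-hot targets with (1 - alpha) * one_hot + alpha / K.

    The paper above argues an optimal alpha exists under label noise;
    this function only shows the transformation itself.
    """
    one_hot = F.one_hot(targets, num_classes).float()
    return (1.0 - alpha) * one_hot + alpha / num_classes

# Example: 3 samples, 5 classes, alpha = 0.1.
y = torch.tensor([0, 2, 4])
print(smooth_labels(y, num_classes=5, alpha=0.1))
# Each row is 0.92 at the true class and 0.02 elsewhere.
```

Note that PyTorch's built-in F.cross_entropy also accepts a label_smoothing argument that applies this transformation internally, which is usually the more convenient route in training code.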