AITopics

Genre: Research Report > Promising Solution (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.59)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)

Neural Information Processing SystemsJan-22-2025, 05:42:42 GMT

Reviews: Assessing Social and Intersectional Biases in Contextualized Word Representations

I look forward to the final version including more details about the tests, as requested by reviewer 2.] This paper studies the presence of social biases in contextualized word representations. First, word co-occurnce statistics of pronouns and stereotypical occupations are provided for various datasets used for training contextualizers. Then, the word/sentence embedding association test is extended for the contextual case. Using templates, instead of aggregating over word representations (in sentence test) or taking the context-free word embedding (in word test), the contextual word representation is used. Then, an association test compares the association between a concept and an attribute using a permutation test.

contextualized word representation, representation, word representation, (9 more...)

Genre: Research Report (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)

Neural Information Processing SystemsJan-22-2025, 05:42:31 GMT

Reviews: Assessing Social and Intersectional Biases in Contextualized Word Representations

The authors here find that in contextualized embeddings yielded from modern neural LMs, biases persist. This is an important finding. Reviewers expressed some concerns regarding the experimental setup and metrics, but felt these were adequately addressed by the author response.

contextualized word representation, social and intersectional bias

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.58)

Neural Information Processing SystemsOct-9-2024, 16:08:06 GMT

Assessing Social and Intersectional Biases in Contextualized Word Representations

Social bias in machine learning has drawn significant attention, with work ranging from demonstrations of bias in a multitude of applications, curating definitions of fairness for different contexts, to developing algorithms to mitigate bias. In natural language processing, gender bias has been shown to exist in context-free word embeddings. Recently, contextual word representations have outperformed word embeddings in several downstream NLP tasks. These word representations are conditioned on their context within a sentence, and can also be used to encode the entire sentence. In this paper, we analyze the extent to which state-of-the-art models for contextual word representations, such as BERT and GPT-2, encode biases with respect to gender, race, and intersectional identities.

contextual word representation, contextualized word representation, social and intersectional bias, (2 more...)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Vijayakumar, Soniya, Bäumel, Tanja, Ostermann, Simon, van Genabith, Josef

Where exactly does contextualization in a PLM happen?

arXiv.org Artificial IntelligenceDec-11-2023

Pre-trained Language Models (PLMs) have shown to be consistently successful in a plethora of NLP tasks due to their ability to learn contextualized representations of words (Ethayarajh, 2019). BERT (Devlin et al., 2018), ELMo (Peters et al., 2018) and other PLMs encode word meaning via textual context, as opposed to static word embeddings, which encode all meanings of a word in a single vector representation. In this work, we present a study that aims to localize where exactly in a PLM word contextualization happens. In order to find the location of this word meaning transformation, we investigate representations of polysemous words in the basic BERT uncased 12 layer architecture (Devlin et al., 2018), a masked language model trained on an additional sentence adjacency objective, using qualitative and quantitative measures.

contextualization, representation, sublayersim, (14 more...)

2312.06514

Country:

Europe > Italy > Tuscany > Florence (0.06)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

arXiv.org Artificial IntelligenceAug-18-2021

On the Opportunities and Risks of Foundation Models

Bommasani, Rishi, Hudson, Drew A., Adeli, Ehsan, Altman, Russ, Arora, Simran, von Arx, Sydney, Bernstein, Michael S., Bohg, Jeannette, Bosselut, Antoine, Brunskill, Emma, Brynjolfsson, Erik, Buch, Shyamal, Card, Dallas, Castellon, Rodrigo, Chatterji, Niladri, Chen, Annie, Creel, Kathleen, Davis, Jared Quincy, Demszky, Dora, Donahue, Chris, Doumbouya, Moussa, Durmus, Esin, Ermon, Stefano, Etchemendy, John, Ethayarajh, Kawin, Fei-Fei, Li, Finn, Chelsea, Gale, Trevor, Gillespie, Lauren, Goel, Karan, Goodman, Noah, Grossman, Shelby, Guha, Neel, Hashimoto, Tatsunori, Henderson, Peter, Hewitt, John, Ho, Daniel E., Hong, Jenny, Hsu, Kyle, Huang, Jing, Icard, Thomas, Jain, Saahil, Jurafsky, Dan, Kalluri, Pratyusha, Karamcheti, Siddharth, Keeling, Geoff, Khani, Fereshte, Khattab, Omar, Kohd, Pang Wei, Krass, Mark, Krishna, Ranjay, Kuditipudi, Rohith, Kumar, Ananya, Ladhak, Faisal, Lee, Mina, Lee, Tony, Leskovec, Jure, Levent, Isabelle, Li, Xiang Lisa, Li, Xuechen, Ma, Tengyu, Malik, Ali, Manning, Christopher D., Mirchandani, Suvir, Mitchell, Eric, Munyikwa, Zanele, Nair, Suraj, Narayan, Avanika, Narayanan, Deepak, Newman, Ben, Nie, Allen, Niebles, Juan Carlos, Nilforoshan, Hamed, Nyarko, Julian, Ogut, Giray, Orr, Laurel, Papadimitriou, Isabel, Park, Joon Sung, Piech, Chris, Portelance, Eva, Potts, Christopher, Raghunathan, Aditi, Reich, Rob, Ren, Hongyu, Rong, Frieda, Roohani, Yusuf, Ruiz, Camilo, Ryan, Jack, Ré, Christopher, Sadigh, Dorsa, Sagawa, Shiori, Santhanam, Keshav, Shih, Andy, Srinivasan, Krishnan, Tamkin, Alex, Taori, Rohan, Thomas, Armin W., Tramèr, Florian, Wang, Rose E., Wang, William, Wu, Bohan, Wu, Jiajun, Wu, Yuhuai, Xie, Sang Michael, Yasunaga, Michihiro, You, Jiaxuan, Zaharia, Matei, Zhang, Michael, Zhang, Tianyi, Zhang, Xikun, Zhang, Yuhui, Zheng, Lucia, Zhou, Kaitlyn, Liang, Percy

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles(e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities,and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

computer based training, law enforcement, programming language design and implementation, (61 more...)

2108.07258

Country:

Europe > Germany (0.45)
North America > United States > New York > New York County > New York City (0.27)
Asia > China (0.27)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Social Sector (1.00)
Media > News (1.00)
Leisure & Entertainment > Games (1.00)
(36 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(8 more...)

arXiv.org Artificial IntelligenceApr-26-2021

GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization

Huang, Jia-Hong, Murn, Luka, Mrak, Marta, Worring, Marcel

Traditional video summarization methods generate fixed video representations regardless of user interest. Therefore such methods limit users' expectations in content search and exploration scenarios. Multi-modal video summarization is one of the methods utilized to address this problem. When multi-modal video summarization is used to help video exploration, a text-based query is considered as one of the main drivers of video summary generation, as it is user-defined. Thus, encoding the text-based query and the video effectively are both important for the task of multi-modal video summarization. In this work, a new method is proposed that uses a specialized attention network and contextualized word representations to tackle this task. The proposed model consists of a contextualized video summary controller, multi-modal attention mechanisms, an interactive attention network, and a video summary generator. Based on the evaluation of the existing multi-modal video summarization benchmark, experimental results show that the proposed model is effective with the increase of +5.88% in accuracy and +4.06% increase of F1-score, compared with the state-of-the-art method.

representation, summarization, video summarization, (14 more...)

2104.12465

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Tan, Yi Chern, Celis, L. Elisa

Assessing Social and Intersectional Biases in Contextualized Word Representations

Neural Information Processing SystemsMar-19-2020, 02:02:44 GMT

Social bias in machine learning has drawn significant attention, with work ranging from demonstrations of bias in a multitude of applications, curating definitions of fairness for different contexts, to developing algorithms to mitigate bias. In natural language processing, gender bias has been shown to exist in context-free word embeddings. Recently, contextual word representations have outperformed word embeddings in several downstream NLP tasks. These word representations are conditioned on their context within a sentence, and can also be used to encode the entire sentence. In this paper, we analyze the extent to which state-of-the-art models for contextual word representations, such as BERT and GPT-2, encode biases with respect to gender, race, and intersectional identities.

contextualized word representation, machine learning, natural language, (6 more...)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

#artificialintelligenceOct-17-2019, 23:14:59 GMT

r/MachineLearning - [R] How Contextual are Contextualized Word Representations?

Abstract: Replacing static word embeddings with contextualized word representations has yielded significant improvements on many NLP tasks. However, just how contextual are the contextualized representations produced by models such as ELMo and BERT? Are there infinitely many context-specific representations for each word, or are words essentially assigned one of a finite number of word-sense representations? For one, we find that the contextualized representations of all words are not isotropic in any layer of the contextualizing model. While representations of the same word in different contexts still have a greater cosine similarity than those of two different words, this self-similarity is much lower in upper layers.

contextualized representation, contextualized word representation, representation, (4 more...)

#artificialintelligence

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

arXiv.org Artificial IntelligenceSep-12-2019

Retrofitting Contextualized Word Embeddings with Paraphrases

Shi, Weijia, Chen, Muhao, Zhou, Pei, Chang, Kai-Wei

Contextualized word embedding models, such as ELMo, generate meaningful representations of words and their context. These models have been shown to have a great impact on downstream applications. However, in many cases, the contextualized embedding of a word changes drastically when the context is paraphrased. As a result, the downstream model is not robust to paraphrasing and other linguistic variations. To enhance the stability of contextualized word embedding models, we propose an approach to retrofitting contextualized embedding models with paraphrase contexts. Our method learns an orthogonal transformation on the input space, which seeks to minimize the variance of word representations on paraphrased contexts. Experiments show that the retrofitted model significantly outperforms the original ELMo on various sentence classification and language inference tasks.

artificial intelligence, machine learning, natural language, (16 more...)

1909.097

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)