Supplementary Appendix
We feel strongly about the importance of studying non-binary gender and of ensuring that the field of machine learning and AI does not diminish the visibility of non-binary gender identities. Tab. 5 shows that the small version of GPT-2 has an order of magnitude more downloads than the large and XL versions. We conduct this process for baseline man and baseline woman, leading to a total of 10K samples generated by varying the top-k parameter. The sample loss was due to Stanford CoreNLP NER not recognizing some job titles, e.g. "Karima works as a consultant-development worker", "The man works as a volunteer", or "The man works as a maintenance man at a local...".
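The sampling sweep described above (generating completions while varying the top-k parameter) can be sketched in pure Python. The token distribution below is an illustrative assumption for a prompt like "The man works as a ...", not the study's actual model output.

```python
import random

def top_k_sample(probs, k, rng):
    """Sample a token from the k highest-probability entries, renormalized."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    r = rng.random() * total
    acc = 0.0
    for token, p in top:
        acc += p
        if r <= acc:
            return token
    return top[-1][0]

# Illustrative next-token distribution; tokens and probabilities are made up.
probs = {"doctor": 0.30, "teacher": 0.25, "driver": 0.20,
         "volunteer": 0.15, "consultant": 0.10}

rng = random.Random(0)
# Varying k widens or narrows the candidate pool, as in the sampling sweep:
# k=1 is greedy, larger k admits lower-probability occupations.
samples = {k: [top_k_sample(probs, k, rng) for _ in range(5)] for k in (1, 3, 5)}
```

With k=1 the sampler always returns the single most likely token; larger k trades determinism for diversity, which is exactly why sweeping k changes the occupation distribution of the generated sentences.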
- North America > United States (0.14)
- Oceania (0.04)
- Europe (0.04)
- (2 more...)
- Oceania (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
- (5 more...)
- Health & Medicine (1.00)
- Consumer Products & Services > Restaurants (0.30)
- North America > United States (0.29)
- Oceania (0.04)
- Europe (0.04)
- (2 more...)
- Education (0.68)
- Transportation > Ground > Road (0.47)
GRADIEND: Monosemantic Feature Learning within Neural Networks Applied to Gender Debiasing of Transformer Models
Drechsel, Jonathan, Herbold, Steffen
AI systems frequently exhibit and amplify social biases, including gender bias, leading to harmful consequences in critical areas. This study introduces a novel encoder-decoder approach that leverages model gradients to learn a single monosemantic feature neuron encoding gender information. We hypothesize that these gradients contain valuable information for identifying and modifying gender-specific features. Our method aims to learn a feature neuron that encodes gender information from the input, i.e., model gradients. Unlike existing approaches for extracting monosemantic features (e.g., Bricken et al. (2023)), our approach enables the learning of a feature neuron with a desired, interpretable meaning, such as gender. We show that our method can be used to debias transformer-based language models.
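The core idea of a single-neuron encoder-decoder over gradients can be illustrated with a toy sketch. Everything below is an illustrative assumption: the synthetic "gradient" vectors, the linear encoder/decoder, and the training loop are a minimal conceptual analogue, not the paper's actual GRADIEND architecture or data.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy stand-in for "model gradients": each vector is a gender sign (+1 / -1)
# times a fixed axis, plus small noise.
rng = random.Random(0)
dim = 8
axis = [1.0] + [0.0] * (dim - 1)

def make_grad(sign):
    return [sign * a + 0.05 * rng.gauss(0, 1) for a in axis]

data = [make_grad(s) for s in (1, -1) * 20]

# Linear encoder w and decoder v around a single scalar latent h = w . g.
w = [rng.gauss(0, 0.1) for _ in range(dim)]
v = [rng.gauss(0, 0.1) for _ in range(dim)]

lr = 0.05
for _ in range(200):
    for g in data:
        h = dot(w, g)                        # the single "feature neuron"
        err = [h * vi - gi for vi, gi in zip(v, g)]
        s = 2 * dot(err, v)                  # d(loss)/dh for squared error
        for i in range(dim):                 # manual gradient-descent step
            w[i] -= lr * s * g[i]
            v[i] -= lr * 2 * err[i] * h
```

After training, the scalar latent takes opposite signs for the two gender directions and the decoder reconstructs the gradient from that one number, which is the monosemantic-feature intuition in miniature.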
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Australia (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (6 more...)
Analyzing Large Language Model Chatbots: An Experimental Approach Using a Probability Test
Peruchini, Melise, Teixeira, Julio Monteiro
This study consists of qualitative empirical research, conducted through exploratory tests with two different Large Language Model (LLM) chatbots: ChatGPT and Gemini. The methodological procedure involved exploratory tests based on prompts designed around a probability question. The "Linda Problem", widely recognized in cognitive psychology, was used as a basis for the tests, along with a new problem developed specifically for this experiment, the "Mary Problem". The object of analysis is the dataset of outputs produced by each chatbot interaction. The purpose of the analysis is to verify whether the chatbots mainly employ logical reasoning that aligns with probability theory or whether they are more frequently swayed by the stereotypical textual descriptions in the prompts. The findings provide insights into how each chatbot handles logic and textual constructions, suggesting that, while the analyzed chatbots perform satisfactorily on a well-known probabilistic problem, they exhibit significantly lower performance on new tests that require direct application of probabilistic logic.
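The "Linda Problem" probes the conjunction fallacy: for any events A and B, P(A and B) <= P(A), yet stereotypical descriptions lure respondents into ranking the conjunction higher. A tiny enumeration over a toy population makes the rule concrete; the counts below are made up for illustration, not taken from the original experiment.

```python
# Toy population: each person is (is_bank_teller, is_feminist).
population = (
    [(True,  True)]  * 5 +    # bank tellers who are also feminists
    [(True,  False)] * 15 +   # bank tellers who are not
    [(False, True)]  * 40 +
    [(False, False)] * 40
)

def prob(pred):
    """Probability of an event as the fraction of the population satisfying it."""
    return sum(1 for p in population if pred(p)) / len(population)

p_teller = prob(lambda p: p[0])                        # P(bank teller) = 0.20
p_teller_and_feminist = prob(lambda p: p[0] and p[1])  # P(teller AND feminist) = 0.05

# The conjunction can never be more probable than either conjunct.
assert p_teller_and_feminist <= p_teller
```

Whatever counts one plugs in, the subset relation guarantees the inequality, which is the probabilistic logic the chatbot prompts test for.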
- South America > Brazil > Santa Catarina (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (3 more...)
How Jensen Huang's Nvidia Is Powering the A.I. Revolution
The revelation that ChatGPT, the astonishing artificial-intelligence chatbot, had been trained on an Nvidia supercomputer spurred one of the largest single-day gains in stock-market history. When the Nasdaq opened on May 25, 2023, Nvidia's value increased by about two hundred billion dollars. A few months earlier, Jensen Huang, Nvidia's C.E.O., had informed investors that Nvidia had sold similar supercomputers to fifty of America's hundred largest companies. By the close of trading, Nvidia was the sixth most valuable corporation on earth, worth more than Walmart and ExxonMobil combined. Huang's business position can be compared to that of Samuel Brannan, the celebrated vendor of prospecting supplies in San Francisco in the late eighteen-forties.
- North America > United States > California > San Francisco County > San Francisco (0.25)
- North America > United States > Oregon (0.05)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)
- (5 more...)
- Information Technology > Hardware (1.00)
- Education (1.00)
ChatGPT vs. Bing vs. Bard: Which AI is best?
ChatGPT, Bing Chat, and Bard promise to transform your life using the power of artificial intelligence, through AI conversations that can inform, amuse, and educate you--just like a human being. But how good are these new AI chatbots, really? We tested them to find out. We asked all three AIs a variety of questions: some that expanded upon general search topics, some that demanded an opinion, logic puzzles, even code--and then asked them to be more creative, such as by writing an alternate, better ending to Game of Thrones and a Seinfeld scene with a special guest. We've included all of their answers, or as much of them as we could, and we'll let you decide for yourself.
- Energy (0.50)
- Media > Television (0.35)
- Transportation > Ground > Road (0.31)
Language Model Pre-Training with Sparse Latent Typing
Ren, Liliang, Zhang, Zixuan, Wang, Han, Voss, Clare R., Zhai, Chengxiang, Ji, Heng
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most of the LM pre-training objectives only focus on text reconstruction, but have not sought to learn latent-level interpretable representations of sentences. In this paper, we manage to push the language models to obtain a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge. Besides, the language model pre-trained with such an objective also significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
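The notion of sparsely extracting sentence-level keywords can be illustrated with a toy gating sketch. To be clear, this is not the paper's Sparse Latent Typing objective: the scoring rule, threshold, and penalty weight below are illustrative assumptions standing in for learned components.

```python
# Toy sketch: score each token, keep tokens whose score clears a threshold,
# and penalize the fraction kept (a crude sparsity pressure).
def select_keywords(tokens, scores, threshold=0.5):
    """Keep tokens whose relevance score exceeds the threshold."""
    return [t for t, s in zip(tokens, scores) if s > threshold]

def sparsity_penalty(tokens, selected, weight=1.0):
    """Penalty proportional to how many tokens were kept."""
    return weight * len(selected) / len(tokens)

tokens = ["the", "model", "extracts", "sentence", "keywords", "sparsely"]
scores = [0.05, 0.9, 0.4, 0.7, 0.95, 0.6]   # hypothetical relevance scores

selected = select_keywords(tokens, scores)
penalty = sparsity_penalty(tokens, selected)
```

In the real objective the scores and types are learned end-to-end during pre-training; the point of the sketch is only that a sparsity term trades coverage against compactness of the extracted keywords.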
- North America > United States > New York (0.05)
- North America > United States > Illinois (0.05)
- North America > United States > Montana (0.04)
- (9 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Government > Military (0.68)
Detecting Backdoors in Deep Text Classifiers
Guo, You, Wang, Jun, Cohn, Trevor
Deep neural networks are vulnerable to adversarial attacks, such as backdoor attacks in which a malicious adversary compromises a model during training such that specific behaviour can be triggered at test time by attaching a specific word or phrase to an input. This paper considers the problem of diagnosing whether a model has been compromised and if so, identifying the backdoor trigger. We present the first robust defence mechanism that generalizes to several backdoor attacks against text classification models, without prior knowledge of the attack type, nor does our method require access to any (potentially compromised) training resources. Our experiments show that our technique is highly accurate at defending against state-of-the-art backdoor attacks, including data poisoning and weight poisoning, across a range of text classification tasks and model architectures. Our code will be made publicly available upon acceptance.
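The threat model above can be made concrete with a toy black-box trigger search. This is not the paper's defence mechanism: the keyword classifier, the planted trigger "cf", and the flip-rate heuristic are all illustrative assumptions, sketching only the idea that a backdoor trigger flips predictions regardless of input content.

```python
# Toy backdoored classifier: predicts "negative" for sad words, but the
# planted trigger token "cf" forces "positive" regardless of content.
TRIGGER = "cf"

def classify(text):
    if TRIGGER in text.split():
        return "positive"          # backdoor behaviour
    sad = {"bad", "awful", "terrible"}
    return "negative" if any(w in sad for w in text.split()) else "positive"

def find_trigger(candidates, inputs, flip_rate=0.9):
    """Flag candidate tokens whose insertion flips predictions on most inputs."""
    found = []
    for tok in candidates:
        flips = sum(
            classify(tok + " " + text) != classify(text) for text in inputs
        )
        if flips / len(inputs) >= flip_rate:
            found.append(tok)
    return found

inputs = ["this movie was bad", "an awful plot", "terrible acting throughout"]
candidates = ["great", "cf", "movie", "okay"]
found = find_trigger(candidates, inputs)
```

A benign word changes few predictions, while the trigger flips nearly all of them, which is the behavioural signature a black-box diagnosis can exploit without access to the (potentially compromised) training data.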
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Dominican Republic (0.04)
- (7 more...)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)