AITopics

2311.08687

Country: Europe > Greece (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

arXiv.org Artificial Intelligence May-26-2023

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Zhang, Shiyue, Wu, Shijie, Irsoy, Ozan, Lu, Steven, Bansal, Mohit, Dredze, Mark, Rosenberg, David

Autoregressive language models are trained by minimizing the cross-entropy of the model distribution Q relative to the data distribution P -- that is, minimizing the forward cross-entropy, which is equivalent to maximum likelihood estimation (MLE). We have observed that models trained in this way may "over-generalize", in the sense that they produce non-human-like text. Moreover, we believe that reverse cross-entropy, i.e., the cross-entropy of P relative to Q, is a better reflection of how a human would evaluate text generated by a model. Hence, we propose learning with MixCE, an objective that mixes the forward and reverse cross-entropies. We evaluate models trained with this objective on synthetic data settings (where P is known) and real data, and show that the resulting models yield better generated text without complex decoding strategies. Our code and models are publicly available at https://github.com/bloomberg/mixce-acl2023
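To make the two quantities in the abstract concrete, here is a minimal sketch for the synthetic setting where P is known. It treats P and Q as small discrete distributions and combines the forward and reverse cross-entropies with a weight `eta`. The linear combination, the name `eta`, and the toy distributions are assumptions for illustration; the abstract only says the objective "mixes" the two terms, and this is not taken from the authors' released code. On real data P is unknown, so the reverse term cannot be computed directly as it is here.

```python
import math


def cross_entropy(weights, scored):
    """Return -sum_x weights(x) * log scored(x) for two discrete distributions."""
    return -sum(w * math.log(s) for w, s in zip(weights, scored) if w > 0)


def mixed_cross_entropy(p, q, eta):
    """Mix forward CE (-E_P[log Q]) and reverse CE (-E_Q[log P]).

    eta is a hypothetical mixing weight in [0, 1]; eta = 1 recovers the
    forward cross-entropy, i.e. the usual MLE objective.
    """
    forward = cross_entropy(p, q)
    reverse = cross_entropy(q, p)
    return eta * forward + (1.0 - eta) * reverse


# Toy data distribution P and two candidate models (illustrative values only).
p = [0.49, 0.49, 0.02]
q_spread = [0.40, 0.40, 0.20]    # puts extra mass on the rare outcome
q_tight = [0.495, 0.495, 0.01]   # stays close to the high-probability outcomes

for name, q in [("spread", q_spread), ("tight", q_tight)]:
    print(
        name,
        round(cross_entropy(p, q), 3),             # forward term
        round(cross_entropy(q, p), 3),             # reverse term
        round(mixed_cross_entropy(p, q, 0.5), 3),  # equal-weight mix
    )
```

In this toy case both terms prefer the tighter model, but the reverse term separates the two models more sharply, which is one way to read the abstract's remark about over-generalization.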

justification, machine learning, natural language, (20 more...)

2305.16958

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

arXiv.org Artificial Intelligence May-21-2023

Generalizing Fairness using Multi-Task Learning without Demographic Information

Aguirre, Carlos, Dredze, Mark

To ensure the fairness of machine learning systems, we can include a fairness loss during training based on demographic information associated with the training data. However, we cannot train debiased classifiers for most tasks since the relevant datasets lack demographic annotations. Can we utilize demographic data for a related task to improve the fairness of our target task? We demonstrate that demographic fairness objectives transfer to new tasks trained within a multi-task framework. We adapt a single-task fairness loss to a multi-task setting to exploit demographic labels from a related task in debiasing a target task. We explore different settings with missing demographic data and show how our loss can improve fairness even without in-task demographics, across various domains and tasks.

artificial intelligence, fairness, machine learning, (16 more...)

2305.12671

Country:

Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Health Care Providers & Services (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Artificial Intelligence Dec-13-2022

Do Text-to-Text Multi-Task Learners Suffer from Task Conflict?

Mueller, David, Andrews, Nicholas, Dredze, Mark

Traditional multi-task learning architectures train a single model across multiple tasks through a shared encoder followed by task-specific decoders. Learning these models often requires specialized training algorithms that address task-conflict in the shared parameter updates, which otherwise can lead to negative transfer. A new type of multi-task learning within NLP homogenizes multi-task architectures as a shared encoder and language model decoder, which does surprisingly well across a range of diverse tasks. Does this new architecture suffer from task-conflicts that require specialized training algorithms? We study how certain factors in the shift towards text-to-text models affect multi-task conflict and negative transfer, finding that both directional conflict and transfer are surprisingly constant across architectures.
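The abstract does not define "directional conflict". One common reading in the multi-task literature is that two tasks conflict when their gradients with respect to the shared parameters point in opposing directions, i.e. have negative cosine similarity. The sketch below assumes that reading; the function names and the toy gradient vectors are hypothetical and are not drawn from the paper.

```python
import math


def cosine_similarity(grad_a, grad_b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(grad_a, grad_b))
    norm_a = math.sqrt(sum(a * a for a in grad_a))
    norm_b = math.sqrt(sum(b * b for b in grad_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero gradient has no direction, so report no conflict
    return dot / (norm_a * norm_b)


def in_directional_conflict(grad_a, grad_b):
    """Assumed criterion: the shared-parameter gradients oppose each other."""
    return cosine_similarity(grad_a, grad_b) < 0.0


# Toy gradients of two task losses w.r.t. the same shared parameters.
task_1 = [1.0, 0.0]
task_2 = [-1.0, 0.0]   # opposite direction
task_3 = [0.0, 1.0]    # orthogonal direction

print(in_directional_conflict(task_1, task_2))
print(in_directional_conflict(task_1, task_3))
```

Under this assumed definition, a study like the one described would average such a measure over training steps and task pairs for each architecture being compared.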

artificial intelligence, machine learning, natural language, (19 more...)

2212.06645

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

AAAI Conferences May-15-2019

Visual Attention Model for Cross-Sectional Stock Return Prediction and End-to-End Multimodal Market Representation Learning

Technical and fundamental analysis are traditional tools used to analyze individual stocks; however, the finance literature has shown that the price movement of each individual stock correlates heavily with other stocks, especially those within the same sector. In this paper we propose a general-purpose market representation that incorporates fundamental and technical indicators and relationships between individual stocks. We treat the daily stock market as a ‘market image’ where rows (grouped by market sector) represent individual stocks and columns represent indicators. We apply a convolutional neural network over this market image to build market features in a hierarchical way. We use a recurrent neural network, with an attention mechanism over the market feature maps, to model temporal dynamics in the market. We show that our proposed model outperforms strong baselines in both short-term and long-term stock return prediction tasks. We also show another use for our market image: to construct concise and dense market embeddings suitable for downstream prediction tasks.
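The "market image" layout described above (rows are stocks grouped by sector, columns are indicators) can be sketched as a plain 2-D array for one trading day. The record layout, the tickers, the indicator names, and the values below are all placeholders for illustration, not the paper's data or code, and the convolutional and recurrent layers are left out.

```python
def build_market_image(records, indicators):
    """Arrange one day's records into a 2-D grid.

    Rows are stocks, ordered so that stocks from the same sector are adjacent;
    columns follow the order given in `indicators`.
    """
    ordered = sorted(records, key=lambda r: (r["sector"], r["ticker"]))
    row_labels = [r["ticker"] for r in ordered]
    image = [[r["values"][name] for name in indicators] for r in ordered]
    return row_labels, image


# Placeholder records for a single day (hypothetical tickers and values).
day_records = [
    {"ticker": "BBB", "sector": "tech", "values": {"return_1d": 0.01, "pe": 30.0}},
    {"ticker": "AAA", "sector": "energy", "values": {"return_1d": -0.02, "pe": 12.0}},
    {"ticker": "CCC", "sector": "tech", "values": {"return_1d": 0.03, "pe": 25.0}},
]

rows, image = build_market_image(day_records, ["return_1d", "pe"])
print(rows)
print(image)
```

Grouping rows by sector keeps related stocks adjacent, which is presumably what allows a convolution over neighboring rows to pick up within-sector relationships.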

deep learning, market image, neural network, (21 more...)

The Thirty-Second International FLAIRS Conference

Country:

North America > United States (0.14)
Asia > Middle East > Iran (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AAAI Conferences Feb-4-2017

Examining Patterns of Influenza Vaccination in Social Media

Huang, Xiaolei (University of Colorado Boulder) | Smith, Michael C. (George Washington University) | Paul, Michael J. (University of Colorado Boulder) | Ryzhkov, Dmytro (University of Colorado Boulder) | Quinn, Sandra C. (University of Maryland, College Park) | Broniatowski, David A. (George Washington University) | Dredze, Mark (Johns Hopkins University)

Traditional data on influenza vaccination has several limitations: high cost, limited coverage of underrepresented groups, and low sensitivity to emerging public health issues. Social media, such as Twitter, provide an alternative way to understand a population’s vaccination-related opinions and behaviors. In this study, we build and employ several natural language classifiers to examine and analyze behavioral patterns regarding influenza vaccination on Twitter across three dimensions: temporality (by week and month), geography (by US region), and demography (by gender). Our best results are highly correlated with official government data, with a correlation over 0.90, providing validation of our approach. We then suggest a number of directions for future work.

immunology, tweet, us government, (22 more...)

Workshops at the Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

AAAI Conferences Apr-19-2016

Collective Supervision of Topic Models for Predicting Surveys with Social Media

Benton, Adrian (Johns Hopkins University) | Paul, Michael J. (University of Colorado Boulder) | Hancock, Braden (Stanford University) | Dredze, Mark (Johns Hopkins University)

This paper considers survey prediction from social media. We use topic models to correlate social media messages with survey outcomes and to provide an interpretable representation of the data. Rather than rely on fully unsupervised topic models, we use existing aggregated survey data to inform the inferred topics, a class of topic model supervision referred to as collective supervision. We introduce and explore a variety of topic model variants and provide an empirical analysis, with conclusions about the most effective models for this task.

immunology, social media, supervision, (22 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > California > Santa Clara County (0.14)

Genre:

Questionnaire & Opinion Survey (0.69)
Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.96)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

AAAI Conferences Apr-12-2016

Studying Anonymous Health Issues and Substance Use on College Campuses with Yik Yak

Koratana, Animesh (Johns Hopkins University) | Dredze, Mark (Johns Hopkins University) | Chisolm, Margaret S. (Johns Hopkins University) | Johnson, Matthew W. (Johns Hopkins University) | Paul, Michael J. (University of Colorado Boulder)

This study investigates the public health intelligence utility of Yik Yak, a social media platform that allows users to anonymously post and view messages within precise geographic locations. Our dataset contains 122,179 “yaks” collected from 120 college campuses across the United States during 2015. We first present an exploratory analysis of the topics commonly discussed in Yik Yak, clarifying the health issues for which this may serve as a source of information. We then present an in-depth content analysis of data describing substance use, an important public health issue that is not often discussed in public social media, but commonly discussed on Yik Yak under the cloak of anonymity.

immunology, social media, yak, (19 more...)

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > Colorado > Boulder County > Boulder (0.14)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Public Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.96)

Neural Information Processing Systems Dec-31-2012

Factorial LDA: Sparse Multi-Dimensional Text Models

Paul, Michael, Dredze, Mark

Multi-dimensional latent variable models can capture the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional latent variable model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (e.g., methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors.

artificial intelligence, text processing, tuple, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)

AAAI Conferences Nov-5-2012

Investigating Twitter as a Source for Studying Behavioral Responses to Epidemics

Lamb, Alex (Johns Hopkins University) | Paul, Michael J. (Johns Hopkins University) | Dredze, Mark (Johns Hopkins University)

Recent studies have shown an ability to track influenza rates from Twitter, since Twitter users tweet about their illnesses (“i am home sick with the flu”). However, users may also tweet concerned awareness of illness (“don’t want to get sick, need a flu shot”). Identifying these messages can support computational epidemic response models. We present preliminary results for mining tweets that express concerned awareness of influenza. We describe our data set construction and experiments with binary classification of data into influenza versus general messages and classification into concerned awareness and existing infection.

immunology, social media, tweet, (19 more...)

2012 AAAI Fall Symposium Series

Country: North America > United States (0.15)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)