Goto

Collaborating Authors

 Resnick, Paul


Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

arXiv.org Artificial Intelligence

Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment. In particular, ML- and philosophy-oriented alignment research often views AI alignment as a static, unidirectional process (i.e., aiming to ensure that AI systems' objectives match humans) rather than an ongoing, mutual alignment problem [429]. This perspective largely neglects the long-term interaction and dynamic changes of alignment. To understand these gaps, we introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), Machine Learning (ML), and others. We characterize, define and scope human-AI alignment. From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective. This framework encompasses both 1) conventional studies of aligning AI to humans that ensures AI produces the intended outcomes determined by humans, and 2) a proposed concept of aligning humans to AI, which aims to help individuals and society adjust to AI advancements both cognitively and behaviorally. Additionally, we articulate the key findings derived from literature analysis, including discussions about human values, interaction techniques, and evaluations. To pave the way for future studies, we envision three key challenges for future directions and propose examples of potential future solutions.


Spot Check Equivalence: an Interpretable Metric for Information Elicitation Mechanisms

arXiv.org Artificial Intelligence

Because high-quality data is like oxygen for AI systems, effectively eliciting information from crowdsourcing workers has become a first-order problem for developing high-performance machine learning algorithms. Two prevalent paradigms, spot-checking and peer prediction, enable the design of mechanisms to evaluate and incentivize high-quality data from human labelers. So far, at least three metrics have been proposed to compare the performances of these techniques [33, 8, 3]. However, different metrics lead to divergent and even contradictory results in various contexts. In this paper, we harmonize these divergent stories, showing that two of these metrics are actually the same within certain contexts and explain the divergence of the third. Moreover, we unify these different contexts by introducing \textit{Spot Check Equivalence}, which offers an interpretable metric for the effectiveness of a peer prediction mechanism. Finally, we present two approaches to compute spot check equivalence in various contexts, where simulation results verify the effectiveness of our proposed metric.


Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts

arXiv.org Machine Learning

We introduce an adversarial method for producing high-recall explanations of neural text classifier decisions. Building on an existing architecture for extractive explanations via hard attention, we add an adversarial layer which scans the residual of the attention for remaining predictive signal. Motivated by the important domain of detecting personal attacks in social media comments, we additionally demonstrate the importance of manually setting a semantically appropriate `default' behavior for the model by explicitly manipulating its bias term. We develop a validation set of human-annotated personal attacks to evaluate the impact of these changes.


Classifying the Political Leaning of News Articles and Users from User Votes

AAAI Conferences

Social news aggregator services generate readers’ subjective reactions to news opinion articles. Can we use those as a resource to classify articles as liberal or conservative, even without knowing the self-identified political leaning of most users? We applied three semi-supervised learning methods that propagate classifications of political news articles and users as conservative or liberal, based on the assumption that liberal users will vote for liberal articles more often, and similarly for conservative users and articles. Starting from a few labeled articles and users, the algorithms propagate political leaning labels to the entire graph. In cross-validation, the best algorithm achieved 99.6% accuracy on held-out users and 96.3% accuracy on held-out articles. Adding social data such as users’ friendship or text features such as cosine similarity did not improve accuracy. The propagation algorithms, using the subjective liking data from users, also performed better than an SVM based text classifier, which achieved 92.0% accuracy on articles.


The Prevalence of Political Discourse in Non-Political Blogs

AAAI Conferences

Though political theorists have emphasized the importance of political discussion in non-political spaces, past study of online political discussion has focused on primarily political websites. Using a random sample from Blogger.com, we find that 25% of all political posts are from blogs that post about politics less than 20% of the time, because the vast majority of blogs post about politics some of the time but infrequently. Far from being taboo topics in those non- political blogs, political posts got slightly more comments than non-political posts in those same blogs, and the comments overwhelmingly engage the political topics of the post, mostly agreeing but frequently disagreeing as well. We argue that non-political spaces devoted primarily to personal diaries, hobbies, and other topics represent a substantial place of online political discussion and should be a site for further study.