human
- Education (1.00)
- Information Technology (0.93)
c1f0b856a35986348ab3414177266f75-Paper-Conference.pdf
Large language models are now tuned to align with the goals of their creators, namely to be "helpful and harmless." These models should respond helpfully to user questions, but refuse to answer requests that could cause harm. However, adversarial users can construct inputs which circumvent attempts at alignment. In this work, we study adversarial alignment, and ask to what extent these models remain aligned when interacting with an adversarial user who constructs worst-case inputs (adversarial examples). These inputs are designed to cause the model to emit harmful content that would otherwise be prohibited. We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models: even when current NLP-based attacks fail, we can find adversarial inputs with brute force.
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
- Information Technology > Artificial Intelligence > Vision (0.94)
- Information Technology > Sensing and Signal Processing > Image Processing (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Learning to Cooperate with Humans using Generative Agents
Training agents that can coordinate zero-shot with humans is a key mission in multi-agent reinforcement learning (MARL). Current algorithms focus on training simulated human partner policies which are then used to train a Cooperator agent. The simulated human is produced either through behavior cloning over a dataset of human cooperation behavior, or by using MARL to create a population of simulated agents. However, these approaches often struggle to produce a Cooperator that can coordinate well with real humans, since the simulated humans fail to cover the diverse strategies and styles employed by people in the real world. We show \emph{learning a generative model of human partners} can effectively address this issue.
Measuring Per-Unit Interpretability at Scale Without Humans
In today's era, whatever we can measure at scale, we can optimize. So far, measuring the interpretability of units in deep neural networks (DNNs) for computer vision still requires direct human evaluation and is not scalable. As a result, the inner workings of DNNs remain a mystery despite the remarkable progress we have seen in their applications. In this work, we introduce the first scalable method to measure the per-unit interpretability in vision DNNs. This method does not require any human evaluations, yet its prediction correlates well with existing human interpretability measurements.
Beyond Accuracy: Tracking more like Human via Visual Search
Human visual search ability enables efficient and accurate tracking of an arbitrary moving target, which is a significant research interest in cognitive neuroscience. The recently proposed Central-Peripheral Dichotomy (CPD) theory sheds light on how humans effectively process visual information and track moving targets in complex environments. However, existing visual object tracking algorithms still fall short of matching human performance in maintaining tracking over time, particularly in complex scenarios requiring robust visual search skills. These scenarios often involve Spatio-Temporal Discontinuities (i.e., STDChallenge), prevalent in long-term tracking and global instance tracking. To address this issue, we conduct research from a human-like modeling perspective: (1) Inspired by the CPD, we pro- pose a new tracker named CPDTrack to achieve human-like visual search ability.
Human-AI Interaction Design Standards
The rapid development of artificial intelligence (AI) has significantly transformed human-computer interactions, making it essential to establish robust design standards to ensure effective, ethical, and human-centered AI (HCAI) solutions. Standards serve as the foundation for the adoption of new technologies, and human-AI interaction (HAII) standards are critical to supporting the industrialization of AI technology by following an HCAI approach. These design standards aim to provide clear principles, requirements, and guidelines for designing, developing, deploying, and using AI systems, enhancing the user experience and performance of AI systems. Despite their importance, the creation and adoption of HCAI-based interaction design standards face challenges, including the absence of universal frameworks, the inherent complexity of HAII, and the ethical dilemmas that arise in such systems. This chapter provides a comparative analysis of HAII versus traditional human-computer interaction (HCI) and outlines guiding principles for HCAI-based design. It explores international, regional, national, and industry standards related to HAII design from an HCAI perspective and reviews design guidelines released by leading companies such as Microsoft, Google, and Apple. Additionally, the chapter highlights tools available for implementing HAII standards and presents case studies of human-centered interaction design for AI systems in diverse fields, including healthcare, autonomous vehicles, and customer service. It further examines key challenges in developing HAII standards and suggests future directions for the field. Emphasizing the importance of ongoing collaboration between AI designers, developers, and experts in human factors and HCI, this chapter stresses the need to advance HCAI-based interaction design standards to ensure human-centered AI solutions across various domains.
- North America > United States (1.00)
- Europe (0.92)
- Asia > China > Zhejiang Province (0.14)
- Research Report (0.50)
- Overview (0.46)
- Transportation (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (4 more...)
Homo Ratiocinator (Reckoning Human)
Homo Sapiens, "wise human" in Latin, is the taxonomic species name for modern humans. But observing the current state of the world and its trajectory, it is hard for me to accept the description "wise." I am not the first to object to the "sapiens" descriptor. The French philosopher Henri-Louis Bergson argued in 1911 that a better term would be Homo Faber, referring to human tool-making ability. This ability goes back to early humans, about three million years ago. Most importantly, human tools got better and better due to innovation and cultural transmission.
- North America > United States > California (0.05)
- Africa (0.05)