AITopics | Personal

Personal

Interview with Pulkit Verma: Towards safe and reliable behavior of AI agents

AIHub, Oct-24-2024, 08:33:04 GMT

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we hear from Pulkit Verma, recent PhD graduate from Arizona State University. I recently completed my PhD in Computer Science from the School of Computing and Augmented Intelligence, Arizona State University. My research focuses on safe and reliable behavior of AI agents.

ai system, assessment, black-box ai system, (13 more...)

AIHub

Country:

North America > United States > Arizona (0.46)
Asia > India (0.05)

Genre:

Personal > Interview (0.36)
Instructional Material (0.36)

Industry:

Leisure & Entertainment > Sports (0.31)
Education > Educational Setting > K-12 Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.72)
Information Technology > Artificial Intelligence > Robots (0.50)

LoGU: Long-form Generation with Uncertainty Expressions

Yang, Ruihan, Zhang, Caiqi, Zhang, Zhisong, Huang, Xinting, Yang, Sen, Collier, Nigel, Yu, Dong, Yang, Deqing

arXiv.org Artificial Intelligence, Oct-24-2024

While Large Language Models (LLMs) demonstrate impressive capabilities, they still tend to generate factually incorrect content (i.e., hallucinations). A promising approach to mitigate this issue is enabling models to express uncertainty when unsure. Previous research on uncertainty modeling has primarily focused on short-form QA, but real-world applications often require much longer responses. In this work, we introduce the task of Long-form Generation with Uncertainty (LoGU). We identify two key challenges: Uncertainty Suppression, where models hesitate to express uncertainty, and Uncertainty Misalignment, where models convey uncertainty inaccurately. To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline. Our framework adopts a divide-and-conquer strategy, refining uncertainty based on atomic claims. The collected data are then used in training through supervised fine-tuning (SFT) and direct preference optimization (DPO) to enhance uncertainty expression. Extensive experiments on three long-form instruction following datasets show that our method significantly improves accuracy, reduces hallucinations, and maintains the comprehensiveness of responses.
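The claim-level refinement idea in the abstract above can be pictured with a toy sketch. This is not the authors' pipeline: the support labels are assumed to be given (in practice they would come from some fact-checking step), and the names `refine_claims`, `assemble`, and `UNCERTAIN_PREFIX` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical wording for an uncertainty expression (an assumption,
# not taken from the paper).
UNCERTAIN_PREFIX = "I am not certain about the following: "


def refine_claims(claims: List[Tuple[str, bool]]) -> List[str]:
    """Rewrite each atomic claim according to its support label.

    Each item is (claim_text, supported). Supported claims are kept
    unchanged; unsupported claims are prefixed with an uncertainty
    expression instead of being stated as fact.
    """
    refined = []
    for text, supported in claims:
        if supported:
            refined.append(text)
        else:
            refined.append(UNCERTAIN_PREFIX + text)
    return refined


def assemble(refined: List[str]) -> str:
    """Join the refined claims back into one long-form response."""
    return " ".join(refined)


# Illustrative input: two made-up claims with made-up support labels.
example = [
    ("Alice wrote the report.", True),
    ("The report was finished on a Tuesday.", False),
]
response = assemble(refine_claims(example))
```

A response built this way keeps every claim (so comprehensiveness is preserved) while marking only the claims that lack support, which is the divide-and-conquer intuition the abstract describes.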

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.14309

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)
Personal > Obituary (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Health & Medicine > Health Care Technology > Telehealth (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction

Zhou, Sizhe, Meng, Yu, Jin, Bowen, Han, Jiawei

arXiv.org Artificial Intelligence, Oct-24-2024

Relation extraction (RE) aims to identify semantic relationships between entities within text. Despite considerable advancements, existing models predominantly require extensive annotated training data, which is both costly and labor-intensive to collect. Moreover, these models often struggle to adapt to new or unseen relations. Few-shot learning, aiming to lessen annotation demands, typically provides incomplete and biased supervision for target relations, leading to degraded and unstable performance. To accurately and explicitly describe relation semantics while minimizing annotation demands, we explore the definition-only zero-shot RE setting where only relation definitions expressed in natural language are used to train an RE model. We introduce REPaL, comprising three stages: (1) We leverage large language models (LLMs) to generate initial seed instances from relation definitions and an unlabeled corpus. (2) We fine-tune a bidirectional Small Language Model (SLM) with initial seeds to learn relations for the target domain. (3) We expand pattern coverage and mitigate bias from initial seeds by integrating feedback from the SLM's predictions on the unlabeled corpus and the synthesis history. To accomplish this, we leverage the multi-turn conversation ability of LLMs to generate new instances in follow-up dialogues, informed by both the feedback and synthesis history. Studies reveal that definition-oriented seed synthesis enhances pattern coverage whereas indiscriminately increasing seed quantity leads to performance saturation. Experiments on two datasets show REPaL significantly improved cost-effective zero-shot performance by large margins.

large language model, machine learning, relation, (16 more...)

arXiv.org Artificial Intelligence

2402.11142

Country:

Asia > Russia (0.14)
Asia > Singapore (0.04)
Africa > Middle East > Tunisia > Tunis Governorate > Tunis (0.04)
(30 more...)

Genre: Personal (0.93)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Olympic Games (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers

Osborne, Cailean, Daneshyan, Farbod, He, Runzhi, Ye, Hengzhi, Zhang, Yuxia, Zhou, Minghui

arXiv.org Artificial Intelligence, Oct-23-2024

Companies, including market rivals, have long collaborated on the development of open source software (OSS), resulting in a tangle of co-operation and competition known as "open source co-opetition". While prior work investigates open source co-opetition in OSS projects that are hosted by vendor-neutral foundations, we have a limited understanding thereof in OSS projects that are hosted and governed by one company. Given their prevalence, it is timely to investigate open source co-opetition in such contexts. Towards this end, we conduct a mixed-methods analysis of three company-hosted OSS projects in the artificial intelligence (AI) industry: Meta's PyTorch (prior to its donation to the Linux Foundation), Google's TensorFlow, and Hugging Face's Transformers. We contribute three key findings. First, while the projects exhibit similar code authorship patterns between host and external companies (80%/20% of commits), collaborations are structured differently (e.g., decentralised vs. hub-and-spoke networks). Second, host and external companies engage in strategic, non-strategic, and contractual collaborations, with varying incentives and collaboration practices. Some of the observed collaborations are specific to the AI industry (e.g., hardware-software optimizations or AI model integrations), while others are typical of the broader software industry (e.g., bug fixing or task outsourcing). Third, single-vendor governance creates a power imbalance that influences open source co-opetition practices and possibilities, from the host company's singular decision-making power (e.g., the risk of license change) to their community involvement strategy (e.g., from over-control to over-delegation). We conclude with recommendations for future research.

artificial intelligence, collaboration, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2410.18241

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
(16 more...)

Genre:

Research Report > New Finding (0.68)
Personal > Interview (0.46)

Industry:

Information Technology > Services (1.00)
Law (0.93)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

'I am valued here': the extraordinary film that recreates a disabled boy's rich digital life

The Guardian, Oct-22-2024, 14:04:43 GMT

The night after their son Mats died aged just 25, Trude and Robert Steen sat on the sofa in their living room in Oslo with their daughter Mia. "Everything was a blur," remembers Trude of that day 10 years ago. "Then Robert said, 'Maybe we should reach out to Mats' friends in World of Warcraft.'" Mats was born with Duchenne muscular dystrophy, a progressive condition that causes the muscles to weaken gradually. He was diagnosed aged four and started using a wheelchair at 10.

mat, remarkable life, trude and robert, (15 more...)

The Guardian

Country:

Europe > Norway > Eastern Norway > Oslo (0.25)
North America > United States > California (0.05)

Genre: Personal > Obituary (0.56)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Games > Computer Games (0.71)

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (0.61)

TechScape: Elon Musk's global political goals

The Guardian, Oct-22-2024, 13:22:58 GMT

Today in TechScape I'm deciphering Elon Musk's global political goals, a remarkable documentary filmed within World of Warcraft, polling on support for school phone bans, and cats on TikTok. Thank you for joining me. First, let's talk about Musk's global politics. Over the weekend, Musk pledged to give away $1m a day to registered voters in battleground states in the US who sign his Pac's petition in support of the first and second amendments. He awarded the first prize, a novelty check the size of a kitchen island, at a Pennsylvania rally on Saturday and the second on Sunday in Pittsburgh.

musk, regulator, techscape, (15 more...)

The Guardian

Country:

North America > United States > Pennsylvania (0.25)
North America > United States > California (0.15)
South America > Brazil (0.05)
(2 more...)

Genre: Personal (0.69)

Industry:

Leisure & Entertainment (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.96)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.92)

ERABAL: Enhancing Role-Playing Agents through Boundary-Aware Learning

Tang, Yihong, Ou, Jiao, Liu, Che, Zhang, Fuzheng, Zhang, Di, Gai, Kun

arXiv.org Artificial Intelligence, Oct-22-2024

Role-playing is an emerging application in the field of Human-Computer Interaction (HCI), primarily implemented through the alignment training of a large language model (LLM) with assigned characters. Despite significant progress, role-playing agents (RPLAs) still struggle with maintaining role-consistency across conversations, particularly when confronted with boundary queries subtly related to character attributes. In this paper, we present ERABAL, a framework aimed at enhancing RPLAs' role-playing capabilities through boundary-aware learning. ERABAL encompasses a generation pipeline for role-specific dialogues and a concomitant methodology for alignment training. Through comprehensive evaluations, we demonstrate that ERABAL is both efficient and effective. By training with significantly fewer dialogues than those used in leading approaches, ERABAL achieves notable improvements across WikiRoleEval, CharacterEval, and the role-playing subset of MT-Bench compared to the generalist baseline models. Our code and datasets will be made publicly available to support further research.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2409.1471

Country:

Europe > Greece (0.04)
Europe > Russia (0.04)
Europe > France (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.92)
Personal > Interview (0.68)

Industry:

Leisure & Entertainment (0.92)
Consumer Products & Services (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

WHoW: A Cross-domain Approach for Analysing Conversation Moderation

Chen, Ming-Bin, Frermann, Lea, Lau, Jey Han

arXiv.org Artificial Intelligence, Oct-20-2024

We propose WHoW, an evaluation framework for analyzing the facilitation strategies of moderators across different domains/scenarios by examining their motives (Why), dialogue acts (How) and target speaker (Who). Using this framework, we annotated 5,657 moderation sentences with human judges and 15,494 sentences with GPT-4o from two domains: TV debates and radio panel discussions. Comparative analysis demonstrates the framework's cross-domain generalisability and reveals distinct moderation strategies: debate moderators emphasise coordination and facilitate interaction through questions and instructions, while panel discussion moderators prioritize information provision and actively participate in discussions. Our analytical framework works for different moderation scenarios, enhances our understanding of moderation behaviour through automatic large-scale analysis, and facilitates the development of moderator agents.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2410.15551

Country:

North America > United States (0.28)
North America > Mexico (0.04)
Africa > Middle East > Egypt (0.04)

Genre:

Personal > Interview (0.67)
Research Report > Experimental Study (0.46)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.92)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Evaluation of Human-Robot Interfaces based on 2D/3D Visual and Haptic Feedback for Aerial Manipulation

Mellet, Julien, Allenspach, Mike, Cuniato, Eugenio, Pacchierotti, Claudio, Siegwart, Roland, Tognon, Marco

arXiv.org Artificial Intelligence, Oct-20-2024

Most telemanipulation systems for aerial robots provide the operator with only 2D screen visual information. The lack of richer information about the robot's status and environment can limit human awareness and, in turn, task performance. While the pilot's experience can often compensate for this reduced flow of information, providing richer feedback is expected to reduce the cognitive workload and offer a more intuitive experience overall. This work aims to understand the significance of providing additional pieces of information during aerial telemanipulation, namely (i) 3D immersive visual feedback about the robot's surroundings through mixed reality (MR) and (ii) 3D haptic feedback about the robot interaction with the environment. To do so, we developed a human-robot interface able to provide this information. First, we demonstrate its potential in a real-world manipulation task requiring sub-centimeter-level accuracy. Then, we evaluate the individual effect of MR vision and haptic feedback on both dexterity and workload through a human subjects study involving a virtual block transportation task. Results show that both 3D MR vision and haptic feedback improve the operator's dexterity in the considered teleoperated aerial interaction tasks. Nevertheless, pilot experience remains the most significant factor.

artificial intelligence, human computer interaction, operator, (17 more...)

arXiv.org Artificial Intelligence

2410.15398

Country:

Europe > Switzerland > Zürich > Zürich (0.15)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > Minnesota (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Personal (1.00)

Industry:

Health & Medicine (1.00)
Government (0.94)
Education (0.93)
Aerospace & Defense > Aircraft (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.93)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.61)

When Machine Unlearning Meets Retrieval-Augmented Generation (RAG): Keep Secret or Forget Knowledge?

Wang, Shang, Zhu, Tianqing, Ye, Dayong, Zhou, Wanlei

arXiv.org Artificial Intelligence, Oct-19-2024

The deployment of large language models (LLMs) like ChatGPT and Gemini has shown their powerful natural language generation capabilities. However, these models can inadvertently learn and retain sensitive information and harmful content during training, raising significant ethical and legal concerns. To address these issues, machine unlearning has been introduced as a potential solution. While existing unlearning methods take into account the specific characteristics of LLMs, they often suffer from high computational demands, limited applicability, or the risk of catastrophic forgetting. To address these limitations, we propose a lightweight unlearning framework based on Retrieval-Augmented Generation (RAG) technology. By modifying the external knowledge base of RAG, we simulate the effects of forgetting without directly interacting with the unlearned LLM. We approach the construction of unlearned knowledge as a constrained optimization problem, deriving two key components that underpin the effectiveness of RAG-based unlearning. This RAG-based approach is particularly effective for closed-source LLMs, where existing unlearning methods often fail. We evaluate our framework through extensive experiments on both open-source and closed-source models, including ChatGPT, Gemini, Llama-2-7b-chat-hf, and PaLM 2. The results demonstrate that our approach meets five key unlearning criteria: effectiveness, universality, harmlessness, simplicity, and robustness. Meanwhile, this approach can extend to multimodal large language models and LLM-based agents.
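The core idea above, simulating forgetting by editing the external knowledge base rather than the model, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' framework: retrieval is plain word overlap, the LLM itself is left out, and `Entry`, `retrieve`, `add_unlearned_entry`, and `build_prompt` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key_text: str  # text matched against the query during retrieval
    content: str   # text passed to the LLM as context


def tokenize(text: str) -> set:
    """Lowercase and split on whitespace (toy tokenizer)."""
    return set(text.lower().split())


def retrieve(query: str, knowledge_base: List[Entry]) -> Optional[Entry]:
    """Return the entry with the largest word overlap, or None if no overlap."""
    q = tokenize(query)
    best, best_score = None, 0
    for entry in knowledge_base:
        score = len(q & tokenize(entry.key_text))
        if score > best_score:
            best, best_score = entry, score
    return best


def add_unlearned_entry(knowledge_base: List[Entry], topic: str) -> None:
    """Add an entry that is retrieved for the topic and tells the model to refuse."""
    knowledge_base.append(Entry(
        key_text=topic,
        content=f"The topic '{topic}' is confidential. "
                "Decline to answer questions about it.",
    ))


def build_prompt(query: str, knowledge_base: List[Entry]) -> str:
    """Assemble the prompt that would be sent to the (unmodified) LLM."""
    hit = retrieve(query, knowledge_base)
    context = hit.content if hit else ""
    return f"Context: {context}\nQuestion: {query}"


kb = [Entry("python language", "Python is a programming language.")]
add_unlearned_entry(kb, "harry potter")  # illustrative topic to "forget"
prompt = build_prompt("who is harry potter", kb)
```

Because only the knowledge base changes, the same approach applies to a closed-source model reached through an API, which is the setting the abstract highlights.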

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.15267

Country:

North America > United States > New York (0.04)
North America > United States > Illinois > Sangamon County > Springfield (0.04)
Asia > Nepal (0.04)
Asia > Macao (0.04)

Genre:

Personal > Honors (0.46)
Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
