AITopics | Personal

Personal

Exploring Practitioner Perspectives On Training Data Attribution Explanations

Nguyen, Elisa, Kortukov, Evgenii, Song, Jean Y., Oh, Seong Joon

arXiv.org Artificial Intelligence | Nov-22-2023

Explainable AI (XAI) aims to provide insight into opaque model reasoning to humans and as such is an interdisciplinary field by nature. In this paper, we interviewed 10 practitioners to understand the possible usability of training data attribution (TDA) explanations and to explore the design space of such an approach. We confirmed that training data quality is often the most important factor for high model performance in practice and model developers mainly rely on their own experience to curate data. End-users expect explanations to enhance their interaction with the model and do not necessarily prioritise but are open to training data as a means of explanation. Within our participants, we found that TDA explanations are not well-known and therefore not used. We urge the community to focus on the utility of TDA techniques from the human-machine collaboration perspective and broaden the TDA evaluation to reflect common use cases in practice.

explanation, participant, training data, (14 more...)

arXiv.org Artificial Intelligence

2310.20477

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands (0.04)
(6 more...)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.35)

As OpenAI chaos mounts, talks to bring back Sam Altman continue

Washington Post - Technology News | Nov-21-2023, 17:43:46 GMT

Altman's sudden move to join Microsoft is not finalized, Satya Nadella, CEO of Microsoft, signaled in an interview with CNBC on Monday. A person familiar with the matter said Altman would only return to OpenAI if the board members who ousted him stepped down. In the CNBC interview on Monday afternoon, Nadella sought to assure customers and investors that his company was on solid ground no matter the outcome. He left the door open for Altman to return to OpenAI or continue on as an AI leader at Microsoft, even though he announced late Sunday night that Altman was coming to Microsoft. "I'm open to both options," Nadella said in the interview with CNBC.

bring back sam altman continue, microsoft, openai chaos mount, (3 more...)

Washington Post - Technology News

Genre: Personal > Interview (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.97)

Christopher Nolan on the Promise and Peril of Technology

The Atlantic - Technology | Nov-20-2023, 14:00:00 GMT

By the time I sat down with Christopher Nolan in his posh hotel suite not far from the White House, I guessed that he was tired of Washington, D.C. The day before, he'd toured the Oval Office and had lunch on Capitol Hill. Later that night, I'd watched him receive an award from the Federation of American Scientists, an organization that counts Robert Oppenheimer, the subject of Nolan's most recent film, among its founders. He'd endured a joke, repeated too many times by Senate Majority Leader Chuck Schumer, about the subject of his next film--"It's another biopic: Schumer." The award was sitting on an end table next to Nolan, who was dressed in brown slacks, a gray vest, and a navy suit jacket--his Anglo-formality undimmed by decades spent living in Los Angeles. "It's heavy, and glass, and good for self-defense," he said of the award, while filling his teacup.

andersen, nolan, oppenheimer, (16 more...)

The Atlantic - Technology

Country:

North America > United States > District of Columbia > Washington (0.24)
North America > United States > California > Los Angeles County > Los Angeles (0.24)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)
Asia > China > Hong Kong (0.04)

Genre: Personal (0.47)

Industry:

Media > Film (1.00)
Government > Regional Government > North America Government > United States Government (0.69)

Technology:

Information Technology > Artificial Intelligence (0.70)
Information Technology > Communications (0.47)

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Rein, David, Hou, Betty Li, Stickland, Asa Cooper, Petty, Jackson, Pang, Richard Yuanzhe, Dirani, Julien, Michael, Julian, Bowman, Samuel R.

arXiv.org Artificial Intelligence | Nov-20-2023

We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert validators only reach 34% accuracy, despite spending on average over 30 minutes with unrestricted access to the web (i.e., the questions are "Google-proof"). The questions are also difficult for state-of-the-art AI systems, with our strongest GPT-4 based baseline achieving 39% accuracy. If we are to use future AI systems to help us answer very hard questions, for example, when developing new scientific knowledge, we need to develop scalable oversight methods that enable humans to supervise their outputs, which may be difficult even if the supervisors are themselves skilled and knowledgeable. The difficulty of GPQA both for skilled non-experts and frontier AI systems should enable realistic scalable oversight experiments, which we hope can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.

accuracy, expert validator, validator, (14 more...)

arXiv.org Artificial Intelligence

2311.12022

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Middle East > Israel (0.04)
(8 more...)

Genre:

Research Report (1.00)
Personal > Obituary (1.00)

Industry:

Materials > Chemicals (1.00)
Health & Medicine (1.00)
Education (1.00)
Leisure & Entertainment > Sports > Boxing (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

System 2 Attention (is something you might need too)

Weston, Jason, Sukhbaatar, Sainbayar

arXiv.org Artificial Intelligence | Nov-20-2023

Soft attention in Transformer-based Large Language Models (LLMs) is susceptible to incorporating irrelevant information from the context into its latent representations, which adversely affects next token generations. To help rectify these issues, we introduce System 2 Attention (S2A), which leverages the ability of LLMs to reason in natural language and follow instructions in order to decide what to attend to. S2A regenerates the input context to only include the relevant portions, before attending to the regenerated context to elicit the final response. In experiments, S2A outperforms standard attention-based LLMs on three tasks containing opinion or irrelevant information: QA, math word problems and longform generation, where S2A increases factuality and objectivity, and decreases sycophancy.

arxiv preprint arxiv, llama-2-70b-chat, system 2, (14 more...)

arXiv.org Artificial Intelligence

2311.11829

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.05)
North America > United States > California > Santa Clara County > Saratoga (0.04)

Genre:

Personal (0.68)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Lost in the Middle: How Language Models Use Long Contexts

Liu, Nelson F., Lin, Kevin, Hewitt, John, Paranjape, Ashwin, Bevilacqua, Michele, Petroni, Fabio, Liang, Percy

arXiv.org Artificial Intelligence | Nov-20-2023

While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do not robustly make use of information in long input contexts. In particular, we observe that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts, even for explicitly long-context models. Our analysis provides a better understanding of how language models use their input context and provides new evaluation protocols for future long-context language models.

information, input context, language model, (15 more...)

arXiv.org Artificial Intelligence

2307.03172

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Germany (0.04)

Genre:

Personal > Honors (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)

Interview with Dautzenberg Roman: #IROS2023 Best Paper Award on Mobile Manipulation sponsored by OMRON Sinic X Corp.

Robohub | Nov-19-2023, 09:00:38 GMT

Congratulations to Dautzenberg Roman and his team of researchers, who won the IROS 2023 Best Paper Award on Mobile Manipulation sponsored by OMRON Sinic X Corp. for their paper "A perching and tilting aerial robot for precise and versatile power tool work on vertical walls". Below, the authors tell us more about their work, the methodology, and what they are planning next. Our paper shows an aerial robot (think "drone") which can exert large forces in the horizontal direction, i.e. onto walls. This is a difficult task, as UAVs usually rely on thrust vectoring to apply horizontal forces and thus can only apply small forces before losing control authority. By perching onto walls, our system no longer needs the propulsion to remain at a desired site.

aerial robot, dautzenberg roman, mobile manipulation, (8 more...)

Robohub

Genre: Personal > Honors (0.62)

Technology: Information Technology > Artificial Intelligence (0.88)

Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

Qin, Chengwei, Zhang, Aston, Zhang, Zhuosheng, Chen, Jiaao, Yasunaga, Michihiro, Yang, Diyi

arXiv.org Artificial Intelligence | Nov-19-2023

Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot -- i.e., without adaptation on downstream data. Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community due to the fact that it can generate high-quality responses to human input and self-correct previous mistakes based on subsequent conversations. However, it is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot. In this work, we empirically analyze the zero-shot learning ability of ChatGPT by evaluating it on 20 popular NLP datasets covering 7 representative task categories. With extensive empirical studies, we demonstrate both the effectiveness and limitations of the current version of ChatGPT. We find that ChatGPT performs well on many tasks favoring reasoning capabilities (e.g., arithmetic reasoning) while it still faces challenges when solving specific tasks such as sequence tagging. We additionally provide in-depth analysis through qualitative case studies.

answer choice, arabic numeral, gpt-3, (11 more...)

arXiv.org Artificial Intelligence

2302.06476

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Italy (0.14)
(15 more...)

Genre:

Personal (1.00)
Research Report (0.81)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Consumer Health (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Who Is Mira Murati, OpenAI's New Interim CEO?

WIRED | Nov-17-2023, 23:14:24 GMT

Until the dramatic departure of OpenAI's cofounder and CEO Sam Altman Friday, Mira Murati was its chief technology officer--but you could also call her its minister of truth. In addition to heading the teams that develop tools such as ChatGPT and Dall-E, it's been her job to make sure those products don't mislead people, show bias, or snuff out humanity altogether. This interview was conducted in July 2023 for WIRED's cover story on OpenAI. It is being published today after Sam Altman's sudden departure to provide a glimpse at the thinking of the powerful AI company's new boss. Steven Levy: How did you come to join OpenAI?

mira murati, new interim ceo, openai, (1 more...)

WIRED

Genre: Personal > Interview (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

OpenAI CEO Sam Altman ousted as 'board no longer has confidence' in his leadership

Engadget | Nov-17-2023, 20:49:24 GMT

In a surprise shakeup of its C-suite Friday, OpenAI's board of directors announced that CEO Sam Altman has been fired and will be leaving both the company and the board, effective immediately. Chief Technology Officer Mira Murati has been named interim CEO. Altman's ouster reportedly follows an internal "deliberative review process" which found he had not been "consistently candid in his communications with the board, hindering its ability to exercise its responsibilities," the company announced. As such, "the board no longer has confidence in his ability to continue leading OpenAI." OpenAI, which owns popular AI chatbot ChatGPT, thanked Altman for his "many contributions to the founding and growth of OpenAI," but believes that "as the leader of the company's research, product, and safety functions, Mira is exceptionally qualified to step into the role of interim CEO." The board added it has "the utmost confidence in her ability to lead OpenAI during this transition period."

altman, board no longer, openai, (5 more...)

Engadget

Country:

North America > United States > California > San Francisco County > San Francisco (0.06)
Asia (0.06)

Genre: Personal (0.58)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
