AITopics | Personal

Personal

The Trendy New Trivia Game That's Like Wordle for Straight Men

Slate, Oct-1-2023, 15:00:00 GMT

We are in the midst of an unprecedented, intergenerational phone-game renaissance. Wordle has become a pillar of the New York Times brand, newspapers everywhere are resurrecting their crossword backpage, and Words With Friends has essentially transformed into a dating app. These games are designed to be approachably mainstream--every English speaker alive can deduce a five-letter word with six chances--but unfortunately, I am a man of unconventional taste. If I'm going to entertain a daily dose of potpourri, I need something weirder, more challenging, and better suited for the precise category of useless knowledge that occupies my brain. That's why the sports-trivia game Immaculate Grid has become a fixture of my morning routine.

bailey zappe, guy-remembering, immaculate grid, (9 more...)

Slate

Country:

North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.05)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.05)
North America > United States > New York (0.05)
(6 more...)

Genre: Personal > Human Interest (0.40)

Industry:

Leisure & Entertainment > Sports > Basketball (1.00)
Leisure & Entertainment > Sports > Football (0.95)
Leisure & Entertainment > Games > Computer Games (0.71)

Technology:

Information Technology > Communications (0.49)
Information Technology > Artificial Intelligence (0.35)


The Robots are Here: Navigating the Generative AI Revolution in Computing Education

Prather, James, Denny, Paul, Leinonen, Juho, Becker, Brett A., Albluwi, Ibrahim, Craig, Michelle, Keuning, Hieke, Kiesler, Natalie, Kohn, Tobias, Luxton-Reilly, Andrew, MacNeil, Stephen, Peterson, Andrew, Pettit, Raymond, Reeves, Brent N., Savelka, Jaromir

arXiv.org Artificial Intelligence, Oct-1-2023

Recent advancements in artificial intelligence (AI) are fundamentally reshaping computing, with large language models (LLMs) now effectively being able to generate and interpret source code and natural language instructions. These emergent capabilities have sparked urgent questions in the computing education community around how educators should adapt their pedagogy to address the challenges and to leverage the opportunities presented by this new technology. In this working group report, we undertake a comprehensive exploration of LLMs in the context of computing education and make five significant contributions. First, we provide a detailed review of the literature on LLMs in computing education and synthesise findings from 71 primary articles. Second, we report the findings of a survey of computing students and instructors from across 20 countries, capturing prevailing attitudes towards LLMs and their use in computing education contexts. Third, to understand how pedagogy is already changing, we offer insights collected from in-depth interviews with 22 computing educators from five continents who have already adapted their curricula and assessments. Fourth, we use the ACM Code of Ethics to frame a discussion of ethical issues raised by the use of large language models in computing education, and we provide concrete advice for policy makers, educators, and students. Finally, we benchmark the performance of LLMs on various computing education datasets, and highlight the extent to which the capabilities of current models are rapidly improving. Our aim is that this report will serve as a focal point for both researchers and practitioners who are exploring, adapting, using, and evaluating LLMs and LLM-based tools in computing classrooms.

arxiv preprint arxiv, student and instructor, technical symposium, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3623762.3633499

2310.00658

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.13)
(43 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Personal > Interview (1.00)
(3 more...)

Industry:

Law (1.00)
Government > Military (1.00)
Education > Educational Setting > Higher Education (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)


RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Wang, Zekun Moore, Peng, Zhongyuan, Que, Haoran, Liu, Jiaheng, Zhou, Wangchunshu, Wu, Yuhan, Guo, Hongcheng, Gan, Ruitong, Ni, Zehao, Zhang, Man, Zhang, Zhaoxiang, Ouyang, Wanli, Xu, Ke, Chen, Wenhu, Fu, Jie, Peng, Junran

arXiv.org Artificial Intelligence, Oct-1-2023

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4).

enhancing role-playing ability, instruction, rolegpt, (12 more...)

arXiv.org Artificial Intelligence

2310.00746

Country:

Europe > United Kingdom > Scotland (0.04)
Asia > China > Hong Kong (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(3 more...)

Genre:

Personal > Interview (1.00)
Research Report (0.82)

Industry:

Education (0.92)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

Mündler, Niels, He, Jingxuan, Jenko, Slobodan, Vechev, Martin

arXiv.org Artificial Intelligence, Oct-1-2023

Large language models (large LMs) are susceptible to producing text that contains hallucinated content. An important instance of this problem is self-contradiction, where the LM generates two contradictory sentences within the same context. In this work, we present a comprehensive investigation into self-contradiction for various instruction-tuned LMs, covering evaluation, detection, and mitigation. Our analysis reveals the prevalence of self-contradictions when LMs generate text for open-domain topics, e.g., in 17.7% of all sentences produced by ChatGPT. Self-contradiction also complements retrieval-based methods, as a large portion of these self-contradictions (e.g., 35.8% for ChatGPT) cannot be verified using Wikipedia. We then propose a novel prompting-based framework designed to effectively detect and mitigate self-contradictions. Our detector achieves high accuracy, e.g., around 80% F1 score when prompting ChatGPT. The mitigation algorithm iteratively refines the generated text to remove contradictory information while preserving text fluency and informativeness. Importantly, our entire framework is applicable to black-box LMs and does not require external grounded knowledge. Our approach is practically effective and has been released as a push-button tool to benefit the public, available at https://chatprotect.ai/.

alm, chatgpt, freeman, (15 more...)

arXiv.org Artificial Intelligence

2305.15852

Country:

North America > Cuba (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Nebraska (0.04)
(29 more...)

Genre:

Personal (1.00)
Research Report (0.65)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports (1.00)
Media > Music (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Holistic Evaluation of Language Models

Liang, Percy, Bommasani, Rishi, Lee, Tony, Tsipras, Dimitris, Soylu, Dilara, Yasunaga, Michihiro, Zhang, Yian, Narayanan, Deepak, Wu, Yuhuai, Kumar, Ananya, Newman, Benjamin, Yuan, Binhang, Yan, Bobby, Zhang, Ce, Cosgrove, Christian, Manning, Christopher D., Ré, Christopher, Acosta-Navas, Diana, Hudson, Drew A., Zelikman, Eric, Durmus, Esin, Ladhak, Faisal, Rong, Frieda, Ren, Hongyu, Yao, Huaxiu, Wang, Jue, Santhanam, Keshav, Orr, Laurel, Zheng, Lucia, Yuksekgonul, Mert, Suzgun, Mirac, Kim, Nathan, Guha, Neel, Chatterji, Niladri, Khattab, Omar, Henderson, Peter, Huang, Qian, Chi, Ryan, Xie, Sang Michael, Santurkar, Shibani, Ganguli, Surya, Hashimoto, Tatsunori, Icard, Thomas, Zhang, Tianyi, Chaudhary, Vishrav, Wang, William, Li, Xuechen, Mai, Yifan, Zhang, Yuhui, Koreeda, Yuta

arXiv.org Artificial Intelligence, Oct-1-2023

Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest for LMs. Then we select a broad subset based on coverage and feasibility, noting what's missing or underrepresented (e.g. question answering for neglected English dialects, metrics for trustworthiness). Second, we adopt a multi-metric approach: We measure 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency) for each of 16 core scenarios when possible (87.5% of the time). This ensures metrics beyond accuracy don't fall to the wayside, and that trade-offs are clearly exposed. We also perform 7 targeted evaluations, based on 26 targeted scenarios, to analyze specific aspects (e.g. reasoning, disinformation). Third, we conduct a large-scale evaluation of 30 prominent language models (spanning open, limited-access, and closed models) on all 42 scenarios, 21 of which were not previously used in mainstream LM evaluation. Prior to HELM, models on average were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: now all 30 models have been densely benchmarked on the same core scenarios and metrics under standardized conditions. Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly for further analysis, as well as a general modular toolkit. We intend for HELM to be a living benchmark for the community, continuously updated with new scenarios, metrics, and models.
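The counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: it assumes the 87.5% figure applies to metric-scenario pairs (7 metrics times 16 core scenarios), which the abstract implies but does not spell out, and all variable and function names are hypothetical.

```python
# Figures quoted in the HELM abstract.
METRICS = 7                # accuracy, calibration, robustness, fairness, bias, toxicity, efficiency
CORE_SCENARIOS = 16
TARGETED_SCENARIOS = 26
MEASURED_FRACTION = 0.875  # "when possible (87.5% of the time)"


def metric_scenario_pairs(metrics: int, scenarios: int) -> int:
    """Number of (metric, scenario) cells in a full multi-metric grid."""
    return metrics * scenarios


# Assumption: the 87.5% refers to cells of the 7 x 16 grid.
total_pairs = metric_scenario_pairs(METRICS, CORE_SCENARIOS)
measured_pairs = round(total_pairs * MEASURED_FRACTION)

# Core plus targeted scenarios should match the "all 42 scenarios" in the abstract.
total_scenarios = CORE_SCENARIOS + TARGETED_SCENARIOS

print(total_pairs, measured_pairs, total_scenarios)
```

Under that reading, 98 of the 112 possible metric-scenario cells are measured, and the 16 core and 26 targeted scenarios add up to the 42 scenarios mentioned later in the abstract.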

civilcomment raft boolq narrativeqa naturalquestion, civilcomment raft mmlu boolqnarrativeqa naturalquestion, demographic representation and stereotypical association, (13 more...)

arXiv.org Artificial Intelligence

2211.0911

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > United States > Washington > King County > Seattle (0.13)
(49 more...)

Genre:

Research Report > New Finding (1.00)
Personal (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(8 more...)


We Found Something Strange Under Our Son's Bed. What He's Using It For Is Even Stranger.

Slate, Sep-30-2023, 16:00:00 GMT

How to Do It is Slate's sex advice column. My husband and I have an awesome, intelligent 14-year-old son who identifies as bisexual. We are totally accepting and supportive of him. He has had a few short-lived crushes on different genders, though he doesn't seem to be particularly interested in dating right now. His internet search histories are pretty benign--mostly video game stuff, and the occasional search for "hot girls" and "boobs."

corrina, masturbation, orgasm, (15 more...)

Slate

Genre: Personal > Human Interest (0.40)

Industry:

Health & Medicine (0.69)
Education > Curriculum > Health & Wellness Education > Sex Education (0.40)

Technology: Information Technology > Artificial Intelligence (0.34)


Duchess Sarah Ferguson's former personal assistant murdered: 'I'm shocked and saddened'

FOX News, Sep-30-2023, 13:07:03 GMT

Sarah Ferguson expressed her shock and grief as she mourned the death of her former personal assistant, Jenean Chapman, who was murdered in Texas this week. The 63-year-old Duchess of York paid tribute to Chapman in an Instagram post that she shared on Thursday. "I am shocked and saddened to learn that Jenean Chapman, who worked with me as my personal assistant many years ago, has been murdered in Dallas aged just 46. A suspect is in custody," Ferguson wrote.

chapman, ferguson, sarah ferguson, (12 more...)

FOX News

Country:

North America > United States > Texas (0.30)
North America > United States > New York (0.07)
North America > United States > California > San Francisco County > San Francisco (0.05)

Genre: Personal (0.32)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.72)
Media (0.56)
Health & Medicine (0.52)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.84)


"With Great Power Comes Great Responsibility!": Student and Instructor Perspectives on the influence of LLMs on Undergraduate Engineering Education

Joshi, Ishika, Budhiraja, Ritvik, Tanna, Pranav Deepak, Jain, Lovenya, Deshpande, Mihika, Srivastava, Arjun, Rallapalli, Srinivas, Akolekar, Harshal D, Challa, Jagat Sesh, Kumar, Dhruv

arXiv.org Artificial Intelligence, Sep-30-2023

The rise in popularity of Large Language Models (LLMs) has prompted discussions in academic circles, with students exploring LLM-based tools for coursework inquiries and instructors exploring them for teaching and research. Even though a lot of work is underway to create LLM-based tools tailored for students and instructors, there is a lack of comprehensive user studies that capture the perspectives of students and instructors regarding LLMs. This paper addresses this gap by conducting surveys and interviews within undergraduate engineering universities in India. Using 1306 survey responses among students, 112 student interviews, and 27 instructor interviews around the academic usage of ChatGPT (a popular LLM), this paper offers insights into the current usage patterns, perceived benefits, threats, and challenges, as well as recommendations for enhancing the adoption of LLMs among students and instructors. These insights are further utilized to discuss the practical implications of LLMs in undergraduate engineering education and beyond.

chatgpt, computing machinery, student, (14 more...)

arXiv.org Artificial Intelligence

2309.10694

Country:

North America > United States > New York > New York County > New York City (0.06)
Europe > Finland > Southwest Finland > Turku (0.04)
North America > Canada > Ontario > Toronto (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Personal > Interview (0.67)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Gou, Zhibin, Shao, Zhihong, Gong, Yeyun, Shen, Yelong, Yang, Yujiu, Duan, Nan, Chen, Weizhu

arXiv.org Artificial Intelligence, Sep-30-2023

Recent developments in large language models (LLMs) have been impressive. However, these models sometimes show inconsistencies and problematic behavior, such as hallucinating facts, generating flawed code, or creating offensive and toxic content. Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking, or a code interpreter for debugging. Inspired by this observation, we introduce a framework called CRITIC that allows LLMs, which are essentially "black boxes," to validate and progressively amend their own outputs in a manner similar to human interaction with tools. More specifically, starting with an initial output, CRITIC interacts with appropriate tools to evaluate certain aspects of the text, and then revises the output based on the feedback obtained during this validation process. Comprehensive evaluations involving free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs. Meanwhile, our research highlights the crucial importance of external feedback in promoting the ongoing self-improvement of LLMs.
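The verify-then-revise cycle described in the abstract above can be illustrated with a toy loop. This is a minimal sketch, not the authors' implementation: `critic_loop`, `toy_verify`, and `toy_revise` are hypothetical names, and a trivial arithmetic check stands in for both the external tool and the LLM revision step.

```python
from typing import Callable, Optional


def critic_loop(
    initial_output: str,
    verify: Callable[[str], Optional[str]],
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Repeatedly check an output with an external tool and revise it.

    `verify` returns a critique string, or None when it finds no problem.
    `revise` produces a new output from the old output and the critique.
    """
    output = initial_output
    for _ in range(max_rounds):
        critique = verify(output)
        if critique is None:
            break  # the tool found nothing to fix
        output = revise(output, critique)
    return output


def toy_verify(text: str) -> Optional[str]:
    """Stand-in for a tool: checks a claim of the form 'a + b = c'."""
    left, claimed = text.split("=")
    a, b = (int(part) for part in left.split("+"))
    actual = a + b
    return None if int(claimed) == actual else f"expected {actual}"


def toy_revise(text: str, critique: str) -> str:
    """Stand-in for the model's revision step: applies the tool's correction."""
    left, _ = text.split("=")
    return f"{left.strip()} = {critique.split()[-1]}"


result = critic_loop("2 + 3 = 6", toy_verify, toy_revise)
print(result)
```

In a real setting the two callables would wrap a search engine, code interpreter, or similar tool and a call to the language model; the loop structure is the only part this sketch is meant to convey.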

answer plausible, elizabeth perkin, track and field title, (15 more...)

arXiv.org Artificial Intelligence

2305.11738

Country:

Asia > North Korea (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Georgia (0.14)
(48 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)
Personal > Honors > Award (0.46)

Industry:

Transportation (1.00)
Media > Music (1.00)
Media > Film (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Meet the American who invented video games, Ralph Baer, a German Jew who fled Nazis, served US Army in WWII

FOX News, Sep-29-2023, 08:00:28 GMT

"Father of the Video Game" Ralph Baer escaped Jewish persecution in Nazi Germany as a teen and served in the U.S. Army in WWII. After coming of age in tough times, he felt driven to bring "more fun and whimsy" into the world. Ralph Baer's childhood was stolen by the Nazis. The German-born Jew gained a semblance of revenge overseas, imagining a new way for children of all ages to play. Ralph Baer invented video games.

baer, ralph baer, video game, (12 more...)

FOX News

Country:

Europe > Germany (0.29)
North America > United States > New Hampshire > Hillsborough County > Manchester (0.06)
North America > United States > New York (0.05)
(11 more...)

Genre: Personal (0.47)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology: Information Technology > Artificial Intelligence > Games (1.00)
