jerry
- Asia > Singapore (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > Canada > British Columbia > East Kootenay Region > Fernie (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Language Models Do Not Follow Occam's Razor: A Benchmark for Inductive and Abductive Reasoning
Sun, Yunxin, Saparov, Abulhair
Reasoning is a core capability in artificial intelligence systems, for which large language models (LLMs) have recently shown remarkable progress. However, most work focuses exclusively on deductive reasoning, which is problematic because other types of reasoning are also essential for solving real-world problems and remain less explored. This work focuses on evaluating LLMs' inductive and abductive reasoning capabilities. We introduce a programmable and synthetic dataset, InAbHyD (pronounced in-a-bid), where each reasoning example consists of an incomplete world model and a set of observations. To solve each reasoning example, the intelligent agent must produce hypotheses that explain the observations under the incomplete world model. We propose a new metric to evaluate the quality of hypotheses based on Occam's Razor. We evaluate and analyze some state-of-the-art LLMs. Our analysis shows that LLMs can perform inductive and abductive reasoning in simple scenarios, but struggle with complex world models and with producing high-quality hypotheses, even with popular reasoning-enhancing techniques such as in-context learning and RLVR.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Interview with Jerry Tan: Service robot development for education
At the International Joint Conference on Artificial Intelligence (IJCAI) 2023, I had the opportunity to interview Jerry Tan from Lattel Robotics, a company dedicated to promoting AI-focused robotics education and training. They work closely with the RoboCup@Home Education initiative, supporting schools and institutions in introducing AI and service robot development to students. Their goal is to equip learners with practical AI application skills in computer vision, autonomous navigation, object manipulation and speech interaction. Through their AI robotics and AI applications workshops, Lattel Robotics offers an introduction to Robot Operating System (ROS)-based AI application development in service robotics. As a hardware partner for the RoboCup@Home Education initiative, they assist schools and institutions in competing in AI robotics challenges by developing applications that address real-world problems.
- North America (0.05)
- Europe (0.05)
- Africa (0.05)
TOM: A Development Platform For Wearable Intelligent Assistants
Janaka, Nuwan, Zhao, Shengdong, Hsu, David, Wen, Sherisse Tan Jing, Keat, Koh Chun
Advanced digital assistants can significantly enhance task performance, reduce user burden, and provide personalized guidance to improve users' abilities. However, the development of such intelligent digital assistants presents a formidable challenge. To address this, we introduce TOM, a conceptual architecture and software platform (https://github.com/TOM-Platform) designed to support the development of intelligent wearable assistants that are contextually aware of both the user and the environment. This system was developed collaboratively with AR/MR researchers, HCI researchers, AI/Robotic researchers, and software developers, and it continues to evolve to meet the diverse requirements of these stakeholders. TOM facilitates the creation of intelligent assistive AR applications for daily activities and supports the recording and analysis of user interactions, integration of new devices, and the provision of assistance for various activities. Additionally, we showcase several proof-of-concept assistive services and discuss the challenges involved in developing such services.
- Asia > Singapore > Central Region > Singapore (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Hong Kong (0.04)
- Health & Medicine > Consumer Health (0.47)
- Information Technology > Security & Privacy (0.46)
LOGIC-LM++: Multi-Step Refinement for Symbolic Formulations
Kirtania, Shashank, Gupta, Priyanshu, Radhakrishna, Arjun
In this paper we examine the limitations of Large Language Models (LLMs) for complex reasoning tasks. Although recent works have started to employ formal languages as an intermediate representation for reasoning tasks, they often face challenges in accurately generating and refining these formal specifications to ensure correctness. To address these issues, this paper proposes Logic-LM++, an improvement on Logic-LM. It uses the ability of LLMs to do pairwise comparisons, allowing the evaluation of the refinements suggested by the LLM. The paper demonstrates that Logic-LM++ outperforms Logic-LM and other contemporary techniques across natural language reasoning tasks on three datasets, FOLIO, ProofWriter and AR-LSAT, with an average improvement of 18.5% on standard prompting, 12.3% on chain of thought prompting and 5% on Logic-LM.
CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes
Parmar, Paritosh, Peh, Eric, Chen, Ruirui, Lam, Ting En, Chen, Yuhan, Tan, Elston, Fernando, Basura
Causal video question answering (QA) has garnered increasing interest, yet existing datasets often lack depth in causal reasoning. To address this gap, we capitalize on the unique properties of cartoons and construct CausalChaos!, a novel, challenging causal Why-QA dataset built upon the iconic "Tom and Jerry" cartoon series. Cartoons use the principles of animation that allow animators to create expressive, unambiguous causal relationships between events to form a coherent storyline. Utilizing these properties, along with thought-provoking questions and multi-level answers (answer and detailed causal explanation), our questions involve causal chains that interconnect multiple dynamic interactions between characters and visual scenes. These factors require models to resolve more challenging, yet well-defined, causal relationships. We also introduce hard incorrect answer mining, including a causally confusing version that is even more challenging. While models perform well, there is much room for improvement, especially on open-ended answers. We identify more advanced/explicit causal relationship modeling and joint modeling of vision and language as the immediate areas for future efforts to focus upon. Along with the other complementary datasets, our new challenging dataset will pave the way for these developments in the field.
- Asia > Singapore (0.05)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > Canada > British Columbia > East Kootenay Region > Fernie (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Mastering Asymmetrical Multiplayer Game with Multi-Agent Asymmetric-Evolution Reinforcement Learning
Sun, Chenglu, Zhang, Yichi, Zhang, Yu, Lu, Ziling, Liu, Jingbin, Xu, Sijia, Zhang, Weidong
The asymmetrical multiplayer (AMP) game is a popular game genre in which multiple types of agents compete or collaborate with each other. It is difficult to train powerful agents that can defeat top human players in AMP games with typical self-play training methods because of the unbalanced characteristics of their asymmetrical environments. We propose asymmetric-evolution training (AET), a novel multi-agent reinforcement learning framework that can train multiple kinds of agents simultaneously in an AMP game. We designed adaptive data adjustment (ADA) and environment randomization (ER) to optimize the AET process. We tested our method in a complex AMP game named Tom & Jerry, and our AIs, trained without using any human data, can achieve a win rate of 98.5% against top human players over 65 matches. The ablation experiments indicated that the proposed modules are beneficial to the framework.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.85)
The Hacking of ChatGPT Is Just Getting Started
It took Alex Polyakov just a couple of hours to break GPT-4. When OpenAI released the latest version of its text-generating chatbot in March, Polyakov sat down in front of his keyboard and started entering prompts designed to bypass OpenAI's safety systems. Soon, the CEO of security firm Adversa AI had GPT-4 spouting homophobic statements, creating phishing emails, and supporting violence. Polyakov is one of a small number of security researchers, technologists, and computer scientists developing jailbreaks and prompt injection attacks against ChatGPT and other generative AI systems. The process of jailbreaking aims to design prompts that make the chatbots bypass rules around producing hateful content or writing about illegal acts, while closely related prompt injection attacks can quietly insert malicious data or instructions into AI models.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
ChatGPT vs. Bing vs. Bard: Which AI is best?
ChatGPT, Bing Chat, and Bard promise to transform your life using the power of artificial intelligence, through AI conversations that can inform, amuse, and educate you--just like a human being. But how good are these new AI chatbots, really? We tested them to find out. We asked all three AIs a variety of different questions: some that expanded upon general search topics, some that demanded an opinion, logic puzzles, even code--and then asked them to be more creative, such as by writing an alternate, better ending to Game of Thrones and a Seinfeld scene with a special guest. We've included all of their answers, or as much of them as we could provide, and we'll let you decide for yourself.
- Energy (0.50)
- Media > Television (0.35)
- Transportation > Ground > Road (0.31)