Goto

Collaborating Authors

 Personal


A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

arXiv.org Artificial Intelligence

The development of large language models (LLMs) such as ChatGPT has brought a lot of attention recently. However, their evaluation in the benchmark academic datasets remains under-explored due to the difficulty of evaluating the generative outputs produced by this model against the ground truth. In this paper, we aim to present a thorough evaluation of ChatGPT's performance on diverse academic datasets, covering tasks like question-answering, text summarization, code generation, commonsense reasoning, mathematical problem-solving, machine translation, bias detection, and ethical considerations. Specifically, we evaluate ChatGPT across 140 tasks and analyze 255K responses it generates in these datasets. This makes our work the largest evaluation of ChatGPT in NLP benchmarks. In short, our study aims to validate the strengths and weaknesses of ChatGPT in various tasks and provide insights for future research using LLMs. We also report a new emergent ability to follow multi-query instructions that we mostly found in ChatGPT and other instruction-tuned models. Our extensive evaluation shows that even though ChatGPT is capable of performing a wide variety of tasks, and may obtain impressive performance in several benchmark datasets, it is still far from achieving the ability to reliably solve many challenging tasks. By providing a thorough assessment of ChatGPT's performance across diverse NLP tasks, this paper sets the stage for a targeted deployment of ChatGPT-like LLMs in real-world applications.


Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation

arXiv.org Artificial Intelligence

Despite the huge progress in myriad generation tasks, pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts with maximization-based decoding algorithms for open-ended generation. We attribute their overestimation of token-level repetition probabilities to the learning bias: LMs capture simple repetitive patterns faster with the MLE loss. We propose self-contrastive training to penalize the output of a premature checkpoint of the same model when it incorrectly predicts repetition, which is shown to mitigate repetition effectively while maintaining fluency on two datasets. Furthermore, we find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops.


Scenario-Based Motion Planning with Bounded Probability of Collision

arXiv.org Artificial Intelligence

Robots will increasingly operate near humans that introduce uncertainties in the motion planning problem due to their complex nature. Typically, chance constraints are introduced in the planner to optimize performance while guaranteeing probabilistic safety. However, existing methods do not consider the actual probability of collision for the planned trajectory, but rather its marginalization, that is, the independent collision probabilities for each planning step and/or dynamic obstacle, resulting in conservative trajectories. To address this issue, we introduce a novel real-time capable method termed Safe Horizon MPC, that explicitly constrains the joint probability of collision with all obstacles over the duration of the motion plan. This is achieved by reformulating the chance-constrained planning problem using scenario optimization and predictive control. Our method is less conservative than state-of-the-art approaches, applicable to arbitrary probability distributions of the obstacles' trajectories, computationally tractable and scalable. We demonstrate our proposed approach using a mobile robot and an autonomous vehicle in an environment shared with humans.


A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics

arXiv.org Artificial Intelligence

In today's competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to make talent-related decisions in a quantitative manner. Indeed, the recent development of Big Data and Artificial Intelligence (AI) techniques have revolutionized human resource management. The availability of large-scale talent and management-related data provides unparalleled opportunities for business leaders to comprehend organizational behaviors and gain tangible knowledge from a data science perspective, which in turn delivers intelligence for real-time decision-making and effective talent management at work for their organizations. In the last decade, talent analytics has emerged as a promising field in applied data science for human resource management, garnering significant attention from AI communities and inspiring numerous research efforts. To this end, we present an up-to-date and comprehensive survey on AI technologies used for talent analytics in the field of human resource management. Specifically, we first provide the background knowledge of talent analytics and categorize various pertinent data. Subsequently, we offer a comprehensive taxonomy of relevant research efforts, categorized based on three distinct application-driven scenarios: talent management, organization management, and labor market analysis. In conclusion, we summarize the open challenges and potential prospects for future research directions in the domain of AI-driven talent analytics.


Why diversity and inclusion needs to be at the forefront of future AI

Robohub

Inรชs Hipรณlito is a highly accomplished researcher, recognized for her work in esteemed journals and contributions as a co-editor. She has received research awards including the prestigious Talent Grant from the University of Amsterdam in 2021. After her PhD, she held positions at the Berlin School of Mind and Brain and Humboldt-Universitรคt zu Berlin. Currently, she is a permanent lecturer of the philosophy of AI at Macquarie University, focusing on cognitive development and the interplay between augmented cognition (AI) and the sociocultural environment. Neurourbanism as a Novel Approach in Global Health,' funded by the Berlin University Alliance.


Finding differences in perspectives between designers and engineers to develop trustworthy AI for autonomous cars

arXiv.org Artificial Intelligence

In the context of designing and implementing ethical Artificial Intelligence (AI), varying perspectives exist regarding developing trustworthy AI for autonomous cars. This study sheds light on the differences in perspectives and provides recommendations to minimize such divergences. By exploring the diverse viewpoints, we identify key factors contributing to the differences and propose strategies to bridge the gaps. This study goes beyond the trolley problem to visualize the complex challenges of trustworthy and ethical AI. Three pillars of trustworthy AI have been defined: transparency, reliability, and safety. This research contributes to the field of trustworthy AI for autonomous cars, providing practical recommendations to enhance the development of AI systems that prioritize both technological advancement and ethical principles.


Stay on topic with Classifier-Free Guidance

arXiv.org Artificial Intelligence

Classifier-Free Guidance (CFG) [37] has recently emerged in text-to-image generation as a lightweight technique to encourage prompt-adherence in generations. In this work, we demonstrate that CFG can be used broadly as an inference-time technique in pure language modeling. We show that CFG (1) improves the performance of Pythia, GPT-2 and LLaMA-family models across an array of tasks: Q&A, reasoning, code generation, and machine translation, achieving SOTA on LAMBADA with LLaMA-7B over PaLM-540B; (2) brings improvements equivalent to a model with twice the parameter-count; (3) can stack alongside other inference-time methods like Chain-of-Thought and Self-Consistency, yielding further improvements in difficult tasks; (4) can be used to increase the faithfulness and coherence of assistants in challenging form-driven and content-driven prompts: in a human evaluation we show a 75% preference for GPT4All using CFG over baseline.


Joanne Pransky: Rest in Peace (1959-2023)

Robohub

It is with great sadness that I am sharing that Joanne Pransky, the World's First Robotic Psychariatrist, and who Isaac Asimov called the real Susan Calvin passed away recently. I had several delight conversations with her, including an interview and moderated panel. Joanne was a tireless advocate for robotics AND for women in robotics. She didn't have advanced degrees in robotics but had worked in the robotics industry and then in robot trade journals- she had quite the eye for finding really useful technology versus hype. Her enthusiasm and passion for constantly learning was an inspiration to me and I was privileged to know her as a friend.


Interdisciplinary Methods in Computational Creativity: How Human Variables Shape Human-Inspired AI Research

arXiv.org Artificial Intelligence

The word creativity originally described a concept from human psychology, but in the realm of computational creativity (CC), it has become much more. The question of what creativity means when it is part of a computational system might be considered core to CC. Pinning down the meaning of creativity, and concepts like it, becomes salient when researchers port concepts from human psychology to computation, a widespread practice extending beyond CC into artificial intelligence (AI). Yet, the human processes shaping human-inspired computational systems have been little investigated. In this paper, we question which human literatures (social sciences, psychology, neuroscience) enter AI scholarship and how they are translated at the port of entry. This study is based on 22 in-depth, semi-structured interviews, primarily with human-inspired AI researchers, half of whom focus on creativity as a major research area. This paper focuses on findings most relevant to CC. We suggest that which human literature enters AI bears greater scrutiny because ideas may become disconnected from context in their home discipline. Accordingly, we recommend that CC researchers document the decisions and context of their practices, particularly those practices formalizing human concepts for machines. Publishing reflexive commentary on human elements in CC and AI would provide a useful record and permit greater dialogue with other disciplines.


Cooperative Multi-Agent Deep Reinforcement Learning for Reliable and Energy-Efficient Mobile Access via Multi-UAV Control

arXiv.org Artificial Intelligence

This paper addresses a novel multi-agent deep reinforcement learning (MADRL)-based positioning algorithm for multiple unmanned aerial vehicles (UAVs) collaboration (i.e., UAVs work as mobile base stations). The primary objective of the proposed algorithm is to establish dependable mobile access networks for cellular vehicle-to-everything (C-V2X) communication, thereby facilitating the realization of high-quality intelligent transportation systems (ITS). The reliable mobile access services can be achieved in following two ways, i.e., i) energy-efficient UAV operation and ii) reliable wireless communication services. For energy-efficient UAV operation, the reward of our proposed MADRL algorithm contains the features for UAV energy consumption models in order to realize efficient operations. Furthermore, for reliable wireless communication services, the quality of service (QoS) requirements of individual users are considered as a part of rewards and 60GHz mmWave radio is used for mobile access. This paper considers the 60GHz mmWave access for utilizing the benefits of i) ultra-wide-bandwidth for multi-Gbps high-speed communications and ii) high-directional communications for spatial reuse that is obviously good for densely deployed users. Lastly, the comprehensive and data-intensive performance evaluation of the proposed MADRL-based algorithm for multi-UAV positioning is conducted in this paper. The results of these evaluations demonstrate that the proposed algorithm outperforms other existing algorithms.