AI for Handball: predicting and explaining the 2024 Olympic Games tournament with Deep Learning and Large Language Models
Over summer 2024, the world will be looking at Paris to encourage their favorite athletes to win the Olympic gold medal. In handball, a few nations will fight hard to win the precious metal, with speculation favoring France or Denmark for the men and France or Norway for the women. However, no scientific method has so far been proposed to predict the final results of the competition. In this work, we leverage a deep learning model to predict the results of the handball tournament of the 2024 Olympic Games. This model, coupled with explainable AI (xAI) techniques, allows us to extract insightful information about the main factors influencing the outcome of each match. Notably, xAI helps sports experts understand how factors like match information or individual athlete performance contribute to the predictions. Furthermore, we integrate Large Language Models (LLMs) to generate human-friendly explanations that highlight the most important factors impacting the match results. By providing human-centric explanations, our approach offers a deeper understanding of the AI predictions, making them more actionable for coaches and analysts.
#RoboCup2024 – daily digest: 20 July
This is the second of our daily digests from RoboCup2024 in Eindhoven, The Netherlands. If you missed the first digest, which gives some background to RoboCup, you can find it here. Competitions continued across all the leagues today, with participants vying for a place in Sunday's finals. The RoboCup@Work league focusses on robots in work-related scenarios, utilizing ideas and concepts from other RoboCup competitions to tackle open research challenges in industrial and service robotics. I arrived at the arena in time to catch the advanced navigation test.
'Google says I'm a dead physicist': is the world's biggest search engine broken?
I didn't know I was dead until I saw it on Google. When I searched my name, there it was: a picture of my smiling face next to the text "Tom Faber was a physicist and publisher, and he was a university lecturer at Cambridge for 35 years". Apparently I died on 27 July 2004, aged 77. This was news to me. The problem was the picture. When you search the name of a notable person, Google may create what it calls a "knowledge panel", a little box with basic information taken from Wikipedia. Somewhere along the way, the algorithm had confused pictures of my face with the biography of another man who shared my name. According to his obituary, he was "a distinguished physicist with a literary hinterland". Google provides a feedback form to resolve this type of bug. I filled it in several times, but it made no difference.
PersLLM: A Personified Training Approach for Large Language Models
Zeng, Zheni, Chen, Jiayi, Chen, Huimin, Yan, Yukun, Chen, Yuxuan, Liu, Zhiyuan, Sun, Maosong
Large language models exhibit aspects of human-level intelligence that catalyze their application as human-like agents in domains such as social simulations, human-machine interactions, and collaborative multi-agent systems. However, the absence of distinct personalities, which shows up as ingratiating behaviors, inconsistent opinions, and uniform response patterns, diminishes LLMs' utility in practical applications. Addressing this, the development of personality traits in LLMs emerges as a crucial area of research to unlock their latent potential. Existing methods to personify LLMs generally involve strategies like employing stylized training data for instruction tuning or using prompt engineering to simulate different personalities. These methods only capture superficial linguistic styles instead of the core of personalities and are therefore not stable. In this study, we propose PersLLM, integrating psychology-grounded principles of personality (social practice, consistency, and dynamic development) into a comprehensive training methodology. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality. Single-agent evaluation validates our method's superiority, as it produces responses more aligned with reference personalities compared to other approaches. Case studies for multi-agent communication highlight its benefits in enhancing opinion consistency within individual agents and fostering collaborative creativity among multiple agents in dialogue contexts, potentially benefiting human simulation and multi-agent cooperation. Additionally, human-agent interaction evaluations indicate that our personified models significantly enhance interactive experiences, underscoring the practical implications of our research.
Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II
Wu, Rixin, Wang, Ran, Hao, Jie, Wu, Qiang, Wang, Ping, Niyato, Dusit
This paper proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW), aiming to use a single deep reinforcement learning (DRL) model to solve the entire multiobjective optimization problem. The Non-dominated sorting genetic algorithm-II (NSGA-II) method is then employed to optimize the outcomes produced by the WADRL, thereby mitigating the limitations of both approaches. Firstly, we design an MOVRPTW model to balance the minimization of travel cost and the maximization of customer satisfaction. Subsequently, we present a novel DRL framework that incorporates a transformer-based policy network. This network is composed of an encoder module, a weight embedding module where the weights of the objective functions are incorporated, and a decoder module. NSGA-II is then utilized to optimize the solutions generated by WADRL. Finally, extensive experimental results demonstrate that our method outperforms the existing and traditional methods. Due to the numerous constraints in VRPTW, generating initial solutions of the NSGA-II algorithm can be time-consuming. However, using solutions generated by the WADRL as initial solutions for NSGA-II significantly reduces the time required for generating initial solutions. Meanwhile, the NSGA-II algorithm can enhance the quality of solutions generated by WADRL, resulting in solutions with better scalability. Notably, the weight-aware strategy significantly reduces the training time of DRL while achieving better results, enabling a single DRL model to solve the entire multiobjective optimization problem.
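As a rough illustration of the two ideas this abstract combines, the sketch below scalarizes the two objectives with a weight pair and then keeps only the non-dominated solutions, which is the basic filtering notion behind NSGA-II-style methods. It is not the paper's WADRL model or a full NSGA-II; the function names (`weighted_score`, `dominates`, `pareto_front`) and the sample numbers are hypothetical.

```python
from typing import List, Tuple

# Each candidate route plan is summarized as (travel_cost, satisfaction).
# Travel cost is minimized; customer satisfaction is maximized.
Solution = Tuple[float, float]

def weighted_score(sol: Solution, w_cost: float, w_sat: float) -> float:
    """Scalarize both objectives into one value (higher is better)."""
    cost, sat = sol
    return -w_cost * cost + w_sat * sat

def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o != s)]

# Invented sample data, for illustration only.
candidates = [(100.0, 0.9), (80.0, 0.7), (120.0, 0.8), (80.0, 0.6)]
front = pareto_front(candidates)
best_for_cost_focus = max(candidates, key=lambda s: weighted_score(s, 0.9, 0.1))
```

Changing the weight pair passed to `weighted_score` moves the preferred solution along the front, which is loosely the role the weight embedding plays in the described policy network.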
Building Intelligence Identification System via Large Language Model Watermarking: A Survey and Beyond
Wang, Xuhong, Jiang, Haoyu, Yu, Yi, Yu, Jingru, Lin, Yilun, Yi, Ping, Wang, Yingchun, Yu, Qiao, Li, Li, Wang, Fei-Yue
Large Language Models (LLMs) are increasingly integrated into diverse industries, posing substantial security risks due to unauthorized replication and misuse. To mitigate these concerns, robust identification mechanisms are widely acknowledged as an effective strategy. Identification systems for LLMs now rely heavily on watermarking technology to manage and protect intellectual property and ensure data security. However, previous studies have primarily concentrated on the basic principles of algorithms and lacked a comprehensive analysis of watermarking theory and practice from the perspective of intelligent identification. To bridge this gap, firstly, we explore how a robust identity recognition system can be effectively implemented and managed within LLMs by various participants using watermarking technology. Secondly, we propose a mathematical framework based on mutual information theory, which systematizes the identification process to achieve more precise and customized watermarking. Additionally, we present a comprehensive evaluation of performance metrics for LLM watermarking, reflecting participant preferences and advancing discussions on its identification applications. Lastly, we outline the existing challenges in current watermarking technologies and theoretical frameworks, and provide directional guidance to address these challenges. Our systematic classification and detailed exposition aim to enhance the comparison and evaluation of various methods, fostering further research and development toward a transparent, secure, and equitable LLM ecosystem.
Tackling Challenges in Implementing Large-Scale Graph Databases
Graph databases (GDBs) [13,30] have gained momentum with the rise of large unstructured repositories of information that emphasize relations between entities. Dozens of GDB management systems [8,22,25,31], prototypes [1,2,15,21], models and languages [3,10,12,14], large knowledge graphs like Wikidata [33], and efforts from companies like Apache, Facebook, Google, Microsoft, Neo4j, and Oracle illustrate the growing interest in this technology. While the expressive power and flexibility of their data model and query languages are the key to their success, the efficiency challenges posed by their implementation are the main obstacle to the wider adoption of GDBs. Latin America has a long-standing tradition in fundamental research areas like database theory, string processing, information retrieval, and the design and analysis of algorithms and data structures, all of which are relevant for the development of GDBs. In the last few years, several researchers in Chile started collaborating on algorithms and systems for evaluating complex queries on large-scale GDBs.
SENTINEL: Securing Indoor Localization against Adversarial Attacks with Capsule Neural Networks
Gufran, Danish, Anandathirtha, Pooja, Pasricha, Sudeep
With the increasing demand for edge device powered location-based services in indoor environments, Wi-Fi received signal strength (RSS) fingerprinting has become popular, given the unavailability of GPS indoors. However, achieving robust and efficient indoor localization faces several challenges, due to RSS fluctuations from dynamic changes in indoor environments and heterogeneity of edge devices, leading to diminished localization accuracy. While advances in machine learning (ML) have shown promise in mitigating these phenomena, it remains an open problem. Additionally, emerging threats from adversarial attacks on ML-enhanced indoor localization systems, especially those introduced by malicious or rogue access points (APs), can deceive ML models to further increase localization errors. To address these challenges, we present SENTINEL, a novel embedded ML framework utilizing modified capsule neural networks to bolster the resilience of indoor localization solutions against adversarial attacks, device heterogeneity, and dynamic RSS fluctuations. We also introduce RSSRogueLoc, a novel dataset capturing the effects of rogue APs from several real-world indoor environments. Experimental evaluations demonstrate that SENTINEL achieves significant improvements, with up to 3.5x reduction in mean error and 3.4x reduction in worst-case error compared to state-of-the-art frameworks using simulated adversarial attacks. SENTINEL also achieves improvements of up to 2.8x in mean error and 2.7x in worst-case error compared to state-of-the-art frameworks when evaluated with the real-world RSSRogueLoc dataset.
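In its simplest form, Wi-Fi RSS fingerprinting matches a measured signal-strength vector against vectors recorded at known positions. The sketch below shows a plain nearest-neighbor matcher, not SENTINEL's capsule network, and mimics a rogue access point by overwriting one reading. The fingerprint values, location labels, and the `locate` function are invented for illustration.

```python
import math
from typing import Dict, Sequence

# Hypothetical fingerprint database: location label -> RSS (dBm) per access point.
FINGERPRINTS: Dict[str, Sequence[float]] = {
    "room_a": (-40.0, -70.0, -85.0),
    "room_b": (-75.0, -45.0, -80.0),
    "hallway": (-60.0, -60.0, -60.0),
}

def locate(rss: Sequence[float], database: Dict[str, Sequence[float]]) -> str:
    """Return the label whose stored fingerprint is closest in Euclidean distance."""
    return min(database, key=lambda label: math.dist(rss, database[label]))

# A measurement taken near room_a.
clean = (-42.0, -68.0, -83.0)
# The same measurement with the second AP's reading replaced by a strong
# spoofed signal, as a rogue AP might cause.
spoofed = (-42.0, -30.0, -83.0)

clean_estimate = locate(clean, FINGERPRINTS)
spoofed_estimate = locate(spoofed, FINGERPRINTS)
```

With these made-up numbers a single manipulated reading is enough to move the estimate to a different room, which is the kind of error an adversarially robust model aims to limit.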
Interview with Sherry Yang: Learning interactive real-world simulators
Sherry Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans and Pieter Abbeel won an outstanding paper award at ICLR2024 for their work Learning Interactive Real-World Simulators. In the paper, they introduce a universal simulator (called UniSim) which takes image and text input to train a robot simulator. We spoke to Sherry about this work, some of the challenges, and potential applications. There are two components – there is the universal component and then there is a simulator component. Looking at the simulator component first – typically when people build a simulator, they do this based on an understanding of the real world, using physics equations. Researchers will build a simulator to study how things work, such as how cars move, for example.
On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments
Fang, Jingchao, Arechiga, Nikos, Namaoshi, Keiichi, Bravo, Nayeli, Hogan, Candice, Shamma, David A.
The Wizard of Oz (WoZ) method is a widely adopted research approach where a human Wizard "role-plays" a not readily available technology and interacts with participants to elicit user behaviors and probe the design space. With the growing ability for modern large language models (LLMs) to role-play, one can apply LLMs as Wizards in WoZ experiments with better scalability and lower cost than the traditional approach. However, methodological guidance on responsibly applying LLMs in WoZ experiments and a systematic evaluation of LLMs' role-playing ability are lacking. Through two LLM-powered WoZ studies, we take the first step towards identifying an experiment lifecycle for researchers to safely integrate LLMs into WoZ experiments and interpret data generated from settings that involve Wizards role-played by LLMs. We also contribute a heuristic-based evaluation framework that allows the estimation of LLMs' role-playing ability in WoZ experiments and reveals LLMs' behavior patterns at scale.
Figure 1: An overview of our proposed experiment lifecycle compared to traditional Wizard of Oz experiments. We ask GPT-4 empowered agents to play the role of "Wizards" in conversation-based Wizard of Oz experiments. The agents talk to either Simulacrums powered by GPT-4 (in Study 1) or