Quantum Machine Learning Playground

Debus, Pascal, Issel, Sebastian, Tscharke, Kilian

arXiv.org Artificial Intelligence

This article introduces an innovative interactive visualization tool designed to demystify quantum machine learning (QML) algorithms. Our work is inspired by the success of classical machine learning visualization tools, such as TensorFlow Playground, and aims to bridge the gap in visualization resources specifically for the field of QML. The article includes a comprehensive overview of relevant visualization metaphors from both quantum computing and classical machine learning, the development of an algorithm visualization concept, and the design of a concrete implementation as an interactive web application. By combining common visualization metaphors for the so-called data re-uploading universal quantum classifier as a representative QML model, this article aims to lower the entry barrier to quantum computing and encourage further innovation in the field. The accompanying interactive application is a proposal for the first version of a quantum machine learning playground for learning and exploring QML models.
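For readers unfamiliar with the model named above, here is a minimal sketch of a single-qubit data re-uploading classifier in plain NumPy. It is not taken from the playground itself: the function names `rot` and `classify`, the Rz-Ry-Rz gate decomposition, and the 0.5 decision threshold are illustrative assumptions. The idea it shows is that each layer applies a rotation whose angles depend on both trainable parameters and the input, so the data is "re-uploaded" in every layer.

```python
import numpy as np


def rot(phi):
    """General single-qubit rotation Rz(phi[2]) @ Ry(phi[1]) @ Rz(phi[0]) (assumed decomposition)."""
    a, b, c = phi

    def rz(t):
        return np.array([[np.exp(-1j * t / 2), 0], [0, np.exp(1j * t / 2)]])

    def ry(t):
        return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                         [np.sin(t / 2), np.cos(t / 2)]])

    return rz(c) @ ry(b) @ rz(a)


def classify(x, thetas, weights):
    """Run a single-qubit data re-uploading circuit on input x.

    thetas and weights have shape (layers, 3); x has at most 3 features
    and is zero-padded to length 3. Returns (label, p0), where p0 is the
    probability of measuring |0> and label is 0 if p0 >= 0.5, else 1.
    """
    x3 = np.zeros(3)
    x3[:len(x)] = x
    state = np.array([1, 0], dtype=complex)  # start in |0>
    for theta, w in zip(np.asarray(thetas, dtype=float),
                        np.asarray(weights, dtype=float)):
        # The input enters the rotation angles again in every layer.
        state = rot(theta + w * x3) @ state
    p0 = float(abs(state[0]) ** 2)
    return (0 if p0 >= 0.5 else 1), p0
```

In practice the parameters `thetas` and `weights` would be trained by minimizing a loss over labeled points; the sketch covers only the forward pass.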


Prime the search: Using large language models for guiding geometric task and motion planning by warm-starting tree search

Lee, Dongryung, Joo, Sejune, Lee, Kimin, Kim, Beomjoon

arXiv.org Artificial Intelligence

The problem of relocating a set of objects to designated areas amidst movable obstacles can be framed as a Geometric Task and Motion Planning (G-TAMP) problem, a subclass of task and motion planning (TAMP). Traditional approaches to G-TAMP have relied either on domain-independent heuristics or on learning from planning experience to guide the search, both of which typically demand significant computational resources or data. In contrast, humans often use common sense to intuitively decide which objects to manipulate in G-TAMP problems. Inspired by this, we propose leveraging Large Language Models (LLMs), which have common sense knowledge acquired from internet-scale data, to guide task planning in G-TAMP problems. To enable LLMs to perform geometric reasoning, we design a predicate-based prompt that encodes geometric information derived from a motion planning algorithm. We then query the LLM to generate a task plan, which is used to search for a feasible set of continuous parameters. Since LLMs are prone to mistakes, instead of committing to the LLM's outputs, we extend Monte Carlo Tree Search (MCTS) to a hybrid action space and use the LLM to guide the search. Unlike the previous approach that calls an LLM at every node and incurs high computational costs, we use it to warm-start the MCTS with the nodes explored in completing the LLM's task plan. On six different G-TAMP problems, we show our method outperforms previous LLM planners and pure search algorithms. Code can be found at: https://github.com/iMSquared/prime-the-search


Preface to the Special Issue of the TAL Journal on Scholarly Document Processing

Boudin, Florian, Aizawa, Akiko

arXiv.org Artificial Intelligence

The rapid growth of scholarly literature makes it increasingly difficult for researchers to keep up with new knowledge. Automated tools are now more essential than ever to help navigate and interpret this vast body of information. Scientific papers pose unique difficulties, with their complex language, specialized terminology, and diverse formats, requiring advanced methods to extract reliable and actionable insights. Large language models (LLMs) offer new opportunities, enabling tasks such as literature reviews, writing assistance, and interactive exploration of research. This special issue of the TAL journal highlights research addressing these challenges and, more broadly, research on natural language processing and information retrieval for scholarly and scientific documents.


#AAAI2025 workshops round-up 3: Neural reasoning and mathematical discovery, and AI to accelerate science and engineering

AIHub

In this series of articles, we're publishing summaries with some of the key takeaways from a few of the workshops held at the 39th Annual AAAI Conference on Artificial Intelligence (AAAI 2025). Recent progress in Sphere Neural Networks demonstrates various possibilities for neural networks to achieve symbolic-level reasoning. The workshop on neural reasoning and mathematical discovery aimed to reconsider various problems and discuss workaround solutions in the two-way street commingling of neural networks and mathematics. The workshop on AI to accelerate science and engineering brought together researchers from artificial intelligence and diverse scientific domains to address new challenges towards accelerating scientific discovery and engineering design. This was the fourth iteration of that workshop, with the theme of AI for biological sciences following the previous three years' themes of AI for chemistry, earth sciences, and materials/manufacturing respectively.


Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat

Dayo, Joseph Emmanuel DL, Ogbinar, Michel Onasis S., Naval, Prospero C. Jr

arXiv.org Artificial Intelligence

The objective of this study is to design and implement a reinforcement learning (RL) environment using D&D 5E combat scenarios to challenge smaller RL agents through interaction with a robust adversarial agent controlled by advanced Large Language Models (LLMs) like GPT-4o and LLaMA 3 8B. This research employs Deep Q-Networks (DQN) for the smaller agents, creating a testbed for strategic AI development that also serves as an educational tool by simulating dynamic and unpredictable combat scenarios. We successfully integrated sophisticated language models into the RL framework, enhancing strategic decision-making processes. Our results indicate that while RL agents generally outperform LLM-controlled adversaries in standard metrics, the strategic depth provided by LLMs significantly enhances the overall AI capabilities in this complex, rule-based setting. The novelty of our approach and its implications for mastering intricate environments and developing adaptive strategies are discussed, alongside potential innovations in AI-driven interactive simulations. This paper aims to demonstrate how integrating LLMs can create more robust and adaptable AI systems, providing valuable insights for further research and educational applications.


AI for Just Work: Constructing Diverse Imaginations of AI beyond "Replacing Humans"

Jin, Weina, Vincent, Nicholas, Hamarneh, Ghassan

arXiv.org Artificial Intelligence

The AI community usually focuses on "how" to develop AI techniques, but lacks thorough open discussions on "why" we develop AI. Lacking critical reflections on the general visions and purposes of AI may make the community vulnerable to manipulation. In this position paper, we explore the "why" question of AI. We call answers to the "why" question the imaginations of AI, which depict our general visions, frames, and mindsets for the prospects of AI. We identify that the prevailing vision in the AI community is largely a monoculture that emphasizes objectives such as replacing humans and improving productivity. Our critical examination of this mainstream imagination highlights its underpinning and potentially unjust assumptions. We then call to diversify our collective imaginations of AI, embedding ethical assumptions from the outset in the imaginations of AI. To facilitate the community's pursuit of diverse imaginations, we demonstrate one process for constructing a new imagination of "AI for just work," and showcase its application in the medical image synthesis task to make it more ethical. We hope this work will help the AI community to open dialogues with civil society on the visions and purposes of AI, and inspire more technical work and advocacy in pursuit of diverse and ethical imaginations to restore the value of AI for the public good.


Natural Language Generation

Reiter, Ehud

arXiv.org Artificial Intelligence

This book provides a broad overview of Natural Language Generation (NLG), including technology, user requirements, evaluation, and real-world applications. The focus is on concepts and insights which hopefully will remain relevant for many years, not on the latest LLM innovations. It draws on decades of work by the author and others on NLG. The book has the following chapters: Introduction to NLG; Rule-Based NLG; Machine Learning and Neural NLG; Requirements; Evaluation; Safety, Maintenance, and Testing; and Applications. All chapters include examples and anecdotes from the author's personal experiences, and end with a Further Reading section. The book should be especially useful to people working on applied NLG, including NLG researchers, people in other fields who want to use NLG, and commercial developers. It will not, however, be useful to people who want to understand the latest LLM technology. There is a companion site with more information at https://ehudreiter.com/book/


Real-time Ship Recognition and Georeferencing for the Improvement of Maritime Situational Awareness

Perez, Borja Carrillo

arXiv.org Artificial Intelligence

In an era where maritime infrastructures are crucial, advanced situational awareness solutions are increasingly important. The use of optical camera systems can allow real-time usage of maritime footage. This thesis presents an investigation into leveraging deep learning and computer vision to advance real-time ship recognition and georeferencing for the improvement of maritime situational awareness. A novel dataset, ShipSG, is introduced, containing 3,505 images and 11,625 ship masks with corresponding class and geographic position. After an exploration of the state of the art, a custom real-time segmentation architecture, ScatYOLOv8+CBAM, is designed for the NVIDIA Jetson AGX Xavier embedded system. This architecture adds the 2D scattering transform and attention mechanisms to YOLOv8, achieving an mAP of 75.46% and 25.3 ms per frame, outperforming state-of-the-art methods by over 5%. To improve small and distant ship recognition in high-resolution images on embedded systems, an enhanced slicing mechanism is introduced, improving mAP by 8% to 11%. Additionally, a georeferencing method is proposed, achieving positioning errors of 18 m for ships up to 400 m away and 44 m for ships between 400 m and 1200 m. The findings are also applied in real-world scenarios, such as the detection of abnormal ship behaviour, camera integrity assessment and 3D reconstruction. The approach of this thesis outperforms existing methods and provides a framework for integrating recognized and georeferenced ships into real-time systems, enhancing operational effectiveness and decision-making for maritime stakeholders. This thesis contributes to the maritime computer vision field by establishing a benchmark for ship segmentation and georeferencing research, demonstrating the viability of deep-learning-based recognition and georeferencing methods for real-time maritime monitoring.
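The abstract does not describe how its slicing mechanism works, so the following is only a generic sketch of the underlying idea: cutting a high-resolution frame into overlapping tiles so that small, distant objects occupy more pixels per detector input. The names `slice_starts` and `tile_boxes`, the default tile size, and the overlap ratio are hypothetical and are not taken from the thesis.

```python
def slice_starts(length, tile, stride):
    """Start offsets of tiles along one axis, with the last tile flush to the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # ensure the far edge is covered
    return starts


def tile_boxes(width, height, tile=640, overlap=0.2):
    """Return (x0, y0, x1, y1) boxes of overlapping tiles that cover the whole image.

    tile is the tile side in pixels and overlap is the fraction of a tile
    shared with its neighbour; both values are illustrative defaults.
    """
    stride = max(1, int(tile * (1 - overlap)))
    boxes = []
    for y0 in slice_starts(height, tile, stride):
        for x0 in slice_starts(width, tile, stride):
            boxes.append((x0, y0, min(x0 + tile, width), min(y0 + tile, height)))
    return boxes
```

A detector would then be run on each tile, with the resulting detections shifted back by `(x0, y0)` and merged, for example with non-maximum suppression, to remove duplicates in the overlapping regions.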


Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Handy Appetizer

Peng, Benji, Pan, Xuanhe, Wen, Yizhu, Bi, Ziqian, Chen, Keyu, Li, Ming, Liu, Ming, Niu, Qian, Liu, Junyu, Wang, Jinlang, Zhang, Sen, Xu, Jiawei, Feng, Pohsun

arXiv.org Artificial Intelligence

This book explores the role of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) in driving the progress of big data analytics and management. The book focuses on simplifying the complex mathematical concepts behind deep learning, offering intuitive visualizations and practical case studies to help readers understand how neural networks and technologies like Convolutional Neural Networks (CNNs) work. It introduces several classic models and technologies such as Transformers, GPT, ResNet, BERT, and YOLO, highlighting their applications in fields like natural language processing, image recognition, and autonomous driving. The book also emphasizes the importance of pre-trained models and how they can enhance model performance and accuracy, with instructions on how to apply these models in various real-world scenarios. Additionally, it provides an overview of key big data management technologies like SQL and NoSQL databases, as well as distributed computing frameworks such as Apache Hadoop and Spark, explaining their importance in managing and processing vast amounts of data. Ultimately, the book underscores the value of mastering deep learning and big data management skills as critical tools for the future workforce, making it an essential resource for both beginners and experienced professionals.