Towards a cognitive architecture to enable natural language interaction in co-constructive task learning

arXiv.org Artificial Intelligence

This research addresses the question of which characteristics a cognitive architecture must have to leverage the benefits of natural language in Co-Constructive Task Learning (CCTL). To provide context, we first discuss Interactive Task Learning (ITL), the mechanisms of the human memory system, and the significance of natural language and multi-modality. Next, we examine the current state of cognitive architectures, analyzing their capabilities to inform a concept of CCTL grounded in multiple sources. We then integrate insights from various research domains to develop a unified framework. Finally, we conclude by identifying the remaining challenges and requirements necessary to achieve CCTL in Human-Robot Interaction (HRI).


Artificial Conversations, Real Results: Fostering Language Detection with Synthetic Data

arXiv.org Artificial Intelligence

Collecting high-quality training data is essential for fine-tuning Large Language Models (LLMs). However, acquiring such data is often costly and time-consuming, especially for non-English languages such as Italian. Recently, researchers have begun to explore the use of LLMs to generate synthetic datasets as a viable alternative. This study proposes a pipeline for generating synthetic data and a comprehensive approach for investigating the factors that influence the validity of synthetic data generated by LLMs by examining how model performance is affected by metrics such as prompt strategy, text length and target position in a specific task, i.e. inclusive language detection in Italian job advertisements. Our results show that, in most cases and across different metrics, the fine-tuned models trained on synthetic data consistently outperformed other models on both real and synthetic test datasets.


Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems

arXiv.org Artificial Intelligence

Large language model-based agents are increasingly used in recommender systems (Agent4RSs) to achieve personalized behavior modeling. Specifically, Agent4RSs introduce memory mechanisms that enable the agents to autonomously learn and self-evolve from real-world interactions. However, to the best of our knowledge, the robustness of Agent4RSs remains unexplored. In this paper, we therefore propose the first work to attack Agent4RSs by perturbing agents' memories, not only to uncover their limitations but also to enhance their security and robustness, ensuring the development of safer and more reliable AI agents. Given security and privacy concerns, it is more practical to launch attacks under a black-box setting, where accurate knowledge of the victim models cannot be easily obtained. Moreover, practical attacks are often stealthy to maximize their impact. To this end, we propose a novel practical attack framework named DrunkAgent. DrunkAgent consists of a generation module, a strategy module, and a surrogate module. The generation module aims to produce effective and coherent adversarial textual triggers, which can be used to achieve attack objectives such as promoting the target items. The strategy module is designed to 'get the target agents drunk' so that their memories cannot be effectively updated during the interaction process, which lets the triggers take full effect. Both modules are optimized on the surrogate module to improve the transferability and imperceptibility of the attacks. By identifying and analyzing the vulnerabilities, our work provides critical insights that pave the way for building safer and more resilient Agent4RSs. Extensive experiments across various real-world datasets demonstrate the effectiveness of DrunkAgent.


THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models

arXiv.org Artificial Intelligence

On-device deep learning (DL) has rapidly gained adoption in mobile apps, offering the benefits of offline model inference and user privacy preservation over cloud-based approaches. However, it inevitably stores models on user devices, introducing new vulnerabilities, particularly model-stealing attacks and intellectual property infringement. While system-level protections like Trusted Execution Environments (TEEs) provide a robust solution, practical challenges remain in achieving scalable on-device DL model protection, including complexities in supporting third-party models and limited adoption in current mobile solutions. Advancements in TEE-enabled hardware, such as NVIDIA's GPU-based TEEs, may address these obstacles in the future. Currently, watermarking serves as a common defense against model theft but also faces challenges here as many mobile app developers lack corresponding machine learning expertise and the inherent read-only and inference-only nature of on-device DL models prevents third parties like app stores from implementing existing watermarking techniques in post-deployment models. To protect the intellectual property of on-device DL models, in this paper, we propose THEMIS, an automatic tool that lifts the read-only restriction of on-device DL models by reconstructing their writable counterparts and leverages the untrainable nature of on-device DL models to solve watermark parameters and protect the model owner's intellectual property. Extensive experimental results across various datasets and model structures show the superiority of THEMIS in terms of different metrics. Further, an empirical investigation of 403 real-world DL mobile apps from Google Play is performed with a success rate of 81.14%, showing the practicality of THEMIS.


Is LLM the Silver Bullet to Low-Resource Languages Machine Translation?

arXiv.org Artificial Intelligence

Low-Resource Languages (LRLs) present significant challenges in natural language processing due to their limited linguistic resources and underrepresentation in standard datasets. While recent advancements in Large Language Models (LLMs) and Neural Machine Translation (NMT) have substantially improved translation capabilities for high-resource languages, performance disparities persist for LRLs, particularly impacting privacy-sensitive and resource-constrained scenarios. This paper systematically evaluates the limitations of current LLMs across 200 languages using benchmarks such as FLORES-200. We also explore alternative data sources, including news articles and bilingual dictionaries, and demonstrate how knowledge distillation from large pre-trained models can significantly improve smaller LRL translations. Additionally, we investigate various fine-tuning strategies, revealing that incremental enhancements markedly reduce performance gaps on smaller LLMs.


Pay More Attention to the Robustness of Prompt for Instruction Data Mining

arXiv.org Artificial Intelligence

Instruction tuning has emerged as a paramount method for tailoring the behaviors of LLMs. Recent work has unveiled the potential for LLMs to achieve high performance through fine-tuning with a limited quantity of high-quality instruction data. Building upon this approach, we further explore the impact of prompt robustness on the selection of high-quality instruction data. This paper proposes a pioneering framework of high-quality online instruction data mining for instruction tuning, focusing on the impact of prompt robustness on the data mining process. Our notable innovation is to generate adversarial instruction data by attacking the prompts of online instruction data. We then introduce an Adversarial Instruction-Following Difficulty metric to measure how much the adversarial instruction data helps in generating the corresponding response. In addition, we propose a novel Adversarial Instruction Output Embedding Consistency approach to select high-quality online instruction data. We conduct extensive experiments on two benchmark datasets to assess performance. The experimental results underscore the effectiveness of our two proposed methods and the critical practical significance of considering prompt robustness.


Crossing Boundaries: Leveraging Semantic Divergences to Explore Cultural Novelty in Cooking Recipes

arXiv.org Artificial Intelligence

Novelty modeling and detection is a core topic in Natural Language Processing (NLP), central to numerous tasks such as recommender systems and automatic summarization. It involves identifying pieces of text that deviate in some way from previously known information. However, novelty is also a crucial determinant of the unique perception of relevance and quality of an experience, as it rests upon each individual's understanding of the world. Social factors, particularly cultural background, profoundly influence perceptions of novelty and innovation. Cultural novelty arises from differences in salience and novelty as shaped by the distance between distinct communities. While cultural diversity has garnered increasing attention in artificial intelligence (AI), the lack of robust metrics for quantifying cultural novelty hinders a deeper understanding of these divergences. This gap limits quantifying and understanding cultural differences within computational frameworks. To address this, we propose an interdisciplinary framework that integrates knowledge from sociology and management. Central to our approach is GlobalFusion, a novel dataset comprising 500 dishes and approximately 100,000 cooking recipes capturing cultural adaptation from over 150 countries. By introducing a set of Jensen-Shannon Divergence metrics for novelty, we leverage this dataset to analyze textual divergences when recipes from one community are modified by another with a different cultural background. The results reveal significant correlations between our cultural novelty metrics and established cultural measures based on linguistic, religious, and geographical distances. Our findings highlight the potential of our framework to advance the understanding and measurement of cultural diversity in AI.
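As an illustration of the kind of divergence measure the abstract names, the following minimal sketch computes the Jensen-Shannon Divergence between the word distributions of two recipe texts. It is not the authors' metric set: the whitespace tokenization, the two example recipes, and the function names (`word_distribution`, `jensen_shannon_divergence`) are assumptions made only for this example.

```python
import math
from collections import Counter


def word_distribution(text, vocab):
    """Relative frequency of each vocabulary word in `text` (hypothetical helper)."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab)
    if total == 0:
        raise ValueError("text shares no words with the vocabulary")
    return [counts[w] / total for w in vocab]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence in [0, 1] when using log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Made-up example texts: an original recipe and a culturally adapted variant.
original = "tomato basil pasta olive oil"
adapted = "tomato chili pasta soy sauce"

vocab = sorted(set(original.split()) | set(adapted.split()))
p = word_distribution(original, vocab)
q = word_distribution(adapted, vocab)
novelty = jensen_shannon_divergence(p, q)
```

A value of 0 means the two texts use words in identical proportions, and a value of 1 means they share no words at all, so larger values would indicate a stronger textual divergence between the two versions.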


In pictures: Prayers and reflection mark Eid celebrations around the world

BBC News

Muslims around the world have begun celebrating Eid al-Fitr, one of the biggest celebrations in the Islamic calendar. Eid al-Fitr - which means "festival of the breaking of the fast" - is celebrated at the end of Ramadan, a month of fasting for many adults, as well as spiritual reflection and prayer.

Here in Moscow, worshippers are seen preparing for prayer. (Reuters)
Hundreds took part in prayers at Tononoka grounds, in Mombasa, Kenya. (Reuters)
Prayers were also observed at a stadium in Port Sudan in the east of the country. (Getty Images)
Little children joined adults at the Moskee Essalam in Rotterdam, Netherlands. (Getty Images)
Gifts are handed out to Muslim children in Lviv, Ukraine, as Russia's war on the country continues. (Getty Images)
Palestinians in Jabaliya in the northern Gaza Strip pray amidst the rubble of a mosque destroyed in the current war between Israel and Hamas. (Reuters)
Families gather at al-Aqsa mosque in Jerusalem - the third holiest site in Islam. (Getty Images)
A boy yawns during prayers at a stadium in Qatar. (Reuters)
Muslims greet each other at Martim Moniz Square in Lisbon, Portugal. (EPA)
Women worshippers gather in Burgess Park, London, for an outdoor prayer. (Getty Images)
There were also worshippers gathered outside Plebiscito Square in Naples, Italy. (EPA)
Some women took pictures after attending prayers at the Hagia Sophia Grand Mosque in Istanbul, Turkey. (Reuters)
Afghan refugees pray at a mosque on the outskirts of Peshawar, Pakistan. (Getty Images)


How and why parents and teachers are introducing young children to AI

The Guardian

Since the release of ChatGPT in late 2022, generative artificial intelligence has trickled down from adults in their offices to university students in campus libraries to teenagers in high school hallways. Now it's reaching the youngest among us, and parents and teachers are grappling with the most responsible way to introduce their under-13s to a new technology that may fundamentally reshape the future. Though the terms of service for ChatGPT, Google's Gemini and other AI models specify that the tools are only meant for those over 13, parents and teachers are taking the matter of AI education into their own hands. Inspired by a story we published on parents who are teaching their children to use AI to set them up for success in school and at work, we asked Guardian readers how and why – or why not – others are doing the same. Though our original story only concerned parents, we have also included teachers in the responses published below, as preparing children for future studies and jobs is one of educators' responsibilities as well.


Agent-Centric Personalized Multiple Clustering with Multi-Modal LLMs

arXiv.org Artificial Intelligence

Personalized multiple clustering aims to generate diverse partitions of a dataset based on different user-specific aspects, rather than a single clustering. It has recently drawn research interest for accommodating varying user preferences. Recent approaches primarily use CLIP embeddings with proxy learning to extract representations biased toward user clustering preferences. However, CLIP primarily focuses on coarse image-text alignment, lacking a deep contextual understanding of user interests. To overcome these limitations, we propose an agent-centric personalized clustering framework that leverages multi-modal large language models (MLLMs) as agents to comprehensively traverse a relational graph to search for clusters based on user interests. Due to the advanced reasoning mechanism of MLLMs, the obtained clusters align more closely with user-defined criteria than those obtained from CLIP-based representations. To reduce computational overhead, we shorten the agents' traversal path by constructing a relational graph using user-interest-biased embeddings extracted by MLLMs. A large number of weakly connected edges can be filtered out based on embedding similarity, facilitating an efficient traversal search for agents. Experimental results show that the proposed method achieves NMI scores of 0.9667 and 0.9481 on the Card Order and Card Suits benchmarks, respectively, improving on the state-of-the-art model by over 140%.
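The edge-filtering idea mentioned in this abstract can be sketched roughly as follows. This is not the paper's implementation: the cosine similarity measure, the threshold value, the toy two-dimensional embeddings, and the function name `build_filtered_graph` are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_filtered_graph(embeddings, threshold):
    """Keep an undirected edge only if the pair's similarity reaches `threshold`.

    `embeddings` maps an item name to its vector; weakly similar pairs are
    dropped so that a later traversal has fewer edges to follow.
    """
    graph = {name: set() for name in embeddings}
    for u, v in combinations(embeddings, 2):
        if cosine_similarity(embeddings[u], embeddings[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Toy embeddings (hypothetical values, not taken from any benchmark).
embeddings = {
    "card_a": [1.0, 0.0],
    "card_b": [0.9, 0.1],
    "card_c": [0.0, 1.0],
}
graph = build_filtered_graph(embeddings, threshold=0.8)
```

With these toy vectors only `card_a` and `card_b` remain connected, so a traversal starting from either of them never visits `card_c`. A higher threshold prunes more edges and shortens traversal, at the risk of disconnecting items that belong together.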