Goto

Collaborating Authors

 Sarajevo


Augmenting Document-level Relation Extraction with Efficient Multi-Supervision

Lin, Xiangyu, Jia, Weijia, Gong, Zhiguo

arXiv.org Artificial Intelligence

Despite its popularity in sentence-level relation extraction, distantly supervised data is rarely utilized by existing work in document-level relation extraction due to its noisy nature and low information density. Among its current applications, distantly supervised data is mostly used as a whole for pertaining, which is of low time efficiency. To fill in the gap of efficient and robust utilization of distantly supervised training data, we propose Efficient Multi-Supervision for document-level relation extraction, in which we first select a subset of informative documents from the massive dataset by combining distant supervision with expert supervision, then train the model with Multi-Supervision Ranking Loss that integrates the knowledge from multiple sources of supervision to alleviate the effects of noise. The experiments demonstrate the effectiveness of our method in improving the model performance with higher time efficiency than existing baselines.


Set the Clock: Temporal Alignment of Pretrained Language Models

Zhao, Bowen, Brumbaugh, Zander, Wang, Yizhong, Hajishirzi, Hannaneh, Smith, Noah A.

arXiv.org Artificial Intelligence

Language models (LMs) are trained on web text originating from many points in time and, in general, without any explicit temporal grounding. This work investigates the temporal chaos of pretrained LMs and explores various methods to align their internal knowledge to a target time, which we call "temporal alignment." To do this, we first automatically construct a dataset containing 20K time-sensitive questions and their answers for each year from 2000 to 2023. Based on this dataset, we empirically show that pretrained LMs (e.g., LLaMa2), despite having a recent pretraining cutoff (e.g., 2022), mostly answer questions using earlier knowledge (e.g., in 2019). We then develop several methods, from prompting to finetuning, to align LMs to use their most recent knowledge when answering questions, and investigate various factors in this alignment. Our experiments demonstrate that aligning LLaMa2 to the year 2022 can enhance its performance by up to 62% according to that year's answers. This improvement occurs even without explicitly mentioning time information, indicating the possibility of aligning models' internal sense of time after pretraining. Finally, we find that alignment to a historical time is also possible, with up to 2.8$\times$ the performance of the unaligned LM in 2010 if finetuning models to that year. These findings hint at the sophistication of LMs' internal knowledge organization and the necessity of tuning them properly.


SLOPE: Search with Learned Optimal Pruning-based Expansion

Bokan, Davor, Ajanovic, Zlatan, Lacevic, Bakir

arXiv.org Artificial Intelligence

Heuristic search is often used for motion planning and pathfinding problems, for finding the shortest path in a graph while also promising completeness and optimal efficiency. The drawback is it's space complexity, specifically storing all expanded child nodes in memory and sorting large lists of active nodes, which can be a problem in real-time scenarios with limited on-board computation. To combat this, we present the Search with Learned Optimal Pruning-based Expansion (SLOPE), which, learns the distance of a node from a possible optimal path, unlike other approaches that learn a cost-to-go value. The unfavored nodes are then pruned according to the said distance, which in turn reduces the size of the open list. This ensures that the search explores only the region close to optimal paths while lowering memory and computational costs. Unlike traditional learning methods, our approach is orthogonal to estimating cost-to-go heuristics, offering a complementary strategy for improving search efficiency. We demonstrate the effectiveness of our approach evaluating it as a standalone search method and in conjunction with learned heuristic functions, achieving comparable-or-better node expansion metrics, while lowering the number of child nodes in the open list. Our code is available at https://github.com/dbokan1/SLOPE.


Robot Explanation Identity

Halilovic, Amar, Krivic, Senka

arXiv.org Artificial Intelligence

To bring robots into human everyday life, their capacity for social interaction must increase. One way for robots to acquire social skills is by assigning them the concept of identity. This research focuses on the concept of \textit{Explanation Identity} within the broader context of robots' roles in society, particularly their ability to interact socially and explain decisions. Explanation Identity refers to the combination of characteristics and approaches robots use to justify their actions to humans. Drawing from different technical and social disciplines, we introduce Explanation Identity as a multidisciplinary concept and discuss its importance in Human-Robot Interaction. Our theoretical framework highlights the necessity for robots to adapt their explanations to the user's context, demonstrating empathy and ethical integrity. This research emphasizes the dynamic nature of robot identity and guides the integration of explanation capabilities in social robots, aiming to improve user engagement and acceptance.


Solid Waste Detection in Remote Sensing Images: A Survey

Fraternali, Piero, Morandini, Luca, González, Sergio Luis Herrera

arXiv.org Artificial Intelligence

The detection and characterization of illegal solid waste disposal sites are essential for environmental protection, particularly for mitigating pollution and health hazards. Improperly managed landfills contaminate soil and groundwater via rainwater infiltration, posing threats to both animals and humans. Traditional landfill identification approaches, such as on-site inspections, are time-consuming and expensive. Remote sensing is a cost-effective solution for the identification and monitoring of solid waste disposal sites that enables broad coverage and repeated acquisitions over time. Earth Observation (EO) satellites, equipped with an array of sensors and imaging capabilities, have been providing high-resolution data for several decades. Researchers proposed specialized techniques that leverage remote sensing imagery to perform a range of tasks such as waste site detection, dumping site monitoring, and assessment of suitable locations for new landfills. This review aims to provide a detailed illustration of the most relevant proposals for the detection and monitoring of solid waste sites by describing and comparing the approaches, the implemented techniques, and the employed data. Furthermore, since the data sources are of the utmost importance for developing an effective solid waste detection model, a comprehensive overview of the satellites and publicly available data sets is presented. Finally, this paper identifies the open issues in the state-of-the-art and discusses the relevant research directions for reducing the costs and improving the effectiveness of novel solid waste detection methods.


Vaccine: Perturbation-aware Alignment for Large Language Model

Huang, Tiansheng, Hu, Sihao, Liu, Ling

arXiv.org Artificial Intelligence

The new paradigm of finetuning-as-a-service introduces a new attack surface for Large Language Models (LLMs): a few harmful data uploaded by users can easily trick the finetuning to produce an alignment-broken model. We conduct an empirical analysis and uncover a \textit{harmful embedding drift} phenomenon, showing a probable cause of the alignment-broken effect. Inspired by our findings, we propose Vaccine, a perturbation-aware alignment technique to mitigate the security risk of users finetuning. The core idea of Vaccine is to produce invariant hidden embeddings by progressively adding crafted perturbation to them in the alignment phase. This enables the embeddings to withstand harmful perturbation from un-sanitized user data in the finetuning phase. Our results on open source mainstream LLMs (e.g., Llama2, Opt, Vicuna) demonstrate that Vaccine can boost the robustness of alignment against harmful prompts induced embedding drift while reserving reasoning ability towards benign prompts. Our code is available at \url{https://github.com/git-disl/Vaccine}.


Locating Factual Knowledge in Large Language Models: Exploring the Residual Stream and Analyzing Subvalues in Vocabulary Space

Yu, Zeping, Ananiadou, Sophia

arXiv.org Artificial Intelligence

We find the location of factual knowledge in large language models by exploring the residual stream and analyzing subvalues in vocabulary space. We find the reason why subvalues have human-interpretable concepts when projecting into vocabulary space. The before-softmax values of subvalues are added by an addition function, thus the probability of top tokens in vocabulary space will increase. Based on this, we find using log probability increase to compute the significance of layers and subvalues is better than probability increase, since the curve of log probability increase has a linear monotonically increasing shape. Moreover, we calculate the inner products to evaluate how much a feed-forward network (FFN) subvalue is activated by previous layers. Base on our methods, we find where factual knowledge is stored. Specifically, attention layers store "Paris is related to France". FFN layers store "Paris is a capital/city", activated by attention subvalues related to "capital". We leverage our method on Baevski-18, GPT2 medium, Llama-7B and Llama-13B. Overall, we provide a new method for understanding the mechanism of transformers. We will release our code on github.


Understanding Path Planning Explanations

Halilovic, Amar, Krivic, Senka

arXiv.org Artificial Intelligence

Abstract--Navigation is a must-have skill for any mobile robot. For the design of our user study, we will extend and formalize our approach to explain both path planning failures There is an increasing deployment of autonomous robots and deviations from the initial trajectory, i.e. to be able to in various domains [1]. Currently, we use one and accountability in their decision-making exists [2]. Navigation is a pivotal aspect of an autonomous robot create different planning failures and trajectory-contrastive decision-making spectrum. After generating explanations for the created scenarios, a key role in achieving accurate and efficient navigation in we will perturb the explanations in the following way: changing environments.


The Impact of Artificial Intelligence on the Evolution of Digital Education: A Comparative Study of OpenAI Text Generation Tools including ChatGPT, Bing Chat, Bard, and Ernie

Motlagh, Negin Yazdani, Khajavi, Matin, Sharifi, Abbas, Ahmadi, Mohsen

arXiv.org Artificial Intelligence

In the digital era, the integration of artificial intelligence (AI) in education has ushered in transformative changes, redefining teaching methodologies, curriculum planning, and student engagement. This review paper delves deep into the rapidly evolving landscape of digital education by contrasting the capabilities and impact of OpenAI's pioneering text generation tools like Bing Chat, Bard, Ernie with a keen focus on the novel ChatGPT. Grounded in a typology that views education through the lenses of system, process, and result, the paper navigates the multifaceted applications of AI. From decentralizing global education and personalizing curriculums to digitally documenting competence-based outcomes, AI stands at the forefront of educational modernization. Highlighting ChatGPT's meteoric rise to one million users in just five days, the study underscores its role in democratizing education, fostering autodidacticism, and magnifying student engagement. However, with such transformative power comes the potential for misuse, as text-generation tools can inadvertently challenge academic integrity. By juxtaposing the promise and pitfalls of AI in education, this paper advocates for a harmonized synergy between AI tools and the educational community, emphasizing the urgent need for ethical guidelines, pedagogical adaptations, and strategic collaborations.


RecXplainer: Amortized Attribute-based Personalized Explanations for Recommender Systems

Verma, Sahil, Shah, Chirag, Dickerson, John P., Beniwal, Anurag, Sadagopan, Narayanan, Seshadri, Arjun

arXiv.org Artificial Intelligence

Recommender systems influence many of our interactions in the digital world -- impacting how we shop for clothes, sorting what we see when browsing YouTube or TikTok, and determining which restaurants and hotels we are shown when using hospitality platforms. Modern recommender systems are large, opaque models trained on a mixture of proprietary and open-source datasets. Naturally, issues of trust arise on both the developer and user side: is the system working correctly, and why did a user receive (or not receive) a particular recommendation? Providing an explanation alongside a recommendation alleviates some of these concerns. The status quo for auxiliary recommender system feedback is either user-specific explanations (e.g., "users who bought item B also bought item A") or item-specific explanations (e.g., "we are recommending item A because you watched/bought item B"). However, users bring personalized context into their search experience, valuing an item as a function of that item's attributes and their own personal preferences. In this work, we propose RecXplainer, a novel method for generating fine-grained explanations based on a user's preferences over the attributes of recommended items. We evaluate RecXplainer on five real-world and large-scale recommendation datasets using five different kinds of recommender systems to demonstrate the efficacy of RecXplainer in capturing users' preferences over item attributes and using them to explain recommendations. We also compare RecXplainer to five baselines and show RecXplainer's exceptional performance on ten metrics.