South America
DARD: A Multi-Agent Approach for Task-Oriented Dialog Systems
Gupta, Aman, Ravichandran, Anirudh, Zhang, Ziji, Shah, Swair, Beniwal, Anurag, Sadagopan, Narayanan
Task-oriented dialogue systems are essential for applications ranging from customer service to personal assistants and are widely used across various industries. However, developing effective multi-domain systems remains a significant challenge due to the complexity of handling diverse user intents, entity types, and domain-specific knowledge across several domains. In this work, we propose DARD (Domain Assigned Response Delegation), a multi-agent conversational system capable of successfully handling multi-domain dialogs. DARD leverages domain-specific agents, orchestrated by a central dialog manager agent. Our extensive experiments compare and utilize various agent modeling approaches, combining the strengths of smaller fine-tuned models (Flan-T5-large & Mistral-7B) with their larger counterparts, Large Language Models (LLMs) (Claude Sonnet 3.0). We provide insights into the strengths and limitations of each approach, highlighting the benefits of our multi-agent framework in terms of flexibility and composability. We evaluate DARD using the well-established MultiWOZ benchmark, achieving state-of-the-art performance by improving the dialogue inform rate by 6.6% and the success rate by 4.1% over the best-performing existing approaches. Additionally, we discuss various annotator discrepancies and issues within the MultiWOZ dataset and its evaluation system.
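The orchestration idea in this abstract, a central dialog manager delegating each turn to a domain-specific agent, can be sketched roughly as follows. This is only an illustration under loud assumptions: the agent functions, the `KEYWORDS` table, and the keyword-matching rule are hypothetical stand-ins, not the DARD implementation, where the manager and the agents are fine-tuned models or LLMs.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical domain agents: each maps a user utterance to a response string.
def hotel_agent(utterance: str) -> str:
    return "hotel: searching for accommodation"

def restaurant_agent(utterance: str) -> str:
    return "restaurant: searching for a place to eat"

def train_agent(utterance: str) -> str:
    return "train: searching for departures"

AGENTS: Dict[str, Callable[[str], str]] = {
    "hotel": hotel_agent,
    "restaurant": restaurant_agent,
    "train": train_agent,
}

# Hypothetical keyword lists standing in for a learned domain classifier.
KEYWORDS = {
    "hotel": ("hotel", "room", "guesthouse"),
    "restaurant": ("restaurant", "food", "dinner"),
    "train": ("train", "depart", "ticket"),
}

def assign_domain(utterance: str, previous_domain: Optional[str] = None) -> Optional[str]:
    """Pick a domain for this turn; fall back to the previous turn's domain."""
    text = utterance.lower()
    for domain, words in KEYWORDS.items():
        if any(word in text for word in words):
            return domain
    return previous_domain

def respond(utterance: str, previous_domain: Optional[str] = None) -> Tuple[Optional[str], str]:
    """Dialog manager: assign a domain, then delegate the response to its agent."""
    domain = assign_domain(utterance, previous_domain)
    if domain is None:
        return None, "Could you tell me what you are looking for?"
    return domain, AGENTS[domain](utterance)
```

Because each agent sits behind the same call signature, one agent can be swapped for another model without touching the rest, which is the kind of composability the abstract refers to.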
Explainable few-shot learning workflow for detecting invasive and exotic tree species
Gevaert, Caroline M., Pedro, Alexandra Aguiar, Ku, Ou, Cheng, Hao, Chandramouli, Pranav, Javan, Farzaneh Dadrass, Nattino, Francesco, Georgievska, Sonja
Deep Learning methods are notorious for relying on extensive labeled datasets to train and assess their performance. This can cause difficulties in practical situations where models should be trained for new applications for which very little data is available. While few-shot learning algorithms can address the first problem, they still lack sufficient explanations for the results. This research presents a workflow that tackles both challenges by proposing an explainable few-shot learning workflow for detecting invasive and exotic tree species in the Atlantic Forest of Brazil using Unmanned Aerial Vehicle (UAV) images. By integrating a Siamese network with explainable AI (XAI), the workflow enables the classification of tree species with minimal labeled data while providing visual, case-based explanations for the predictions. Results demonstrate the effectiveness of the proposed workflow in identifying new tree species, even in data-scarce conditions. With a lightweight backbone, e.g., MobileNet, it achieves an F1-score of 0.86 in 3-shot learning, outperforming a shallow CNN. A set of explanation metrics, i.e., correctness, continuity, and contrastivity, accompanied by visual cases, provide further insights about the prediction results. This approach opens new avenues for using AI and UAVs in forest management and biodiversity conservation, particularly concerning rare or under-studied species.
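To make the few-shot, case-based idea concrete, here is a minimal sketch of distance-based classification over embeddings. It assumes the embeddings have already been produced by some Siamese backbone; the vectors, labels, and the mean-distance decision rule are illustrative assumptions, not the authors' pipeline.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(query, support):
    """Classify a query embedding against a small labeled support set.

    support: list of (label, embedding) pairs, e.g. 3 per class for 3-shot.
    Returns (predicted_label, index_of_nearest_support_example); the nearest
    example can be shown to the user as a case-based explanation.
    """
    per_class = defaultdict(list)
    for idx, (label, emb) in enumerate(support):
        per_class[label].append((euclidean(query, emb), idx))
    # Predict the class whose support examples are closest on average.
    best = min(per_class, key=lambda c: sum(d for d, _ in per_class[c]) / len(per_class[c]))
    nearest_idx = min(per_class[best])[1]
    return best, nearest_idx

# Hypothetical 2-D embeddings for a 3-shot, two-class problem.
support = [
    ("native", (0.0, 0.1)), ("native", (0.1, 0.0)), ("native", (0.2, 0.1)),
    ("invasive", (1.0, 0.9)), ("invasive", (0.9, 1.1)), ("invasive", (1.1, 1.0)),
]
label, example = classify((1.0, 0.95), support)
```

Returning the index of the closest support image alongside the label is what makes the prediction explainable by example: the user sees which known tree the query most resembles.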
Rethinking Node Representation Interpretation through Relation Coherence
Lin, Ying-Chun, Neville, Jennifer, Becker, Cassiano, Metha, Purvanshi, Asghar, Nabiha, Agarwal, Vipul
Understanding node representations in graph-based models is crucial for uncovering biases, diagnosing errors, and building trust in model decisions. However, previous work on explainable AI for node representations has primarily emphasized explanations (reasons for model predictions) rather than interpretations (mapping representations to understandable concepts). Furthermore, the limited research that focuses on interpretation lacks validation, and thus the reliability of such methods is unclear. We address this gap by proposing a novel interpretation method, Node Coherence Rate for Representation Interpretation (NCI), which quantifies how well different node relations are captured in node representations. We also propose a novel method (IME) to evaluate the accuracy of different interpretation methods. Our experimental results demonstrate that NCI reduces the error of the previous best approach by an average of 39%. We then apply NCI to derive insights about the node representations produced by several graph-based methods and assess their quality in unsupervised settings.
Generative Memesis: AI Mediates Political Memes in the 2024 USA Presidential Election
Chang, Ho-Chun Herbert, Shaman, Benjamin, Chen, Yung-chun, Zha, Mingyue, Noh, Sean, Wei, Chiyu, Weener, Tracy, Magee, Maya
Visual content on social media has become increasingly influential in shaping political discourse and civic engagement. Using a dataset of 239,526 Instagram images, deep learning, and LLM-based workflows, we examine the impact of different content types on user engagement during the 2024 US presidential election, with a focus on synthetic visuals. Results show that while synthetic content alone may not increase engagement, it mediates how political information is created through highly effective, often absurd, political memes. We define the notion of generative memesis, where memes are no longer shared person-to-person but mediated by AI through customized, generated images. We also find partisan divergences: Democrats use AI for in-group support whereas Republicans use it for out-group attacks. Non-traditional, left-leaning outlets are the primary creators of political memes; emphasis on different topics largely follows issue ownership.
A KAN-based Interpretable Framework for Process-Informed Prediction of Global Warming Potential
Lee, Jaewook, Sun, Xinyang, Errington, Ethan, Guo, Miao
Accurate prediction of Global Warming Potential (GWP) is essential for assessing the environmental impact of chemical processes and materials. Traditional GWP prediction models rely predominantly on molecular structure, overlooking critical process-related information. In this study, we present an integrative GWP prediction model that combines molecular descriptors (MACCS keys and Mordred descriptors) with process information (process title, description, and location) to improve predictive accuracy and interpretability. Using a deep neural network (DNN) model, we achieved an R-squared of 86% on test data with Mordred descriptors, process location, and description information, representing a 25-percentage-point improvement over the previous benchmark of 61%. XAI analysis further highlighted the significant role of process title embeddings in enhancing model predictions. To enhance interpretability, we employed a Kolmogorov-Arnold Network (KAN) to derive a symbolic formula for GWP prediction that captures key molecular and process features. This provides a transparent, interpretable alternative to black-box models and enables users to gain insights into the molecular and process factors influencing GWP. Error analysis showed that the model performs reliably in densely populated data ranges, with increased uncertainty for higher GWP values. This analysis allows users to manage prediction uncertainty effectively, supporting data-driven decision-making in chemical and process design. Our results suggest that integrating both molecular and process-level information in GWP prediction models yields substantial gains in accuracy and interpretability, offering a valuable tool for sustainability assessments. Future work may extend this approach to additional environmental impact categories and refine the model to further enhance its predictive reliability.
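For readers less familiar with the metric quoted above, the coefficient of determination can be computed as in the short sketch below. The function name and the toy values are illustrative assumptions and have no connection to the paper's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values only; an R-squared of 0.86 means 86% of the variance is explained.
score = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```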
An unified approach to link prediction in collaboration networks
Sosa, Juan, Martínez, Diego, Guerrero, Nicolás
This article investigates and compares three approaches to link prediction in collaboration networks, namely, an ERGM (Exponential Random Graph Model; Robins et al. 2007), a GCN (Graph Convolutional Network; Kipf and Welling 2017), and a Word2Vec+MLP model (Word2Vec model combined with a multilayer neural network; Mikolov et al. 2013a and Goodfellow et al. 2016). The ERGM, grounded in statistical methods, is employed to capture general structural patterns within the network, while the GCN and Word2Vec+MLP models leverage deep learning techniques to learn adaptive structural representations of nodes and their relationships. The predictive performance of the models is assessed through extensive simulation exercises using cross-validation, with metrics based on the receiver operating characteristic curve. The results clearly show the superiority of machine learning approaches in link prediction, particularly in large networks, where traditional models such as ERGM exhibit limitations in scalability and the ability to capture inherent complexities. These findings highlight the potential benefits of integrating statistical modeling techniques with deep learning methods to analyze complex networks, providing a more robust and effective framework for future research in this field.
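One common ROC-based metric for link prediction is the area under the curve (AUC), which equals the probability that a randomly chosen true link receives a higher score than a randomly chosen non-link. The sketch below computes it directly from that definition; the score lists are made-up values, and the abstract does not state which ROC-based metrics the authors actually used.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (link, non-link) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical model scores for held-out node pairs.
held_out_links = [0.9, 0.8, 0.4]          # pairs that are truly connected
held_out_non_links = [0.7, 0.3, 0.2, 0.1]  # pairs that are not connected
auc = roc_auc(held_out_links, held_out_non_links)
```

In a cross-validation setting, this value would be computed on each held-out fold and then averaged across folds.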
Ukraine prepares to fight North Korean troops in Kursk as war escalates
Ukraine prepared to fight North Korean troops in the Russian region of Kursk on Wednesday, as the entry of a second nuclear power in Russia's war against Ukraine threatened to escalate and broaden the conflict. The United States Pentagon confirmed on Tuesday that North Korean troops were in Kursk, where Ukraine launched a counter-invasion almost three months ago. Pentagon spokesman Pat Ryder said there was "a small number [of North Korean troops] in the Kursk oblast, with a couple of thousand more that are almost there or due to arrive imminently". A senior South Korean official told reporters on Wednesday that about 3,000 North Korean troops were being moved close to the front lines. NATO Secretary-General Mark Rutte confirmed the deployment on Monday.
KAN-AD: Time Series Anomaly Detection with Kolmogorov-Arnold Networks
Zhou, Quan, Pei, Changhua, Sun, Fei, Han, Jing, Gao, Zhengwei, Pei, Dan, Zhang, Haiming, Xie, Gaogang, Li, Jianhui
Time series anomaly detection (TSAD) has become an essential component of large-scale cloud services and web systems because it can promptly identify anomalies, providing early warnings to prevent greater losses. Deep learning-based forecasting methods have become very popular in TSAD due to their powerful learning capabilities. However, accurate predictions don't necessarily lead to better anomaly detection. Due to the common occurrence of noise, i.e., local peaks and drops in time series, existing black-box learning methods can easily learn these unintended patterns, significantly affecting anomaly detection performance. Kolmogorov-Arnold Networks (KAN) offer a potential solution by decomposing complex temporal sequences into a combination of multiple univariate functions, making the training process more controllable. However, KAN optimizes univariate functions using spline functions, which are also susceptible to the influence of local anomalies. To address this issue, we present KAN-AD, which leverages the Fourier series to emphasize global temporal patterns, thereby mitigating the influence of local peaks and drops. KAN-AD improves both effectiveness and efficiency by transforming the existing black-box learning approach into learning the weights preceding univariate functions. Experimental results show that, compared to the current state-of-the-art, we achieved an accuracy increase of 15% while boosting inference speed by 55 times.
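The intuition that a truncated Fourier series captures global patterns while ignoring local spikes can be illustrated with the sketch below. It is not the KAN-AD architecture: it simply projects a window onto a few sine and cosine terms (the "weights preceding univariate functions") and flags the point with the largest residual. The function names, window length, and number of terms are assumptions chosen for the example.

```python
import math

def fourier_fit(x, n_terms):
    """Project x onto the first n_terms harmonics of the window length."""
    n = len(x)
    mean = sum(x) / n
    coeffs = []
    for k in range(1, n_terms + 1):
        a = 2.0 / n * sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        b = 2.0 / n * sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        coeffs.append((a, b))
    return mean, coeffs

def fourier_eval(mean, coeffs, n):
    """Reconstruct the smooth, global component of the window."""
    return [
        mean + sum(
            a * math.cos(2 * math.pi * k * t / n) + b * math.sin(2 * math.pi * k * t / n)
            for k, (a, b) in enumerate(coeffs, start=1)
        )
        for t in range(n)
    ]

def most_anomalous_index(x, n_terms=3):
    """Index with the largest gap between the data and its Fourier fit."""
    mean, coeffs = fourier_fit(x, n_terms)
    smooth = fourier_eval(mean, coeffs, len(x))
    residuals = [abs(v - s) for v, s in zip(x, smooth)]
    return max(range(len(x)), key=residuals.__getitem__)

# Periodic signal with a single injected spike at t = 40.
series = [math.sin(2 * math.pi * 2 * t / 64) for t in range(64)]
series[40] += 3.0
spike_at = most_anomalous_index(series)
```

Because only a handful of low-frequency weights are fitted, the isolated spike barely changes the reconstruction, so it stands out clearly in the residual.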
Provably Optimal Memory Capacity for Modern Hopfield Models: Transformer-Compatible Dense Associative Memories as Spherical Codes
Hu, Jerry Yao-Chieh, Wu, Dennis, Liu, Han
We study the optimal memorization capacity of modern Hopfield models and Kernelized Hopfield Models (KHMs), a transformer-compatible class of Dense Associative Memories. We present a tight analysis by establishing a connection between the memory configuration of KHMs and spherical codes from information theory. Specifically, we treat the stored memory set as a specialized spherical code. This enables us to cast the memorization problem in KHMs into a point arrangement problem on a hypersphere. We show that the optimal capacity of KHMs occurs when the feature space allows memories to form an optimal spherical code. This unique perspective leads to: (i) An analysis of how KHMs achieve optimal memory capacity, together with the corresponding necessary conditions. Importantly, we establish an upper capacity bound that matches the well-known exponential lower bound in the literature. This provides the first tight and optimal asymptotic memory capacity for modern Hopfield models. (ii) A sub-linear time algorithm $\mathtt{U}\text{-}\mathtt{Hop}$+ to reach KHMs' optimal capacity. (iii) An analysis of the scaling behavior of the required feature dimension relative to the number of stored memories. These efforts improve both the retrieval capability of KHMs and the representation learning of corresponding transformers. Experimentally, we provide thorough numerical results to back up theoretical findings.
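For context, the standard retrieval step of a modern Hopfield model, a softmax-weighted average of the stored memories, can be sketched as below. This is the plain update without the kernel feature map that defines KHMs, and without the U-Hop+ algorithm; the memories, the query, and the inverse temperature `beta` are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(memories, query, beta=8.0):
    """One retrieval step: new_query = sum_i softmax(beta * <m_i, query>)_i * m_i."""
    scores = [beta * sum(mi * qi for mi, qi in zip(m, query)) for m in memories]
    weights = softmax(scores)
    dim = len(query)
    retrieved = [sum(w * m[d] for w, m in zip(weights, memories)) for d in range(dim)]
    return retrieved, weights

# Three well-separated unit-norm memories (points on a sphere) and a noisy query.
memories = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
retrieved, weights = hopfield_retrieve(memories, (0.8, 0.3, 0.1))
```

The better separated the memories are on the sphere, the more sharply the softmax concentrates on the correct one, which is the geometric intuition behind relating memory capacity to spherical codes.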
Lagrangian neural networks for nonholonomic mechanics
Diaz, Viviana Alejandra, Salomone, Leandro Martin, Zuccalli, Marcela
The laws of motion of a Lagrangian system are determined by the principle of stationary action, also known as Hamilton's principle. This principle states that the action is minimal (or stationary) throughout a mechanical process. From this statement, the differential equations known as Euler-Lagrange equations are derived. If the Lagrangian function of a given mechanical system is known, then Euler-Lagrange equations establish the relationship between accelerations, velocities, and positions; that is, the system dynamics are obtained from Euler-Lagrange equations. Hence, the goal of Lagrangian mechanics is to write an analytic expression for the Lagrangian function in appropriate generalized coordinates and then develop the Euler-Lagrange equations symbolically into a system of second-order differential equations whose solutions give the system's trajectory. In many cases, even when Euler-Lagrange equations are available, the solutions are not provided in analytical or explicit forms.
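For reference, the objects named above can be written out as follows. The notation (generalized coordinates $q^i$, $n$ degrees of freedom, time interval $[t_0, t_1]$) and the single-particle example with mass $m$ and potential $V$ are standard textbook choices assumed here, not taken from the paper.

```latex
% Action functional and the stationarity condition (Hamilton's principle)
S[q] = \int_{t_0}^{t_1} L\bigl(q(t), \dot{q}(t), t\bigr)\, dt,
\qquad \delta S = 0

% Resulting Euler-Lagrange equations, one per generalized coordinate
\frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}^i}
  - \frac{\partial L}{\partial q^i} = 0,
\qquad i = 1, \dots, n

% Example: a particle of mass m in a potential V(q)
L = \tfrac{1}{2}\, m\, \dot{q}^2 - V(q)
\quad \Longrightarrow \quad
m\, \ddot{q} = -\,V'(q)
```

The example recovers Newton's second law, a second-order differential equation of the kind described above; for most potentials $V$ it has no closed-form solution and must be integrated numerically.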