Expert Systems
Explainability-Driven Quality Assessment for Rule-Based Systems
Seneviratne, Oshani, Capuzzo, Brendan, Van Woensel, William
This paper introduces an explanation framework designed to enhance the quality of rules in knowledge-based reasoning systems based on dataset-driven insights. The traditional method for rule induction from data typically requires labor-intensive labeling and data-driven learning. This framework provides an alternative and instead allows for the data-driven refinement of existing rules: it generates explanations of rule inferences and leverages human interpretation to refine rules. It leverages four complementary explanation types: trace-based, contextual, contrastive, and counterfactual, providing diverse perspectives for debugging, validating, and ultimately refining rules. By embedding explainability into the reasoning architecture, the framework enables knowledge engineers to address inconsistencies, optimize thresholds, and ensure fairness, transparency, and interpretability in decision-making processes. Its practicality is demonstrated through a use case in finance.
Explainability in Practice: A Survey of Explainable NLP Across Various Domains
Mohammadi, Hadi, Bagheri, Ayoub, Giachanou, Anastasia, Oberski, Daniel L.
Natural Language Processing (NLP) has become a cornerstone in many critical sectors, including healthcare, finance, and customer relationship management. This is especially true with the development and use of advanced models such as GPT-based architectures and BERT, which are widely used in decision-making processes. However, the black-box nature of these advanced NLP models has created an urgent need for transparency and explainability. This review explores explainable NLP (XNLP) with a focus on its practical deployment and real-world applications, examining its implementation and the challenges faced in domain-specific contexts. The paper underscores the importance of explainability in NLP and provides a comprehensive perspective on how XNLP can be designed to meet the unique demands of various sectors, from healthcare's need for clear insights to finance's emphasis on fraud detection and risk assessment. Additionally, this review aims to bridge the knowledge gap in XNLP literature by offering a domain-specific exploration and discussing underrepresented areas such as real-world applicability, metric evaluation, and the role of human interaction in model assessment. The paper concludes by suggesting future research directions that could enhance the understanding and broader application of XNLP.
Integrating Frequency Guidance into Multi-source Domain Generalization for Bearing Fault Diagnosis
Tu, Xiaotong, Ma, Chenyu, Wu, Qingyao, Liu, Yinhao, Zhang, Hongyang
Recent generalizable fault diagnosis researches have effectively tackled the distributional shift between unseen working conditions. Most of them mainly focus on learning domain-invariant representation through feature-level methods. However, the increasing numbers of unseen domains may lead to domain-invariant features contain instance-level spurious correlations, which impact the previous models' generalizable ability. To address the limitations, we propose the Fourier-based Augmentation Reconstruction Network, namely FARNet.The methods are motivated by the observation that the Fourier phase component and amplitude component preserve different semantic information of the signals, which can be employed in domain augmentation techniques. The network comprises an amplitude spectrum sub-network and a phase spectrum sub-network, sequentially reducing the discrepancy between the source and target domains. To construct a more robust generalized model, we employ a multi-source domain data augmentation strategy in the frequency domain. Specifically, a Frequency-Spatial Interaction Module (FSIM) is introduced to handle global information and local spatial features, promoting representation learning between the two sub-networks. To refine the decision boundary of our model output compared to conventional triplet loss, we propose a manifold triplet loss to contribute to generalization. Through extensive experiments on the CWRU and SJTU datasets, FARNet demonstrates effective performance and achieves superior results compared to current cross-domain approaches on the benchmarks.
GoPro pushes update to its entry-level Hero camera, adding 4:3 video for social clips
GoPro is rolling out a software update for its entry-level Hero camera that allows users to shoot 4:3 video in 4K. This is great for the kinds of clips that populate social media sites like TikTok, as the footage is taller. The update is available for free via the company's GoPro Quik app on iOS and Android. Obviously, the new aspect ratio is intended for social media content, but shooting in 4:3 has several use case scenarios. For instance, it can be the perfect choice for capturing video from a first-person perspective.
KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search
Luo, Haoran, E, Haihong, Guo, Yikai, Lin, Qika, Wu, Xiaobao, Mu, Xinyu, Liu, Wenhao, Song, Meina, Zhu, Yifan, Tuan, Luu Anh
Knowledge Base Question Answering (KBQA) aims to answer natural language questions with a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces challenges in weak KB awareness, imbalance between effectiveness and efficiency, and high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation with KB environment exploration. Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance agentic exploration's performance and search space. With heuristic exploration, KBQA-o1 generates high-quality annotations for further improvement by incremental fine-tuning. Experimental results show that KBQA-o1 outperforms previous low-resource KBQA methods with limited annotated data, boosting Llama-3.1-8B model's GrailQA F1 performance to 78.5% compared to 48.5% of the previous sota method with GPT-3.5-turbo.
International AI Safety Report
Bengio, Yoshua, Mindermann, Sรถren, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Fox, Philip, Garfinkel, Ben, Goldfarb, Danielle, Heidari, Hoda, Ho, Anson, Kapoor, Sayash, Khalatbari, Leila, Longpre, Shayne, Manning, Sam, Mavroudis, Vasilios, Mazeika, Mantas, Michael, Julian, Newman, Jessica, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Sastry, Girish, Seger, Elizabeth, Skeadas, Theodora, South, Tobin, Strubell, Emma, Tramรจr, Florian, Velasco, Lucia, Wheeler, Nicole, Acemoglu, Daron, Adekanmbi, Olubayo, Dalrymple, David, Dietterich, Thomas G., Felten, Edward W., Fung, Pascale, Gourinchas, Pierre-Olivier, Heintz, Fredrik, Hinton, Geoffrey, Jennings, Nick, Krause, Andreas, Leavy, Susan, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John, Munga, Jane, Narayanan, Arvind, Nelson, Alondra, Neppel, Clara, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Schรถlkopf, Bernhard, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaรซl, Yao, Andrew, Zhang, Ya-Qin, Albalawi, Fahad, Alserkal, Marwan, Ajala, Olubunmi, Avrin, Guillaume, Busch, Christian, de Carvalho, Andrรฉ Carlos Ponce de Leon Ferreira, Fox, Bronwyn, Gill, Amandeep Singh, Hatip, Ahmet Halit, Heikkilรค, Juha, Jolly, Gill, Katzir, Ziv, Kitano, Hiroaki, Krรผger, Antonio, Johnson, Chris, Khan, Saif M., Lee, Kyoung Mu, Ligot, Dominic Vincent, Molchanovskyi, Oleksii, Monti, Andrea, Mwamanzi, Nusu, Nemer, Mona, Oliver, Nuria, Portillo, Josรฉ Ramรณn Lรณpez, Ravindran, Balaraman, Rivera, Raquel Pezoa, Riza, Hammam, Rugege, Crystal, Seoighe, Ciarรกn, Sheehan, Jerry, Sheikh, Haroon, Wong, Denise, Zeng, Yi
I am honoured to present the International AI Safety Report. It is the work of 96 international AI experts who collaborated in an unprecedented effort to establish an internationally shared scientific understanding of risks from advanced AI and methods for managing them. We embarked on this journey just over a year ago, shortly after the countries present at the Bletchley Park AI Safety Summit agreed to support the creation of this report. Since then, we published an Interim Report in May 2024, which was presented at the AI Seoul Summit. We are now pleased to publish the present, full report ahead of the AI Action Summit in Paris in February 2025. Since the Bletchley Summit, the capabilities of general-purpose AI, the type of AI this report focuses on, have increased further. For example, new models have shown markedly better performance at tests of Professor Yoshua Bengio programming and scientific reasoning.
Improving Interpretability and Accuracy in Neuro-Symbolic Rule Extraction Using Class-Specific Sparse Filters
Padalkar, Parth, Lee, Jaeseong, Wei, Shiyi, Gupta, Gopal
There has been significant focus on creating neuro-symbolic models for interpretable image classification using Convolutional Neural Networks (CNNs). These methods aim to replace the CNN with a neuro-symbolic model consisting of the CNN, which is used as a feature extractor, and an interpretable rule-set extracted from the CNN itself. While these approaches provide interpretability through the extracted rule-set, they often compromise accuracy compared to the original CNN model. In this paper, we identify the root cause of this accuracy loss as the post-training binarization of filter activations to extract the rule-set. To address this, we propose a novel sparsity loss function that enables class-specific filter binarization during CNN training, thus minimizing information loss when extracting the rule-set. We evaluate several training strategies with our novel sparsity loss, analyzing their effectiveness and providing guidance on their appropriate use. Notably, we set a new benchmark, achieving a 9% improvement in accuracy and a 53% reduction in rule-set size on average, compared to the previous SOTA, while coming within 3% of the original CNN's accuracy. This highlights the significant potential of interpretable neuro-symbolic models as viable alternatives to black-box CNNs.
ESGSenticNet: A Neurosymbolic Knowledge Base for Corporate Sustainability Analysis
Ong, Keane, Mao, Rui, Xing, Frank, Satapathy, Ranjan, Sulaeman, Johan, Cambria, Erik, Mengaldo, Gianmarco
Evaluating corporate sustainability performance is essential to drive sustainable business practices, amid the need for a more sustainable economy. However, this is hindered by the complexity and volume of corporate sustainability data (i.e. sustainability disclosures), not least by the effectiveness of the NLP tools used to analyse them. To this end, we identify three primary challenges - immateriality, complexity, and subjectivity, that exacerbate the difficulty of extracting insights from sustainability disclosures. To address these issues, we introduce ESGSenticNet, a publicly available knowledge base for sustainability analysis. ESGSenticNet is constructed from a neurosymbolic framework that integrates specialised concept parsing, GPT-4o inference, and semi-supervised label propagation, together with a hierarchical taxonomy. This approach culminates in a structured knowledge base of 44k knowledge triplets - ('halve carbon emission', supports, 'emissions control'), for effective sustainability analysis. Experiments indicate that ESGSenticNet, when deployed as a lexical method, more effectively captures relevant and actionable sustainability information from sustainability disclosures compared to state of the art baselines. Besides capturing a high number of unique ESG topic terms, ESGSenticNet outperforms baselines on the ESG relatedness and ESG action orientation of these terms by 26% and 31% respectively. These metrics describe the extent to which topic terms are related to ESG, and depict an action toward ESG. Moreover, when deployed as a lexical method, ESGSenticNet does not require any training, possessing a key advantage in its simplicity for non-technical stakeholders.
Review for NeurIPS paper: BoxE: A Box Embedding Model for Knowledge Base Completion
Additional Feedback: Please number ALL equations for easy reference, at least in the preliminary submission. L139 Translational bumps are certainly very expressive, but a likely first reaction is that they are too expressive. Perhaps you need a couple sentences right here on how you control their power. L153 "for the sample KG, there are 4 2 potential configurations" There are four entities and two binary relations. For each relation, each slot can be occupied by any one of four entities (assuming selectively reflexive and symmetric relations allowed).
Review for NeurIPS paper: BoxE: A Box Embedding Model for Knowledge Base Completion
The paper aims to improve knowledge base modelling. In this regards, authors propose a rather ingenious use of box embeddings as the latent representation for the relations. Specifically, each n-ary relation is represented by n boxes and each entity is represented by two vectors. Having a pair of vectors is very powerful, as they allow us to model complex interactions across entities. In particular authors show how their proposed box embeddings can simultaneously handle symmetry, asymmetry, anti-symmetry, and transitivity. No previous framework is claimed to be as flexible nor capable of handling all these patterns.