Expert Systems
Interactive Natural Language Processing
Wang, Zekun, Zhang, Ge, Yang, Kexin, Shi, Ning, Zhou, Wangchunshu, Hao, Shaochun, Xiong, Guangzheng, Li, Yizhi, Sim, Mong Yuan, Chen, Xiuying, Zhu, Qingqing, Yang, Zhenzhu, Nik, Adam, Liu, Qi, Lin, Chenghua, Wang, Shi, Liu, Ruibo, Chen, Wenhu, Xu, Ke, Liu, Dayiheng, Guo, Yike, Fu, Jie
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP.
Predicting municipalities in financial distress: a machine learning approach enhanced by domain expertise
Piermarini, Dario, Sudoso, Antonio M., Piccialli, Veronica
Financial distress of municipalities, although comparable to bankruptcy of private companies, has a far more serious impact on the well-being of communities. For this reason, it is essential to detect deficits as soon as possible. Predicting financial distress in municipalities can be a complex task, as it involves understanding a wide range of factors that can affect a municipality's financial health. In this paper, we evaluate machine learning models to predict financial distress in Italian municipalities. Accounting judiciary experts have specialized knowledge and experience in evaluating the financial performance, and they use a range of indicators to make their assessments. By incorporating these indicators in the feature extraction process, we can ensure that the model is taking into account a wide range of information that is relevant to the financial health of municipalities. The results of this study indicate that using machine learning models in combination with the knowledge of accounting judiciary experts can aid in the early detection of financial distress, leading to better outcomes for the communities.
Preconditioned Visual Language Inference with Weak Supervision
Qasemi, Ehsan, Maina-Kilaas, Amani R., Dash, Devadutta, Alsaggaf, Khalid, Chen, Muhao
Humans can infer the affordance of objects by extracting related contextual preconditions for each scenario. For example, upon seeing an image of a broken cup, we can infer that this precondition prevents the cup from being used for drinking. Reasoning with commonsense preconditions has been studied in NLP in settings where the model is explicitly given the contextual precondition. However, it is unclear whether SOTA visual language models (VLMs) can extract such preconditions and use them to infer the affordance of objects. In this work, we introduce the task of preconditioned visual language inference and rationalization (PVLIR). We propose a learning resource based on three strategies to retrieve weak supervision signals for the task and develop a human-verified test set for evaluation. Our results reveal the shortcomings of SOTA VLMs in the task and draw a road map to address the challenges ahead in improving them.
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering
Chen, Qianglong, Xu, Guohai, Yan, Ming, Zhang, Ji, Huang, Fei, Si, Luo, Zhang, Yin
Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases. However, limited by the properties of retrieved knowledge, they still have trouble benefiting from both the knowledge relevance and distinguishment simultaneously. To address the challenge, we propose CPACE, a Concept-centric Prompt-bAsed Contrastive Explanation Generation model, which aims to convert obtained symbolic knowledge into a contrastive explanation for better distinguishing the differences among given candidates. Firstly, following previous works, we retrieve different types of symbolic knowledge with a concept-centric knowledge extraction module. After that, we generate corresponding contrastive explanations using acquired symbolic knowledge and explanation prompts as guidance for better modeling the knowledge distinguishment and interpretability. Finally, we regard the generated contrastive explanation as external knowledge for downstream task enhancement. We conduct a series of experiments on three widely-used question-answering datasets: CSQA, QASC, and OBQA. Experimental results demonstrate that with the help of generated contrastive explanation, our CPACE model achieves new SOTA on CSQA (89.8% on the testing set, 0.9% higher than human performance), and gains impressive improvement on QASC and OBQA (4.2% and 3.5%, respectively).
A Survey of Explainable AI and Proposal for a Discipline of Explanation Engineering
Gomes, Clive, Natraj, Lalitha, Liu, Shijun, Datta, Anushka
After introducing the scope of this paper, we start by discussing what an "explanation" really is. We then move on to discuss some of the existing approaches to XAI and build a taxonomy of the most popular methods. Next, we also look at a few applications of these and other XAI techniques in four primary domains: finance, autonomous driving, healthcare and manufacturing. We end by introducing a promising discipline, "Explanation Engineering," which includes a systematic approach for designing explainability into AI systems.
An Ensemble Semi-Supervised Adaptive Resonance Theory Model with Explanation Capability for Pattern Classification
Pourpanah, Farhad, Lim, Chee Peng, Etemad, Ali, Wu, Q. M. Jonathan
Most semi-supervised learning (SSL) models entail complex structures and iterative training processes, and their predictions are difficult to interpret for users. To address these issues, this paper proposes a new interpretable SSL model using the supervised and unsupervised Adaptive Resonance Theory (ART) family of networks, which is denoted as SSL-ART. Firstly, SSL-ART adopts an unsupervised fuzzy ART network to create a number of prototype nodes using unlabeled samples. Then, it leverages a supervised fuzzy ARTMAP structure to map the established prototype nodes to the target classes using labeled samples. Specifically, a one-to-many (OtM) mapping scheme is devised to associate a prototype node with more than one class label. The main advantages of SSL-ART include the capability of: (i) performing online learning, (ii) reducing the number of redundant prototype nodes through the OtM mapping scheme and minimizing the effects of noisy samples, and (iii) providing an explanation facility for users to interpret the predicted outcomes. In addition, a weighted voting strategy is introduced to form an ensemble SSL-ART model, which is denoted as WESSL-ART. Every ensemble member, i.e., SSL-ART, assigns a different weight to each class based on its performance pertaining to the corresponding class. The aim is to mitigate the effects of training data sequences on all SSL-ART members and improve the overall performance of WESSL-ART. The experimental results on eighteen benchmark data sets, three artificially generated data sets, and a real-world case study indicate the benefits of the proposed SSL-ART and WESSL-ART models for tackling pattern classification problems.
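The class-wise weighted voting idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function name `weighted_vote`, the use of accuracy-like scores as class weights, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, class_weights):
    """Combine ensemble predictions using per-member, per-class weights.

    predictions:   list of class labels, one per ensemble member.
    class_weights: list of dicts; class_weights[i][c] is the (assumed)
                   weight member i assigns to class c, e.g. its accuracy
                   on class c measured on held-out labeled data.
    """
    scores = defaultdict(float)
    for member_idx, label in enumerate(predictions):
        # A member's vote counts as much as its weight for the class it predicts.
        scores[label] += class_weights[member_idx].get(label, 0.0)
    # Assumed tie-break: the alphabetically first label among the top scorers.
    return max(sorted(scores), key=lambda c: scores[c])


# Hypothetical three-member ensemble: member 0 is reliable on class "A",
# members 1 and 2 are weak on class "B".
predictions = ["A", "B", "B"]
class_weights = [
    {"A": 0.9, "B": 0.2},
    {"A": 0.5, "B": 0.3},
    {"A": 0.5, "B": 0.4},
]
winner = weighted_vote(predictions, class_weights)
```

In this toy case a plain majority vote would pick "B", whereas the weighted vote favors "A" because the single member predicting "A" carries more weight for that class than the two "B" voters combined.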
Neurosymbolic AI and its Taxonomy: a survey
Gibaut, Wandemberg, Pereira, Leonardo, Grassiotto, Fabio, Osorio, Alexandre, Gadioli, Eder, Munoz, Amparo, Gomes, Sildolfo, Santos, Claudio dos
As Artificial Intelligence, and Deep Learning in particular, achieves impressive results, it also gains unprecedented popularity, not only in academia and industry but also in popular culture and society in general. This increasingly ubiquitous AI presence has raised several concerns about its impact on humanity and the planet, with well-known scientists such as Stephen Hawking having voiced concerns about AI's accountability [1]. Despite outstanding results in Computer Vision, Natural Language Processing and Game Playing [2, 3], tasks in which AI formerly performed poorly compared to humans, those concerns have triggered debates among research communities, including those discussed by Gary Marcus [4] and the AAAI-2020 debate with Geoffrey Hinton, Yoshua Bengio and Yann LeCun [5].
Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog
Wan, Fanqi, Shen, Weizhou, Yang, Ke, Quan, Xiaojun, Bi, Wei
Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems to generate informative responses. Most existing systems blend knowledge retrieval with response generation and optimize them with direct supervision from reference responses, leading to suboptimal retrieval performance when the knowledge base becomes large-scale. To address this, we propose to decouple knowledge retrieval from response generation and introduce a multi-grained knowledge retriever (MAKER) that includes an entity selector to search for relevant entities and an attribute selector to filter out irrelevant attributes. To train the retriever, we propose a novel distillation objective that derives supervision signals from the response generator. Experiments conducted on three standard benchmarks with both small and large-scale knowledge bases demonstrate that our retriever performs knowledge retrieval more effectively than existing methods. Our code has been made publicly available at https://github.com/18907305772/MAKER.
A Genetic Fuzzy System for Interpretable and Parsimonious Reinforcement Learning Policies
Bishop, Jordan T., Gallagher, Marcus, Browne, Will N.
Reinforcement learning (RL) is experiencing a resurgence in research interest, where Learning Classifier Systems (LCSs) have been applied for many years. However, traditional Michigan approaches tend to evolve large rule bases that are difficult to interpret or scale to domains beyond standard mazes. A Pittsburgh Genetic Fuzzy System (dubbed Fuzzy MoCoCo) is proposed that utilises both multiobjective and cooperative coevolutionary mechanisms to evolve fuzzy rule-based policies for RL environments. Multiobjectivity in the system is concerned with policy performance vs. complexity. The continuous state RL environment Mountain Car is used as a testing bed for the proposed system. Results show the system is able to effectively explore the trade-off between policy performance and complexity, and learn interpretable, high-performing policies that use as few rules as possible.
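The performance vs. complexity trade-off mentioned in the abstract above is a two-objective problem, and the usual way to summarize such a trade-off is the set of non-dominated (Pareto-optimal) solutions. The sketch below is a generic Pareto filter, not the Fuzzy MoCoCo algorithm: the policy names, returns, and rule counts are invented for illustration.

```python
def pareto_front(policies):
    """Return the non-dominated policies.

    policies: list of (name, performance, n_rules) tuples, where higher
              performance is better and fewer rules is better.
    """
    front = []
    for p in policies:
        dominated = any(
            # q is at least as good as p on both objectives...
            (q[1] >= p[1] and q[2] <= p[2])
            # ...and strictly better on at least one of them.
            and (q[1] > p[1] or q[2] < p[2])
            for q in policies
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidate policies: (name, mean episode return, rule count).
candidates = [
    ("a", -120.0, 3),
    ("b", -110.0, 5),
    ("c", -125.0, 4),  # worse return and more rules than "a"
    ("d", -110.0, 7),  # same return as "b" but more rules
]
front = pareto_front(candidates)
```

Here "a" and "b" remain on the front: "b" achieves a better return at the cost of more rules, so neither dominates the other.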
A Comparative Study of Methods for Estimating Conditional Shapley Values and When to Use Them
Olsen, Lars Henry Berge, Glad, Ingrid Kristine, Jullum, Martin, Aas, Kjersti
Shapley values originated in cooperative game theory but are extensively used today as a model-agnostic explanation framework to explain predictions made by complex machine learning models in industry and academia. There are several algorithmic approaches for computing different versions of Shapley value explanations. Here, we focus on conditional Shapley values for predictive models fitted to tabular data. Estimating precise conditional Shapley values is difficult as they require the estimation of non-trivial conditional expectations. In this article, we develop new methods, extend earlier proposed approaches, and systematize the new refined and existing methods into different method classes for comparison and evaluation. The method classes use either Monte Carlo integration or regression to model the conditional expectations. We conduct extensive simulation studies to evaluate how precisely the different method classes estimate the conditional expectations, and thereby the conditional Shapley values, for different setups. We also apply the methods to several real-world data experiments and provide recommendations for when to use the different method classes and approaches. Roughly speaking, we recommend using parametric methods when we can specify the data distribution almost correctly, as they generally produce the most accurate Shapley value explanations. When the distribution is unknown, both generative methods and regression models with a form similar to the underlying predictive model are good and stable options. Regression-based methods are often slow to train but produce the Shapley value explanations quickly once trained. The reverse is true for Monte Carlo-based methods, making the different methods appropriate in different practical situations.
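To make the Monte Carlo route concrete, the sketch below estimates conditional Shapley values for a small model when the features are assumed to be jointly Gaussian with known mean and covariance, which corresponds to the parametric setting mentioned above. It is a minimal illustration, not one of the paper's methods as implemented by the authors: the function names, the sample size, and the toy linear model are assumptions. The contribution function is v(S) = E[f(X) | X_S = x*_S], estimated by sampling the unobserved features from their conditional Gaussian distribution, and the Shapley values are then computed exactly by enumerating all coalitions (feasible only for a handful of features).

```python
import itertools
import math

import numpy as np


def conditional_samples(x_star, S, mean, cov, n, rng):
    """Draw n samples of X given X_S = x_star[S], assuming X ~ N(mean, cov)."""
    d = len(mean)
    S = list(S)
    S_bar = [j for j in range(d) if j not in S]
    samples = np.tile(x_star, (n, 1)).astype(float)
    if not S_bar:  # every feature is conditioned on
        return samples
    if not S:  # nothing is conditioned on: sample from the marginal
        samples[:, S_bar] = rng.multivariate_normal(mean, cov, size=n)
        return samples
    cov_SS = cov[np.ix_(S, S)]
    cov_bS = cov[np.ix_(S_bar, S)]
    cov_bb = cov[np.ix_(S_bar, S_bar)]
    # Standard conditional Gaussian mean and covariance.
    cond_mean = mean[S_bar] + cov_bS @ np.linalg.solve(cov_SS, x_star[S] - mean[S])
    cond_cov = cov_bb - cov_bS @ np.linalg.solve(cov_SS, cov_bS.T)
    cond_cov = (cond_cov + cond_cov.T) / 2  # remove numerical asymmetry
    samples[:, S_bar] = rng.multivariate_normal(cond_mean, cond_cov, size=n)
    return samples


def conditional_shapley(f, x_star, mean, cov, n_samples=2000, seed=0):
    """Return (phi, v): Shapley values and the estimated contribution function."""
    rng = np.random.default_rng(seed)
    d = len(x_star)
    # Estimate v(S) once per coalition so every term reuses the same value.
    v = {}
    for r in range(d + 1):
        for S in itertools.combinations(range(d), r):
            X = conditional_samples(x_star, S, mean, cov, n_samples, rng)
            v[S] = float(np.mean(f(X)))
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                S_with_j = tuple(sorted(S + (j,)))
                phi[j] += weight * (v[S_with_j] - v[S])
    return phi, v


# Hypothetical setup: three features, the first two correlated, linear model.
mean = np.array([0.0, 0.0, 0.0])
cov = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]])
beta = np.array([1.0, -2.0, 0.5])
x_star = np.array([1.0, 0.5, -1.0])
phi, v = conditional_shapley(lambda X: X @ beta, x_star, mean, cov)
```

Because each v(S) is estimated once and reused, the returned values satisfy the efficiency property by construction: they sum to v(all features) minus v(empty set), i.e. the prediction at x* minus the estimated mean prediction. The individual values still carry Monte Carlo error that shrinks as `n_samples` grows.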