Rule-Based Reasoning
The Power of Nudging: Exploring Three Interventions for Metacognitive Skills Instruction across Intelligent Tutoring Systems
Abdelshiheed, Mark, Hostetter, John Wesley, Shabrina, Preya, Barnes, Tiffany, Chi, Min
Deductive domains are typical of many cognitive skills in that no single problem-solving strategy is always optimal for solving all problems. It was shown that students who know how and when to use each strategy (StrTime) outperformed those who know neither and stick to the default strategy (Default). In this work, students were trained on a logic tutor that supports a default forward-chaining and a backward-chaining (BC) strategy, then a probability tutor that only supports BC. We investigated three types of interventions on teaching the Default students how and when to use which strategy on the logic tutor: Example, Nudge and Presented. Meanwhile, StrTime students received no interventions. Overall, our results show that Nudge outperformed their Default peers and caught up with StrTime on both tutors.
Mixing Backward- with Forward-Chaining for Metacognitive Skill Acquisition and Transfer
Abdelshiheed, Mark, Hostetter, John Wesley, Yang, Xi, Barnes, Tiffany, Chi, Min
Metacognitive skills have been commonly associated with preparation for future learning in deductive domains. Many researchers have regarded strategy- and time-awareness as two metacognitive skills that address how and when to use a problem-solving strategy, respectively. It was shown that students who are both strategy-and time-aware (StrTime) outperformed their nonStrTime peers across deductive domains. In this work, students were trained on a logic tutor that supports a default forward-chaining (FC) and a backward-chaining (BC) strategy. We investigated the impact of mixing BC with FC on teaching strategy- and time-awareness for nonStrTime students. During the logic instruction, the experimental students (Exp) were provided with two BC worked examples and some problems in BC to practice how and when to use BC. Meanwhile, their control (Ctrl) and StrTime peers received no such intervention. Six weeks later, all students went through a probability tutor that only supports BC to evaluate whether the acquired metacognitive skills are transferred from logic. Our results show that on both tutors, Exp outperformed Ctrl and caught up with StrTime.
Who are you referring to? Coreference resolution in image narrations
Goel, Arushi, Fernando, Basura, Keller, Frank, Bilen, Hakan
Coreference resolution aims to identify words and phrases which refer to same entity in a text, a core task in natural language processing. In this paper, we extend this task to resolving coreferences in long-form narrations of visual scenes. First we introduce a new dataset with annotated coreference chains and their bounding boxes, as most existing image-text datasets only contain short sentences without coreferring expressions or labeled chains. We propose a new technique that learns to identify coreference chains using weak supervision, only from image-text pairs and a regularization using prior linguistic knowledge. Our model yields large performance gains over several strong baselines in resolving coreferences. We also show that coreference resolution helps improving grounding narratives in images.
Hybrid Classic-Quantum Computing for Staging of Invasive Ductal Carcinoma of Breast
Moret-Bonillo, Vicente, Mosqueira-Rey, Eduardo, Magaz-Romero, Samuel, Alvarez-Estevez, Diego
Despite the great current relevance of Artificial Intelligence, and the extraordinary innovations that this discipline has brought to many fields -among which, without a doubt, medicine is found-, experts in medical applications of Artificial Intelligence are looking for new alternatives to solve problems for which current Artificial Intelligence programs do not provide with optimal solutions. For this, one promising option could be the use of the concepts and ideas of Quantum Mechanics, for the construction of quantum-based Artificial Intelligence systems. From a hybrid classical-quantum perspective, this article deals with the application of quantum computing techniques for the staging of Invasive Ductal Carcinoma of the breast. It includes: (1) a general explanation of a classical, and well-established, approach for medical reasoning, (2) a description of the clinical problem, (3) a conceptual model for staging invasive ductal carcinoma, (4) some basic notions about Quantum Rule-Based Systems, (5) a step-by-step explanation of the proposed approach for quantum staging of the invasive ductal carcinoma, and (6) the results obtained after running the quantum system on a significant number of use cases. A detailed discussion is also provided at the end of this paper.
Taking advantage of a very simple property to efficiently infer NFAs
Jastrzab, Tomasz, Lardeux, Frédéric, Monfroy, Eric
Grammatical inference consists in learning a formal grammar as a finite state machine or as a set of rewrite rules. In this paper, we are concerned with inferring Nondeterministic Finite Automata (NFA) that must accept some words, and reject some other words from a given sample. This problem can naturally be modeled in SAT. The standard model being enormous, some models based on prefixes, suffixes, and hybrids were designed to generate smaller SAT instances. There is a very simple and obvious property that says: if there is an NFA of size k for a given sample, there is also an NFA of size k+1. We first strengthen this property by adding some characteristics to the NFA of size k+1. Hence, we can use this property to tighten the bounds of the size of the minimal NFA for a given sample. We then propose simplified and refined models for NFA of size k+1 that are smaller than the initial models for NFA of size k. We also propose a reduction algorithm to build an NFA of size k from a specific NFA of size k+1. Finally, we validate our proposition with some experimentation that shows the efficiency of our approach.
Understanding Post-hoc Explainers: The Case of Anchors
Lopardo, Gianluigi, Precioso, Frederic, Garreau, Damien
In many scenarios, the interpretability of machine learning models is a highly required but difficult task. To explain the individual predictions of such models, local model-agnostic approaches have been proposed. However, the process generating the explanations can be, for a user, as mysterious as the prediction to be explained. Furthermore, interpretability methods frequently lack theoretical guarantees, and their behavior on simple models is frequently unknown. While it is difficult, if not impossible, to ensure that an explainer behaves as expected on a cutting-edge model, we can at least ensure that everything works on simple, already interpretable models. In this paper, we present a theoretical analysis of Anchors (Ribeiro et al., 2018): a popular rule-based interpretability method that highlights a small set of words to explain a text classifier's decision. After formalizing its algorithm and providing useful insights, we demonstrate mathematically that Anchors produces meaningful results when used with linear text classifiers on top of a TF-IDF vectorization. We believe that our analysis framework can aid in the development of new explainability methods based on solid theoretical foundations.
How AI Is Helping Companies Redesign Processes
In the 1990s, business process reengineering was all the rage: Companies used budding technologies such as enterprise resource planning (ERP) systems and the internet to enact radical changes to broad, end-to-end business processes. Buoyed by reengineering's academic and consulting proponents, companies anticipated transformative changes to broad processes like order-to-cash and conception to commercialization of new products. But while technology did bring major updates, implementations often failed to live up to the sky-high expectations. For example, large-scale ERP systems like SAP or Oracle provided a useful IT backbone to exchange data, yet also created very rigid processes that were hard to change past the IT implementation. Since then, process management typically involved only incremental change to local processes -- Lean and Six Sigma for repetitive processes, and Agile Lean Startup methods for development -- all without any assistance from technology.
A Rule Based Theorem Prover: an Introduction to Proofs in Secondary Schools
Teles, Joana, Santos, Vanda, Quaresma, Pedro
The introduction of automated deduction systems in secondary schools faces several bottlenecks. Beyond the problems related with the curricula and the teachers, the dissonance between the outcomes of the geometry automated theorem provers and the normal practice of conjecturing and proving in schools is a major barrier to a wider use of such tools in an educational environment. Since the early implementations of geometry automated theorem provers, applications of artificial intelligence methods, synthetic provers based on inference rules and using forward chaining reasoning are considered to be best suited for education proposes. Choosing an appropriate set of rules and an automated method that can use those rules is a major challenge. We discuss one such rule set and its implementation using the geometry deductive databases method (GDDM). The approach is tested using some chosen geometric conjectures that could be the goal of a 7th year class ( 12-year-old students). A lesson plan is presented, its goal is the introduction of formal demonstration of proving geometric theorems, trying to motivate students to that goal.
Optimizing CAD Models with Latent Space Manipulation
Elstner, Jannes, Schönhof, Raoul G. C., Tauber, Steffen, Huber, Marco F
When it comes to the optimization of CAD models in the automation domain, neural networks currently play only a minor role. Optimizing abstract features such as automation capability is challenging, since they can be very difficult to simulate, are too complex for rule-based systems, and also have little to no data available for machine-learning methods. On the other hand, image manipulation methods that can manipulate abstract features in images such as StyleCLIP have seen much success. They rely on the latent space of pretrained generative adversarial networks, and could therefore also make use of the vast amount of unlabeled CAD data. In this paper, we show that such an approach is also suitable for optimizing abstract automation-related features of CAD parts. We achieved this by extending StyleCLIP to work with CAD models in the form of voxel models, which includes using a 3D StyleGAN and a custom classifier. Finally, we demonstrate the ability of our system for the optimiziation of automation-related features by optimizing the grabability of various CAD models. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Peer review under the responsibility of the scientific committee of the 33rd CIRP Design Conference.
Learning Hybrid Interpretable Models: Theory, Taxonomy, and Methods
Ferry, Julien, Laberge, Gabriel, Aïvodji, Ulrich
A hybrid model involves the cooperation of an interpretable model and a complex black box. At inference, any input of the hybrid model is assigned to either its interpretable or complex component based on a gating mechanism. The advantages of such models over classical ones are two-fold: 1) They grant users precise control over the level of transparency of the system and 2) They can potentially perform better than a standalone black box since redirecting some of the inputs to an interpretable model implicitly acts as regularization. Still, despite their high potential, hybrid models remain under-studied in the interpretability/explainability literature. In this paper, we remedy this fact by presenting a thorough investigation of such models from three perspectives: Theory, Taxonomy, and Methods. First, we explore the theory behind the generalization of hybrid models from the Probably-Approximately-Correct (PAC) perspective. A consequence of our PAC guarantee is the existence of a sweet spot for the optimal transparency of the system. When such a sweet spot is attained, a hybrid model can potentially perform better than a standalone black box. Secondly, we provide a general taxonomy for the different ways of training hybrid models: the Post-Black-Box and Pre-Black-Box paradigms. These approaches differ in the order in which the interpretable and complex components are trained. We show where the state-of-the-art hybrid models Hybrid-Rule-Set and Companion-Rule-List fall in this taxonomy. Thirdly, we implement the two paradigms in a single method: HybridCORELS, which extends the CORELS algorithm to hybrid modeling. By leveraging CORELS, HybridCORELS provides a certificate of optimality of its interpretable component and precise control over transparency. We finally show empirically that HybridCORELS is competitive with existing hybrid models, and performs just as well as a standalone black box (or even better) while being partly transparent.