Giles, C. Lee
Predicting the Reproducibility of Social and Behavioral Science Papers Using Supervised Learning Models
Wu, Jian, Nivargi, Rajal, Lanka, Sree Sai Teja, Menon, Arjun Manoj, Modukuri, Sai Ajay, Nakshatri, Nishanth, Wei, Xin, Wang, Zhuoer, Caverlee, James, Rajtmajer, Sarah M., Giles, C. Lee
In recent years, significant effort has been invested in verifying the reproducibility and robustness of research claims in the social and behavioral sciences (SBS), much of it through resource-intensive replication projects. In this paper, we investigate predicting the reproducibility of SBS papers using machine learning methods based on a set of features. We propose a framework that extracts five types of features from scholarly work to support assessments of the reproducibility of published research claims. Bibliometric features, venue features, and author features are collected from public APIs or extracted using open-source machine learning libraries with customized parsers. Statistical features, such as p-values, are extracted by recognizing patterns in the body text. Semantic features, such as funding information, are obtained from public APIs or extracted using natural language processing models. We analyze pairwise correlations between individual features and their importance for predicting a set of human-assessed ground-truth labels. In doing so, we identify a subset of 9 top features that play relatively more important roles in predicting the reproducibility of SBS papers in our corpus. Results are verified by comparing the performance of 10 supervised classifiers trained on different sets of features.
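As a loose illustration of the pipeline described above, the sketch below extracts one statistical feature (p-values matched by a regular expression) and trains a supervised classifier on a toy feature matrix with scikit-learn. The feature names, data, and classifier choice are placeholders, not the paper's exact setup.

```python
import re
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical illustration: extract statistical features (p-values) from
# body text by pattern matching, then train a supervised classifier on a
# feature matrix. Feature names and data here are invented placeholders.
P_VALUE_RE = re.compile(r"p\s*[<=>]\s*(0?\.\d+)", re.IGNORECASE)

def statistical_features(body_text: str) -> dict:
    """Recognize p-value patterns in the text and summarize them."""
    p_values = [float(m) for m in P_VALUE_RE.findall(body_text)]
    return {
        "num_p_values": len(p_values),
        "min_p_value": min(p_values) if p_values else 1.0,
        "frac_below_0.05": (np.mean([p < 0.05 for p in p_values])
                            if p_values else 0.0),
    }

# Toy feature matrix: rows are papers, columns are extracted features
# (bibliometric, venue, author, statistical, semantic); labels stand in
# for human-assessed reproducibility outcomes.
X = np.random.rand(100, 9)          # e.g., the 9 top features
y = np.random.randint(0, 2, 100)    # 1 = reproducible, 0 = not

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```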
Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively
Ororbia, Alexander, Mali, Ankur, Kifer, Daniel, Giles, C. Lee
In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered; this phenomenon is known as catastrophic forgetting. In this paper, we present a new connectionist model, the Sequential Neural Coding Network, and its learning procedure, grounded in the neurocognitive theory of predictive coding. The architecture experiences significantly less forgetting compared to standard neural models and outperforms a variety of previously proposed remedies when trained across multiple task datasets in a stream-like fashion. The promising performance demonstrated in our experiments suggests that directly incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, can create viable pathways for tackling the challenge of lifelong machine learning.
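The following is a minimal numpy sketch of the predictive-coding ingredients the abstract highlights: iterative input processing, competition via sparse (top-k) activations, and a purely local weight update. It is an illustrative toy, not the Sequential Neural Coding Network itself.

```python
import numpy as np

# Minimal predictive-coding sketch: a latent state z predicts the input x
# through weights W; the prediction error drives both an iterative state
# update and a purely local weight update, so no global backpropagated
# gradient is needed. Dimensions and rates are arbitrary choices.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(10, 20))   # latent dim 10 -> input dim 20
x = rng.normal(size=20)                    # one input pattern
z = np.zeros(10)                           # latent state

eta_state, eta_weight, k_sparse = 0.1, 0.01, 3
for _ in range(30):                        # iterative settling
    error = x - z @ W                      # prediction error at the input
    z += eta_state * (error @ W.T)         # move state to reduce error
    # Competition via sparse activation: keep only the k largest units,
    # which the abstract argues helps reduce cross-task interference.
    mask = np.zeros_like(z)
    mask[np.argsort(np.abs(z))[-k_sparse:]] = 1.0
    z *= mask

W += eta_weight * np.outer(z, x - z @ W)   # local, Hebbian-like update
print(float(np.mean((x - z @ W) ** 2)))    # reconstruction error
```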
Verification of Recurrent Neural Networks Through Rule Extraction
Wang, Qinglong, Zhang, Kaixuan, Liu, Xue, Giles, C. Lee
The verification problem for neural networks is that of determining whether a network will suffer from adversarial samples, or of approximating the maximal scale of adversarial perturbation that can be endured. While most prior work addresses the verification of feed-forward networks, little has been explored for recurrent networks. This is due to the more rigorous constraints on the perturbation space for sequential data and the lack of a proper metric for measuring perturbations. In this work, we address these challenges by proposing a metric that measures the distance between strings, and by using deterministic finite automata (DFA) to represent a rigorous oracle that examines whether generated adversarial samples violate certain constraints on a perturbation. More specifically, we empirically show that certain recurrent networks allow relatively stable DFA extraction; DFAs extracted from these networks can thus serve as surrogate oracles when the ground-truth DFA is unknown. We apply our verification mechanism to several widely used recurrent networks on a set of the Tomita grammars. The results demonstrate that only a few models remain robust against adversarial samples. In addition, we show that grammars of different complexity also differ in how difficult they are to learn robustly.
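A small sketch of the two components named above: an edit distance as the metric between strings, and a DFA acting as the oracle that labels perturbed strings. The hand-written DFA below accepts Tomita grammar 1 (strings over {0,1} containing only 1s); in the paper's setting such a DFA would be extracted from a trained recurrent network rather than written by hand.

```python
# Hedged sketch: string edit distance as the perturbation metric, plus a
# DFA oracle that checks whether a perturbed string changes membership.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimal insertions/deletions/substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[-1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

# DFA for Tomita 1: state 0 = accepting (only 1s seen so far), 1 = reject.
TRANSITIONS = {(0, "1"): 0, (0, "0"): 1, (1, "0"): 1, (1, "1"): 1}

def dfa_accepts(s: str) -> bool:
    state = 0
    for ch in s:
        state = TRANSITIONS[(state, ch)]
    return state == 0

# An adversarial candidate for "1111": one substitution flips the label.
original, perturbed = "1111", "1011"
print(edit_distance(original, perturbed))            # perturbation size: 1
print(dfa_accepts(original), dfa_accepts(perturbed)) # True, False
```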
Conducting Credit Assignment by Aligning Local Representations
Ororbia, Alexander G., Mali, Ankur, Kifer, Daniel, Giles, C. Lee
The use of back-propagation and its variants to train deep networks is often problematic for new users: exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies can make networks difficult to train. In this paper, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start from a null initialization of the network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum, and others that are more biologically plausible. Experiments on MNIST and Fashion MNIST validate the performance of the algorithm and show that LRA can train networks robustly and effectively, succeeding even when back-propagation fails and outperforming alternative learning algorithms such as target propagation and feedback alignment.
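The sketch below illustrates the general flavor of such local learning rules: an output discrepancy projected through fixed random feedback weights yields a local target for the hidden layer, and each layer updates from locally available quantities only, even from a null weight initialization. It is a simplified stand-in, not the exact LRA procedure.

```python
import numpy as np

# Simplified sketch in the spirit of local learning rules (not the exact
# LRA algorithm): the output discrepancy is projected through a fixed
# random feedback matrix E to form a local target for the hidden layer,
# so no global backpropagated gradient is required.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))
y = np.tanh(x @ rng.normal(size=(8, 4)))  # toy regression targets

W1 = np.zeros((8, 16))                    # null initialization: local
W2 = np.zeros((16, 4))                    # targets still break symmetry
E = rng.normal(scale=0.1, size=(4, 16))   # fixed random feedback weights

lr = 0.1
for _ in range(500):
    h = np.tanh(x @ W1)                   # hidden representation
    out = h @ W2
    err = y - out                         # output discrepancy
    h_target = h + err @ E                # local target for hidden layer
    # Each update uses only locally available quantities.
    W2 += lr * h.T @ err / len(x)
    W1 += lr * x.T @ ((h_target - h) * (1 - h ** 2)) / len(x)

print(float(np.mean((y - np.tanh(x @ W1) @ W2) ** 2)))
```

Note that with `W1 = 0` the hidden layer is initially silent, yet the random feedback projection still produces nonzero local targets, so training gets off the ground, mirroring the null-initialization claim above.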
Investigating Active Learning for Concept Prerequisite Learning
Liang, Chen (Pennsylvania State University) | Ye, Jianbo (Pennsylvania State University) | Wang, Shuting (Pennsylvania State University) | Pursel, Bart (Pennsylvania State University) | Giles, C. Lee (Pennsylvania State University)
Concept prerequisite learning focuses on machine learning methods for measuring the prerequisite relation among concepts. Given the importance of prerequisites in education, it has recently become a promising research direction. A major obstacle to extracting prerequisites at scale is the lack of large-scale labels that would enable effective data-driven solutions. We investigate the applicability of active learning to concept prerequisite learning. We propose a novel set of features tailored for prerequisite classification and compare the effectiveness of four widely used query strategies. Experimental results for domains including data mining, geometry, physics, and precalculus show that active learning can be used to reduce the amount of training data required. Given the proposed features, the query-by-committee strategy outperforms the other query strategies compared.
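A toy sketch of the query-by-committee strategy on synthetic data: a small committee is trained on the current label set, and the unlabeled example on which the committee disagrees most (highest vote entropy) is queried next. The features, committee members, and data are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative query-by-committee sketch: fit a committee on the labeled
# seed set, then pick the pool item with maximal vote entropy to label.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 10))       # candidate concept-pair features
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)
labeled = list(range(20))                 # small seed label set

committee = [LogisticRegression(max_iter=1000),
             RandomForestClassifier(n_estimators=50, random_state=0),
             DecisionTreeClassifier(random_state=0)]

for clf in committee:
    clf.fit(X_pool[labeled], y_pool[labeled])

votes = np.stack([clf.predict(X_pool) for clf in committee])  # (3, 500)
p1 = votes.mean(axis=0)                   # fraction of committee voting 1
eps = 1e-12
vote_entropy = -(p1 * np.log(p1 + eps) + (1 - p1) * np.log(1 - p1 + eps))
vote_entropy[labeled] = -np.inf           # never re-query labeled pairs

query = int(np.argmax(vote_entropy))
print("next pair to label:", query)
```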
Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations
Liang, Chen, Ye, Jianbo, Zhao, Han, Pursel, Bart, Giles, C. Lee
Strict partial order is a mathematical structure commonly seen in relational data. One obstacle to extracting such relations at scale is the lack of large-scale labels for building effective data-driven solutions. We develop an active learning framework for mining relations subject to a strict order. Our approach incorporates relational reasoning not only to find new unlabeled pairs whose labels can be deduced from the existing label set, but also to devise new query strategies that consider the relational structure of the labels. Our experiments on concept prerequisite relations show that the proposed framework can substantially improve classification performance over baseline approaches under the same query budget.
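The deduction step can be illustrated concretely: given a few labeled positive pairs, the strict-order axioms (transitivity and asymmetry) imply labels for further pairs at no query cost. The concept names below are invented examples.

```python
from itertools import product

# Sketch of the relational-reasoning step: if (a, b) and (b, c) are
# labeled positive, transitivity forces (a, c) positive; asymmetry forces
# (b, a) negative whenever (a, b) is positive.
positive = {("limits", "derivatives"), ("derivatives", "integrals")}
negative = set()

changed = True
while changed:
    changed = False
    for (a, b), (c, d) in product(list(positive), repeat=2):
        if b == c and a != d and (a, d) not in positive:
            positive.add((a, d))          # transitive closure
            changed = True
    for (a, b) in list(positive):
        if (b, a) not in negative:
            negative.add((b, a))          # asymmetry
            changed = True

print(sorted(positive))                   # includes (limits, integrals)
print(sorted(negative))
```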
Learning to Adapt by Minimizing Discrepancy
Ororbia, Alexander G. II, Haffner, Patrick, Reitter, David, Giles, C. Lee
We explore whether useful temporal neural generative models can be learned from sequential data without back-propagation through time, investigating the viability of a more neurocognitively grounded approach in the context of unsupervised generative modeling of sequences. Specifically, we build on the concept of predictive coding, which has gained influence in cognitive science, within a neural framework. To do so, we develop a novel architecture, the Temporal Neural Coding Network, and its learning algorithm, Discrepancy Reduction. The underlying directed generative model is fully recurrent, meaning that it employs structural feedback connections and temporal feedback connections, yielding information propagation cycles that create local learning signals. This facilitates a unified bottom-up and top-down approach to information transfer inside the architecture. Our proposed algorithm shows promise on the bouncing-balls generative modeling problem. Further experiments could be conducted to explore the strengths and weaknesses of our approach.
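As a rough toy of the structural-plus-temporal feedback idea (not the exact Temporal Neural Coding Network), the sketch below settles a latent state at each time step against two local discrepancies, one with the current observation and one with the temporal prediction from the previous state, and then applies local weight updates, avoiding back-propagation through time.

```python
import numpy as np

# Toy sketch: the latent state z_t is settled so that it reconstructs the
# current frame x_t (structural, top-down) while staying predictable from
# z_{t-1} through recurrent weights V (temporal). All updates are local.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 16))   # latent -> observation
V = rng.normal(scale=0.1, size=(8, 8))    # latent -> next latent
seq = rng.normal(size=(20, 16))           # a toy observation sequence

z_prev = np.zeros(8)
lr_state, lr_w = 0.1, 0.01
for x_t in seq:
    z = z_prev @ V                        # temporal prediction as prior
    for _ in range(15):                   # local settling, no BPTT
        e_obs = x_t - z @ W               # bottom-layer discrepancy
        e_tmp = z - z_prev @ V            # temporal discrepancy
        z += lr_state * (e_obs @ W.T - e_tmp)
    # Local weight updates from the settled discrepancies only.
    W += lr_w * np.outer(z, x_t - z @ W)
    V += lr_w * np.outer(z_prev, z - z_prev @ V)
    z_prev = z

print(float(np.mean((seq[-1] - z @ W) ** 2)))
```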
A Machine Learning Approach for Semantic Structuring of Scientific Charts in Scholarly Documents
Al-Zaidy, Rabah A. (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)
Large scholarly repositories are designed to provide scientists and researchers with a wealth of information retrieved from data present in a variety of formats. A typical scholarly document contains information in a combined layout of text and graphic images. Common types of graphics found in these documents are scientific charts used to represent data values in a visual format. Experimental results are rarely described without the aid of one form of chart or another, whether a 2D plot, bar chart, pie chart, etc. The metadata of these graphics is usually the only content made available for search by user queries. By processing the image content and extracting the data represented in the graphics, search engines will be able to handle more specific queries related to the data itself. In this paper, we describe a machine learning based system that extracts and recognizes the various data fields present in a bar chart for semantic labeling. Our approach comprises a graphics and text separation and extraction phase, followed by component role classification for both text and graphic components, which are in turn used for semantic analysis and representation of the chart. The proposed system is tested on a set of over 200 bar charts extracted from over 1,000 scientific articles in PDF format.
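The component-role classification step might look roughly like the sketch below: each extracted text box is described by simple geometric features and assigned a semantic role by a supervised classifier. The feature set, role inventory, and training labels are placeholders, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy sketch of component-role classification: after graphics/text
# separation, each text box is described by geometric features and
# classified into a semantic role. Roles and labels are illustrative.
ROLES = ["title", "x-axis-label", "y-axis-label", "tick-label", "legend"]

# Features per text box: (x-center, y-center, width, height, rotation),
# all normalized to the chart bounding box.
rng = np.random.default_rng(0)
X_train = rng.random(size=(300, 5))
y_train = rng.integers(0, len(ROLES), size=300)   # placeholder labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Classify one box: wide, horizontal, centered near the top of the chart.
box = np.array([[0.5, 0.95, 0.6, 0.05, 0.0]])
print(ROLES[int(clf.predict(box)[0])])
```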
Recovering Concept Prerequisite Relations from University Course Dependencies
Liang, Chen (Pennsylvania State University) | Ye, Jianbo (Pennsylvania State University) | Wu, Zhaohui (Microsoft Corporation) | Pursel, Bart (Pennsylvania State University) | Giles, C. Lee (Pennsylvania State University)
Prerequisite relations among concepts play an important role in many educational applications, such as intelligent tutoring systems and curriculum planning. With the increasing amount of educational data available, automatic discovery of concept prerequisite relations has become both an emerging research opportunity and an open challenge. Here, we investigate how to recover concept prerequisite relations from course dependencies and propose an optimization-based framework to address the problem. We create the first real dataset for empirically studying this problem, which consists of the listings of computer science courses from 11 U.S. universities and their concept pairs with prerequisite labels. Experimental results on a synthetic dataset and the real course dataset both show that our method outperforms existing baselines.
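For intuition about the problem setup (the paper itself solves an optimization problem rather than the simple count below), a course-level dependency A -> B can be treated as voting for concept pairs (a, b) with a taught in A and b taught in B. Course and concept names here are invented examples.

```python
from collections import Counter
from itertools import product

# Counting heuristic for illustration only: every course dependency
# pre -> post votes for concept pairs drawn from the two courses.
course_concepts = {
    "CS1": {"variables", "loops"},
    "CS2": {"recursion", "linked lists"},
    "Algorithms": {"recursion", "graph search"},
}
course_prereqs = [("CS1", "CS2"), ("CS2", "Algorithms")]

votes = Counter()
for pre, post in course_prereqs:
    for a, b in product(course_concepts[pre], course_concepts[post]):
        if a != b:
            votes[(a, b)] += 1

# Concept pairs ranked by how many course dependencies support them.
for pair, count in votes.most_common(5):
    print(pair, count)
```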
Reports of the 2016 AAAI Workshop Program
Albrecht, Stefano (The University of Texas at Austin) | Bouchard, Bruno (Université du Québec à Chicoutimi) | Brownstein, John S. (Harvard University) | Buckeridge, David L. (McGill University) | Caragea, Cornelia (University of North Texas) | Carter, Kevin M. (MIT Lincoln Laboratory) | Darwiche, Adnan (University of California, Los Angeles) | Fortuna, Blaz (Bloomberg L.P. and Jozef Stefan Institute) | Francillette, Yannick (Université du Québec à Chicoutimi) | Gaboury, Sébastien (Université du Québec à Chicoutimi) | Giles, C. Lee (Pennsylvania State University) | Grobelnik, Marko (Jozef Stefan Institute) | Hruschka, Estevam R. (Federal University of São Carlos) | Kephart, Jeffrey O. (IBM Thomas J. Watson Research Center) | Kordjamshidi, Parisa (University of Illinois at Urbana-Champaign) | Lisy, Viliam (University of Alberta) | Magazzeni, Daniele (King's College London) | Marques-Silva, Joao (University of Lisbon) | Marquis, Pierre (Université d'Artois) | Martinez, David (MIT Lincoln Laboratory) | Michalowski, Martin (Adventium Labs) | Shaban-Nejad, Arash (University of California, Berkeley) | Noorian, Zeinab (Ryerson University) | Pontelli, Enrico (New Mexico State University) | Rogers, Alex (University of Oxford) | Rosenthal, Stephanie (Carnegie Mellon University) | Roth, Dan (University of Illinois at Urbana-Champaign) | Sinha, Arunesh (University of Southern California) | Streilein, William (MIT Lincoln Laboratory) | Thiebaux, Sylvie (The Australian National University) | Tran, Son Cao (New Mexico State University) | Wallace, Byron C. (University of Texas at Austin) | Walsh, Toby (University of New South Wales and Data61) | Witbrock, Michael (Lucid AI) | Zhang, Jie (Nanyang Technological University)
The Workshop Program of the Association for the Advancement of Artificial Intelligence's Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held at the beginning of the conference, February 12-13, 2016. Workshop participants met and discussed issues with a selected focus -- providing an informal setting for active exchange among researchers, developers and users on topics of current interest. To foster interaction and exchange of ideas, the workshops were kept small, with 25-65 participants. Attendance was sometimes limited to active participants only, but most workshops also allowed general registration by other interested individuals.