Uncertainty
How linguistic descriptions of data can help to the teaching-learning process in higher education, case of study: artificial intelligence
Rubio-Manzano, Clemente, Senoceain, Tomas Lermanda
Artificial Intelligence is a central topic in the computer science curriculum. From the year 2011 a project-based learning methodology based on computer games has been designed and implemented into the intelligence artificial course at the University of the Bio-Bio. The project aims to develop software-controlled agents (bots) which are programmed by using heuristic algorithms seen during the course. This methodology allows us to obtain good learning results, however several challenges have been founded during its implementation. In this paper we show how linguistic descriptions of data can help to provide students and teachers with technical and personalized feedback about the learned algorithms. Algorithm behavior profile and a new Turing test for computer games bots based on linguistic modelling of complex phenomena are also proposed in order to deal with such challenges. In order to show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of authors and its incorporation in the process of assessment allows us to improve the teaching learning process.
A Rational Distributed Process-level Account of Independence Judgment
Nobandegani, Ardavan S., Psaromiligkos, Ioannis N.
It is inconceivable how chaotic the world would look to humans, faced with innumerable decisions a day to be made under uncertainty, had they been lacking the capacity to distinguish the relevant from the irrelevant---a capacity which computationally amounts to handling probabilistic independence relations. The highly parallel and distributed computational machinery of the brain suggests that a satisfying process-level account of human independence judgment should also mimic these features. In this work, we present the first rational, distributed, message-passing, process-level account of independence judgment, called $\mathcal{D}^\ast$. Interestingly, $\mathcal{D}^\ast$ shows a curious, but normatively-justified tendency for quick detection of dependencies, whenever they hold. Furthermore, $\mathcal{D}^\ast$ outperforms all the previously proposed algorithms in the AI literature in terms of worst-case running time, and a salient aspect of it is supported by recent work in neuroscience investigating possible implementations of Bayes nets at the neural level. $\mathcal{D}^\ast$ nicely exemplifies how the pursuit of cognitive plausibility can lead to the discovery of state-of-the-art algorithms with appealing properties, and its simplicity makes $\mathcal{D}^\ast$ potentially a good candidate for pedagogical purposes.
Bayesian Neural Networks
Mullachery, Vikram, Khera, Aniruddh, Husain, Amir
This paper describes and discusses Bayesian Neural Network (BNN). The paper showcases a few different applications of them for classification and regression problems. BNNs are comprised of a Probabilistic Model and a Neural Network. The intent of such a design is to combine the strengths of Neural Networks and Stochastic modeling. Neural Networks exhibit continuous function approximator capabilities. Stochastic models allow direct specification of a model with known interaction between parameters to generate data. During the prediction phase, stochastic models generate a complete posterior distribution and produce probabilistic guarantees on the predictions. Thus BNNs are a unique combination of neural network and stochastic models with the stochastic model forming the core of this integration. BNNs can then produce probabilistic guarantees on it's predictions and also generate the distribution of parameters that it has learnt from the observations. That means, in the parameter space, one can deduce the nature and shape of the neural network's learnt parameters. These two characteristics makes them highly attractive to theoreticians as well as practitioners. Recently there has been a lot of activity in this area, with the advent of numerous probabilistic programming libraries such as: PyMC3, Edward, Stan etc. Further this area is rapidly gaining ground as a standard machine learning approach for numerous problems
Overcoming Catastrophic Forgetting by Incremental Moment Matching
Lee, Sang-Woo, Kim, Jin-Hwa, Jun, Jaehyun, Ha, Jung-Woo, Zhang, Byoung-Tak
Catastrophic forgetting is a problem of neural networks that loses the information of the first task after training the second task. Here, we propose a method, i.e. incremental moment matching (IMM), to resolve this problem. IMM incrementally matches the moment of the posterior distribution of the neural network which is trained on the first and the second task, respectively. To make the search space of posterior parameter smooth, the IMM procedure is complemented by various transfer learning techniques including weight transfer, L2-norm of the old and the new parameter, and a variant of dropout with the old parameter. We analyze our approach on a variety of datasets including the MNIST, CIFAR-10, Caltech-UCSD- Birds, and Lifelog datasets. The experimental results show that IMM achieves state-of-the-art performance by balancing the information between an old and a new network.
Over-representation of Extreme Events in Decision-Making: A Rational Metacognitive Account
Nobandegani, Ardavan S., Castanheira, Kevin da Silva, Otto, A. Ross, Shultz, Thomas R.
The Availability bias, manifested in the over-representation of extreme eventualities in decision-making, is a well-known cognitive bias, and is generally taken as evidence of human irrationality. In this work, we present the first rational, metacognitive account of the Availability bias, formally articulated at Marr's algorithmic level of analysis. Concretely, we present a normative, metacognitive model of how a cognitive system should over-represent extreme eventualities, depending on the amount of time available at its disposal for decision-making. Our model also accounts for two well-known framing effects in human decision-making under risk---the fourfold pattern of risk preferences in outcome probability (Tversky & Kahneman, 1992) and in outcome magnitude (Markovitz, 1952)---thereby providing the first metacognitively-rational basis for those effects. Empirical evidence, furthermore, confirms an important prediction of our model. Surprisingly, our model is unimaginably robust with respect to its focal parameter. We discuss the implications of our work for studies on human decision-making, and conclude by presenting a counterintuitive prediction of our model, which, if confirmed, would have intriguing implications for human decision-making under risk. To our knowledge, our model is the first metacognitive, resource-rational process model of cognitive biases in decision-making.
Contextual Explanation Networks
Al-Shedivat, Maruan, Dubey, Avinava, Xing, Eric P.
We introduce contextual explanation networks (CENs)---a class of models that learn to predict by generating and leveraging intermediate explanations. CENs are deep networks that generate parameters for context-specific probabilistic graphical models which are further used for prediction and play the role of explanations. Contrary to the existing post-hoc model-explanation tools, CENs learn to predict and to explain jointly. Our approach offers two major advantages: (i) for each prediction, valid instance-specific explanations are generated with no computational overhead and (ii) prediction via explanation acts as a regularization and boosts performance in low-resource settings. We prove that local approximations to the decision boundary of our networks are consistent with the generated explanations. Our results on image and text classification and survival analysis tasks demonstrate that CENs are competitive with the state-of-the-art while offering additional insights behind each prediction, valuable for decision support.
Towards a Quantum World Wide Web
Aerts, Diederik, Arguelles, Jonito Aerts, Beltran, Lester, Beltran, Lyneth, Distrito, Isaac, de Bianchi, Massimiliano Sassoli, Sozzo, Sandro, Veloz, Tomas
We elaborate a quantum model for the meaning associated with corpora of written documents, like the pages forming the World Wide Web. To that end, we are guided by how physicists constructed quantum theory for microscopic entities, which unlike classical objects cannot be fully represented in our spatial theater. We suggest that a similar construction needs to be carried out by linguists and computational scientists, to capture the full meaning carried by collections of documental entities. More precisely, we show how to associate a quantum-like 'entity of meaning' to a 'language entity formed by printed documents', considering the latter as the collection of traces that are left by the former, in specific results of search actions that we describe as measurements. In other words, we offer a perspective where a collection of documents, like the Web, is described as the space of manifestation of a more complex entity - the QWeb - which is the object of our modeling, drawing its inspiration from previous studies on operational-realistic approaches to quantum physics and quantum modeling of human cognition and decision-making. We emphasize that a consistent QWeb model needs to account for the observed correlations between words appearing in printed documents, e.g., co-occurrences, as the latter would depend on the 'meaning connections' existing between the concepts that are associated with these words. In that respect, we show that both 'context and interference (quantum) effects' are required to explain the probabilities calculated by counting the relative number of documents containing certain words and co-ocurrrences of words.
Marketing Analytics: Methods, Practice, Implementation, and Links to Other Fields
France, Stephen L., Ghose, Sanjoy
Marketing analytics is a diverse field, with both academic researchers and practitioners coming from a range of backgrounds including marketing, operations research, statistics, and computer science. This paper provides an integrative review at the boundary of these three areas. The topics of visualization, segmentation, and class prediction are featured. Links between the disciplines are emphasized. For each of these topics, a historical overview is given, starting with initial work in the 1960s and carrying through to the present day. Recent innovations for modern large and complex "big data" sets are described. Practical implementation advice is given, along with a directory of open source R routines for implementing marketing analytics techniques.
A Review of Multiple Try MCMC algorithms for Signal Processing
Many applications in signal processing require the estimation of some parameters of interest given a set of observed data. More specifically, Bayesian inference needs the computation of {\it a-posteriori} estimators which are often expressed as complicated multi-dimensional integrals. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and Monte Carlo methods are the only feasible approach. A very powerful class of Monte Carlo techniques is formed by the Markov Chain Monte Carlo (MCMC) algorithms. They generate a Markov chain such that its stationary distribution coincides with the target posterior density. In this work, we perform a thorough review of MCMC methods using multiple candidates in order to select the next state of the chain, at each iteration. With respect to the classical Metropolis-Hastings method, the use of multiple try techniques foster the exploration of the sample space. We present different Multiple Try Metropolis schemes, Ensemble MCMC methods, Particle Metropolis-Hastings algorithms and the Delayed Rejection Metropolis technique. We highlight limitations, benefits, connections and differences among the different methods, and compare them by numerical simulations.
Ontology-based Fuzzy Markup Language Agent for Student and Robot Co-Learning
Lee, Chang-Shing, Wang, Mei-Hui, Huang, Tzong-Xiang, Chen, Li-Chung, Huang, Yung-Ching, Yang, Sheng-Chi, Tseng, Chien-Hsun, Hung, Pi-Hsia, Kubota, Naoyuki
An intelligent robot agent based on domain ontology, machine learning mechanism, and Fuzzy Markup Language (FML) for students and robot co-learning is presented in this paper. The machine-human co-learning model is established to help various students learn the mathematical concepts based on their learning ability and performance. Meanwhile, the robot acts as a teacher's assistant to co-learn with children in the class. The FML-based knowledge base and rule base are embedded in the robot so that the teachers can get feedback from the robot on whether students make progress or not. Next, we inferred students' learning performance based on learning content's difficulty and students' ability, concentration level, as well as teamwork sprit in the class. Experimental results show that learning with the robot is helpful for disadvantaged and below-basic children. Moreover, the accuracy of the intelligent FML-based agent for student learning is increased after machine learning mechanism.