Instructional Material
Online dictionary learning for kernel LMS. Analysis and forward-backward splitting algorithm
Gao, Wei, Chen, Jie, Richard, Cédric, Huang, Jianguo
Adaptive filtering algorithms operating in reproducing kernel Hilbert spaces have demonstrated superiority over their linear counterpart for nonlinear system identification. Unfortunately, an undesirable characteristic of these methods is that the order of the filters grows linearly with the number of input data. This dramatically increases the computational burden and memory requirement. A variety of strategies based on dictionary learning have been proposed to overcome this severe drawback. Few, if any, of these works analyze the problem of updating the dictionary in a time-varying environment. In this paper, we present an analytical study of the convergence behavior of the Gaussian least-mean-square algorithm in the case where the statistics of the dictionary elements only partially match the statistics of the input data. This allows us to emphasize the need for updating the dictionary in an online way, by discarding the obsolete elements and adding appropriate ones. We introduce a kernel least-mean-square algorithm with L1-norm regularization to automatically perform this task. The stability in the mean of this method is analyzed, and its performance is tested with experiments.
Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers
Ateniese, Giuseppe, Felici, Giovanni, Mancini, Luigi V., Spognardi, Angelo, Villani, Antonio, Vitali, Domenico
Machine Learning (ML) algorithms are used to train computers to perform a variety of complex tasks and improve with experience. Computers learn how to recognize patterns, make unintended decisions, or react to a dynamic environment. Certain trained machines may be more effective than others because they are based on more suitable ML algorithms or because they were trained through superior training sets. Although ML algorithms are known and publicly released, training sets may not be reasonably ascertainable and, indeed, may be guarded as trade secrets. While much research has been performed about the privacy of the elements of training sets, in this paper we focus our attention on ML classifiers and on the statistical information that can be unconsciously or maliciously revealed from them. We show that it is possible to infer unexpected but useful information from ML classifiers. In particular, we build a novel meta-classifier and train it to hack other classifiers, obtaining meaningful information about their training sets. This kind of information leakage can be exploited, for example, by a vendor to build more effective classifiers or to simply acquire trade secrets from a competitor's apparatus, potentially violating its intellectual property rights.
Multi-Objective AI Planning: Comparing Aggregation and Pareto Approaches
Khouadjia, Mostepha Redouane, Schoenauer, Marc, Vidal, Vincent, Dréo, Johann, Savéant, Pierre
Most real-world Planning problems are multi-objective, trying to minimize both the makespan of the solution plan, and some cost of the actions involved in the plan. But most, if not all existing approaches are based on single-objective planners, and use an aggregation of the objectives to remain in the single-objective context. Divide and Evolve (DaE) is an evolutionary planner that won the temporal deterministic satisficing track at the last International Planning Competitions (IPC). Like all Evolutionary Algorithms (EA), it can easily be turned into a Pareto-based Multi-Objective EA. It is however important to validate the resulting algorithm by comparing it with the aggregation approach: this is the goal of this paper. The comparative experiments on a recently proposed benchmark set that are reported here demonstrate the usefulness of going Pareto-based in AI Planning.
Stochastic Variational Inference
Hoffman, Matt, Blei, David M., Wang, Chong, Paisley, John
We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
Pattern-Based Constraint Satisfaction and Logic Puzzles
Pattern-Based Constraint Satisfaction and Logic Puzzles develops a pure logic, pattern-based perspective of solving the finite Constraint Satisfaction Problem (CSP), with emphasis on finding the "simplest" solution. Different ways of reasoning with the constraints are formalised by various families of "resolution rules", each of them carrying its own notion of simplicity. A large part of the book illustrates the power of the approach by applying it to various popular logic puzzles. It provides a unified view of how to model and solve them, even though they involve very different types of constraints: obvious symmetric ones in Sudoku, non-symmetric but transitive ones (inequalities) in Futoshiki, topological and geometric ones in Map colouring, Numbrix and Hidato, and even much more complex non-binary arithmetic ones in Kakuro (or Cross Sums). It also shows that the most familiar techniques for these puzzles can indeed be understood as mere application-specific presentations of the general rules. Sudoku is used as the main example throughout the book, making it also an advanced level sequel to "The Hidden Logic of Sudoku" (another book by the same author), with: many examples of relationships among different rules and of exceptional situations; comparisons of the resolution potential of various families of rules; detailed statistics of puzzles hardness; analysis of extreme instances.
Mechanix: A Sketch-Based Tutoring and Grading System for Free-Body Diagrams
Valentine, Stephanie (Texas A&M University) | Vides, Francisco (Texas A&M University) | Lucchese, George (Texas A&M University) | Turner, David (Texas A&M University) | Kim, Hong-hoe (Texas A&M University) | Li, Wenzhe (Texas A&M University) | Linsey, Julie (Texas A&M University) | Hammond, Tracy (Texas A&M University)
Introductory engineering courses within large universities often have annual enrollments which can reach up to a thousand students. It is very challenging to achieve differentiated instruction in classrooms with class sizes and student diversity of such great magnitude. Professors can only assess whether students have mastered a concept by using multiple choice questions, while detailed homework assignments, such as planar truss diagrams, are rarely assigned because professors and teaching assistants would be too overburdened with grading to return assignments with valuable feedback in a timely manner. In this paper, we introduce Mechanix, a sketch-based deployed tutoring system for engineering students enrolled in statics courses. Our system not only allows students to enter planar truss and free body diagrams into the system just as they would with pencil and paper, but our system checks the student's work against a hand-drawn answer entered by the instructor, and then returns immediate and detailed feedback to the student. Students are allowed to correct any errors in their work and resubmit until the entire content is correct and thus all of the objectives are learned. Since Mechanix facilitates the grading and feedback processes, instructors are now able to assign free response questions, increasing teacher's knowledge of student comprehension. Furthermore, the iterative correction process allows students to learn during a test, rather than simply displaying memorized information.
Game-Initiated Learning: A Case Study For Disaster Education Research In Taiwan
Lin, Sarah Chen (National Taiwan University) | Tsai, Meng-Han (National Taiwan University ) | Chang, Yu-Lien (National Taiwan University) | Kang, Shih-Chung (National Taiwan University)
Game-based learning has been proven an effective method to engage students in the class. However, it is very challenging to balance playability and learnability when only developing digital games. Some "playable" games may not carry sufficient knowledge; some "learnable" games may reduce the students' interest and curiosity. In this ongoing research, we proposed an innovative learning method, "game-initiated learning." This method consists of three main steps: game, discussion and self-directedlearning. In this model, students can experience real-world problems from the game, discuss problems they found in the game, and finally, the instructors can deliver related knowledge that is useful to solving the problems previously discussed. To validate the proposed method, we selected a topic of disaster education in Taiwan and experimentally developed a set of course materials including a digital game, animation videos and an e-book. We conducted a review meeting, inviting experts from hydraulic engineering, game development, and disaster mediation as well as schoolteachers and students. The reviewers were asked to play the games and review all course materials. From the feedbacks of the reviewers, we found game-initiated learning an educational method with great potential in providing tacit and explicit knowledge about disaster management.
Fairness in Academic Course Timetabling
Mühlenthaler, Moritz, Wanka, Rolf
We consider the problem of creating fair course timetables in the setting of a university. Our motivation is to improve the overall satisfaction of individuals concerned (students, teachers, etc.) by providing a fair timetable to them. The central idea is that undesirable arrangements in the course timetable, i.e., violations of soft constraints, should be distributed in a fair way among the individuals. We propose two formulations for the fair course timetabling problem that are based on max-min fairness and Jain's fairness index, respectively. Furthermore, we present and experimentally evaluate an optimization algorithm based on simulated annealing for solving max-min fair course timetabling problems. The new contribution is concerned with measuring the energy difference between two timetables, i.e., how much worse a timetable is compared to another timetable with respect to max-min fairness. We introduce three different energy difference measures and evaluate their impact on the overall algorithm performance. The second proposed problem formulation focuses on the tradeoff between fairness and the total amount of soft constraint violations. Our experimental evaluation shows that the known best solutions to the ITC2007 curriculum-based course timetabling instances are quite fair with respect to Jain's fairness index. However, the experiments also show that the fairness can be improved further for only a rather small increase in the total amount of soft constraint violations.
Causality in concurrent systems
Crafa, Silvia, Russo, Federica
In the terminology of computer science, Concurrent Systems identify systems, either software, hardware or even biological systems, where sets of activities run in parallel with possible occasional interactions. A simple example of concurrent system is the Internet, which can be thought of as a set of computers, each one computing its independent activity, that often communicate to exchange some information. A further example is the railway system of a country, where many trains travel sharing tracks in an ordered way so that two trains can move at the same time along different tracks, whereas a single track (e.g, a platform in a train station) can only be used by a single train at a time. Furthermore, the large number of activities carried on by a single human cell form a biological concurrent system, that actually shares a number of similarities with the Internet. Compared to sequential systems, where a single action is executed at a time according to a sequential algorithm, concurrent systems raise new complex issues dealing with the ordering of action executions.