Hashigo: A Next-Generation Sketch Interactive System for Japanese Kanji
Taele, Paul (Texas A&M University) | Hammond, Tracy (Texas A&M University)
Language students can increase their effectiveness in learning written Japanese by mastering the visual structure and written technique of Japanese kanji. Yet existing kanji handwriting recognition systems do not assess written technique well enough to discourage students from developing bad learning habits. In this paper, we describe our work on Hashigo, a kanji sketch interactive system that achieves human instructor-level critique and feedback on both the visual structure and written technique of students’ sketched kanji. This type of automated critique and feedback allows students to target and correct specific deficiencies in their sketches that, if left untreated, are detrimental to effective long-term kanji learning.
Not So Naive Online Bayesian Spam Filter
Su, Baojun (Zhejiang University) | Xu, Congfu (Zhejiang University)
Spam filtering, a key problem in electronic communication, has drawn significant attention due to the increasingly large volume of junk email on the Internet. Content-based filtering is one reliable method for combating spammers' changing tactics. Naive Bayes (NB) is one of the earliest content-based machine learning methods used against spam, both in theory and in practice; it is easy to implement and can achieve considerable accuracy. In this paper, the traditional online Bayesian classifier is enhanced in two ways. First, from a theoretical point of view, we devise a self-adaptive mechanism that gradually weakens the independence assumption required by the original NB during online training, so that our NSNB is no longer "naive". Second, we propose further engineering techniques to make the classifier more robust and accurate. The experimental results show that our NSNB gives state-of-the-art classification performance for online spam filtering on large benchmark data sets, while being extremely fast and using little memory in comparison with other statistical methods.
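For orientation, the baseline that the abstract starts from can be sketched as a plain online naive Bayes filter. This is a minimal illustration of the standard technique only, not the NSNB method: the class name, the Laplace smoothing constant `alpha`, and the whitespace-style token lists are all assumptions made for the example, and the self-adaptive weakening of the independence assumption is not shown.

```python
import math
from collections import Counter


class OnlineNaiveBayes:
    """Plain online multinomial naive Bayes (hypothetical baseline sketch)."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.total_tokens = {label: 0 for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}
        self.vocab = set()

    def update(self, tokens, label):
        """Online training step: fold one labeled message into the counts."""
        self.doc_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.total_tokens[label] += 1
            self.vocab.add(token)

    def _log_likelihood(self, token, label, vocab_size):
        count = self.token_counts[label][token]
        total = self.total_tokens[label]
        return math.log((count + self.alpha) / (total + self.alpha * vocab_size))

    def log_odds(self, tokens):
        """log P(spam | tokens) - log P(ham | tokens) under token independence."""
        n_docs = sum(self.doc_counts.values())
        denom = n_docs + 2 * self.alpha
        score = math.log((self.doc_counts["spam"] + self.alpha) / denom)
        score -= math.log((self.doc_counts["ham"] + self.alpha) / denom)
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen tokens
        for token in tokens:
            score += self._log_likelihood(token, "spam", vocab_size)
            score -= self._log_likelihood(token, "ham", vocab_size)
        return score

    def predict(self, tokens):
        return "spam" if self.log_odds(tokens) > 0 else "ham"
```

Because every token contributes an independent additive term to the log-odds, correlated tokens are effectively counted more than once; that is the "naive" assumption the abstract says NSNB relaxes during online training.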
Automating Art Print Authentication Using Metric Learning
Parker, Charles Lincoln (Eastman Kodak Company) | Messier, Paul (Paul Messier, LLC)
An important problem in the world of art historians is determining the type of paper on which a photograph is printed. One way to determine the paper type is to capture a highly magnified image of the paper, then to compare this image to a database of known paper images. Traditionally, this process is carried out by a human and is generally time-intensive. Here we propose an automated solution to this problem, using wavelet decomposition techniques from image processing, as well as metric learning from the machine learning area. We show, on a collection of real-world images of photographic paper, that the use of machine learning techniques produces a much better solution than image processing alone.
Pedagogical Discourse: Connecting Students to Past Discussions and Peer Mentors within an Online Discussion Board
The goal of the Pedagogical Discourse project is to develop instructional tools that will help students and instructors use discussion boards more effectively, with an emphasis on automatically assessing discussion activities and building tools for promoting student discussion participation and learning. In this paper, we present two related participation and learning scaffolding tools that exploit natural language processing and information retrieval techniques. The PedaBot tool is designed to aid student knowledge acquisition and promote reflection about course topics by connecting related discussions from a knowledge base of past discussions to the current discussion thread. The MentorMatch tool aims to promote student participation using student mentors, i.e., course peers with a relatively good understanding of a particular topic. The system identifies students who often provide answers on a given topic and encourages classmates to invite mentors to participate in related discussions. Both tools have been integrated into a live discussion board that is used by an undergraduate computer science course. This paper describes our approaches to applying information retrieval and natural language processing techniques in the development of the tools and presents initial results from instrumentation and a survey.
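One common way to connect a current thread to related past discussions is TF-IDF weighting with cosine similarity. The sketch below assumes that approach purely for illustration; the abstract does not say which retrieval model PedaBot uses, and the function names and the smoothed IDF formula are hypothetical choices.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n_docs = len(docs)
    doc_freq = Counter(token for doc in docs for token in set(doc))
    # Smoothed IDF (an assumed variant) so every known token has positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_past_threads(query_tokens, past_threads):
    """Return indices of past threads, most similar to the query first."""
    vectors, idf = tfidf_vectors(past_threads)
    # Query tokens never seen in the knowledge base are ignored.
    query = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    scored = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

For example, with past threads about deadlocks, virtual memory, and scheduling, a new post mentioning `mutex` and `deadlock` would rank the deadlock thread first.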
An Agent-based Commodity Trading Simulation
Cheng, Shih-Fen (Singapore Management University) | Lim, Yee Pin (Singapore Management University)
In this paper, an event-centric commodity trading simulation powered by a multiagent framework is presented. The purpose of this simulation platform is to train novice traders. The simulation progresses through the announcement of news events that affect various aspects of the commodity supply chain. Upon receiving these events, market agents that play the roles of producers, consumers, and speculators adjust their views on the market and act accordingly. Their actions are based on their roles and their private information, and collectively they shape the market dynamics. This simulation has been effectively deployed for several training sessions. We present the underlying technologies that are employed and discuss the practical significance of such a platform.
Simulation-based Optimization of Resource Placement and Emergency Response
Bjarnason, Ronald (Oregon State University) | Tadepalli, Prasad (Oregon State University) | Fern, Alan (Oregon State University) | Niedner, Carl (Coelo Company of Design)
Many city governments are under pressure to optimize the utilization of their resources to respond to fire, rescue and medical emergencies. In this paper we describe simulation-based optimization software called SOFER that learns from a history of emergency requests to optimize the placement of resources and response policies. We describe a two-level random-restart hill climbing approach that yields policies which perform better than the current practice, satisfy the usability constraints, and are sensitive to optimization metrics and population changes. Some of the policies learned by the system give insight into response practices that would otherwise be counterintuitive.
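The general idea of random-restart hill climbing can be sketched on a toy placement problem. This is a single-level illustration under stated assumptions, not SOFER's two-level method: requests and candidate sites are points on a line, the cost is mean distance to the nearest station, and all function names are hypothetical.

```python
import random


def mean_response_distance(stations, requests):
    """Average distance from each request to its nearest station (toy cost)."""
    return sum(min(abs(r - s) for s in stations) for r in requests) / len(requests)


def hill_climb(stations, requests, sites):
    """Move one station at a time to any site that strictly lowers the cost."""
    current = list(stations)
    current_cost = mean_response_distance(current, requests)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            for site in sites:
                if site in current:
                    continue
                candidate = current[:i] + [site] + current[i + 1:]
                cost = mean_response_distance(candidate, requests)
                if cost < current_cost:
                    current, current_cost = candidate, cost
                    improved = True
    return current, current_cost


def random_restart_hill_climb(requests, sites, n_stations, restarts=10, seed=0):
    """Run hill climbing from several random placements and keep the best."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(restarts):
        start = rng.sample(sites, n_stations)
        placement, cost = hill_climb(start, requests, sites)
        if cost < best_cost:
            best, best_cost = placement, cost
    return sorted(best), best_cost
```

The restarts matter because a single climb stops at the first placement that no one-station move can improve, which on harder instances may be only a local optimum. In a simulation-based setting the cost function would be replaced by a simulator run over historical requests.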
Archiving the Semantics of Digital Engineering Artifacts in CIBER-U
Regli, William C. (Drexel University) | Grauer, Michael (Drexel University) | Kopena, Joseph (Drexel University) | Wilkie, David (University of North Carolina) | Piecyk, Martin (Drexel University) | Osecki, Jordan (Drexel University)
This paper introduces the challenge of digital preservation in the area of engineering design and manufacturing and presents a methodology to apply knowledge representation and semantic techniques to develop Digital Engineering Archives. This work is part of an ongoing, multi-university effort to create Cyber-Infrastructure-Based Engineering Repositories for Undergraduates (CIBER-U) to support engineering design education. The technical approach is to use knowledge representation techniques to create formal models of engineering data elements, workflows and processes. With these formal models, engineering knowledge and processes can be captured and preserved with some guarantee of long-term interpretability. The paper presents examples of how the techniques can be used to encode specific engineering information packages and workflows. These techniques are being integrated into a semantic Wiki that supports the CIBER-U engineering education activities across nine universities, involving over 3,500 students since 2006.
Enabling Data Quality with Lightweight Ontologies
Bidlack, Clint R. (ActivePrime Inc.)
As the volume and interconnectedness of corporate data grow, data quality is becoming a business competency essential to success. Existing methods for managing data quality do not scale up to large volumes of data in a way that is directly manageable by the owner of the data. For the past two years, a new breed of data quality products, built on applied AI techniques, has been empowering non-technical users. Over 150 businesses are benefiting from these products, including NASDAQ, Visa, Experian, Oracle, Fidelity, Bank of America, Volvo, Dell, Sabic, and Dassault Systems. The applied AI techniques described include lightweight ontologies to efficiently find inexact textual matches in large data sets.
An Augmented Lagrangian Approach for Sparse Principal Component Analysis
Principal component analysis (PCA) is a widely used technique for data analysis and dimension reduction with numerous applications in science and engineering. However, the standard PCA suffers from the fact that the principal components (PCs) are usually linear combinations of all the original variables, and it is thus often difficult to interpret the PCs. To alleviate this drawback, various sparse PCA approaches have been proposed in the literature [15, 6, 17, 28, 8, 25, 18, 7, 16]. Despite success in achieving sparsity, some important properties enjoyed by the standard PCA are lost in these methods, such as uncorrelation of PCs and orthogonality of loading vectors. Also, the total explained variance that they attempt to maximize can be too optimistic. In this paper we propose a new formulation for sparse PCA, aiming at finding sparse and nearly uncorrelated PCs with orthogonal loading vectors while explaining as much of the total variance as possible. We also develop a novel augmented Lagrangian method for solving a class of nonsmooth constrained optimization problems, which is well suited for our formulation of sparse PCA. We show that it converges to a feasible point, and moreover, under some regularity assumptions, it converges to a stationary point. Additionally, we propose two nonmonotone gradient methods for solving the augmented Lagrangian subproblems, and establish their global and local convergence. Finally, we compare our sparse PCA approach with several existing methods on synthetic, random, and real data, respectively. The computational results demonstrate that the sparse PCs produced by our approach substantially outperform those by other methods in terms of total explained variance, correlation of PCs, and orthogonality of loading vectors.
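The properties named in the abstract can be written down in generic notation. The block below is a reference sketch only, not the paper's formulation (which the abstract does not state): the symbols \Sigma for the sample covariance matrix and V = [v_1, ..., v_r] for the loading vectors are assumptions introduced here.

```latex
% Generic notation (assumed; not taken from the paper).
% \Sigma: sample covariance matrix; V = [v_1, \dots, v_r]: loading vectors.
\begin{align*}
\text{Standard PCA (first PC):}\quad
  & \max_{v}\; v^{\top}\Sigma v \quad \text{s.t. } \|v\|_2 = 1,\\
\text{Orthogonal loading vectors:}\quad
  & v_i^{\top} v_j = 0 \quad (i \neq j),\\
\text{Uncorrelated PCs:}\quad
  & v_i^{\top}\Sigma v_j = 0 \quad (i \neq j),\\
\text{Total explained variance:}\quad
  & \operatorname{tr}\!\left(V^{\top}\Sigma V\right) = \sum_{i=1}^{r} v_i^{\top}\Sigma v_i .
\end{align*}
```

In standard PCA the v_i are eigenvectors of \Sigma, so the orthogonality and uncorrelation conditions both hold automatically. Once sparsity is imposed, typically through an \ell_1 penalty or a cardinality constraint on each v_i, they generally no longer do; and when the PCs are correlated, the trace counts shared variance more than once, which is one reading of the abstract's remark that the reported explained variance can be too optimistic.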
Tactical Language and Culture Training Systems: Using AI to Teach Foreign Languages and Cultures
Johnson, W. Lewis (Alelo) | Valente, Andre (Alelo)
The Tactical Language and Culture Training System (TLCTS) helps people quickly acquire communicative skills in foreign languages and cultures. More than 40,000 learners worldwide have used TLCTS courses. TLCTS utilizes artificial intelligence technologies during the authoring process, and at run time to process learner speech, engage in dialog, and evaluate and assess learner performance. This paper describes the architecture of TLCTS and the artificial intelligence technologies that it employs, and presents results from multiple evaluation studies that demonstrate the benefits of learning foreign language and culture using this approach.