AITopics

We propose a new application of embedding techniques for problem retrieval in adaptive tutoring. The objective is to retrieve problems whose mathematical concepts are similar. There are two challenges. First, like sentences, problems helpful to tutoring are never exactly the same in terms of the underlying concepts. Instead, good problems mix concepts in innovative ways, while still displaying continuity in their relationships. Second, it is difficult for humans to determine a similarity score that is consistent across a large enough training set. We propose a hierarchical problem embedding algorithm, called Prob2Vec, that consists of abstraction and embedding steps. Prob2Vec achieves 96.88% accuracy on a problem similarity test, in contrast to 75% from directly applying state-of-the-art sentence embedding methods. It is interesting that Prob2Vec is able to distinguish very fine-grained differences among problems, an ability humans need time and effort to acquire. In addition, the sub-problem of concept labeling with an imbalanced training data set is interesting in its own right. It is a multi-label problem suffering from dimensionality explosion, which we propose ways to ameliorate. We propose a novel negative pre-training algorithm that dramatically reduces false negative and false positive ratios for classification, using an imbalanced training data set.

neural network, representation, similarity, (14 more...)

2003.10838

Country: North America > United States > Illinois (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report (0.64)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

FTT-NAS: Discovering Fault-Tolerant Neural Architecture

Ning, Xuefei, Ge, Guangjun, Li, Wenshuo, Zhu, Zhenhua, Zheng, Yin, Chen, Xiaoming, Gao, Zhen, Wang, Yu, Yang, Huazhong

With the rapid evolution of embedded deep-learning computing systems, applications powered by deep learning are moving from the cloud to the edge. When deploying neural networks (NNs) onto devices in complex environments, there are various types of possible faults: soft errors caused by cosmic radiation and radioactive impurities, voltage instability, aging, temperature variations, and malicious attackers. Thus the safety risk of deploying NNs is now drawing much attention. In this paper, after analyzing the possible faults in various types of NN accelerators, we formalize and implement various fault models from the algorithmic perspective. We propose Fault-Tolerant Neural Architecture Search (FT-NAS) to automatically discover convolutional neural network (CNN) architectures that are resilient to various faults in today's devices. We then incorporate fault-tolerant training (FTT) in the search process to achieve better results, which is referred to as FTT-NAS. Experiments on CIFAR-10 show that the discovered architectures significantly outperform manually designed baseline architectures, with comparable or fewer floating-point operations (FLOPs) and parameters. Specifically, with the same fault settings, F-FTT-Net discovered under the feature fault model achieves an accuracy of 86.2% (vs. 68.1% achieved by MobileNet-V2), and W-FTT-Net discovered under the weight fault model achieves an accuracy of 69.6% (vs. 60.8% achieved by ResNet-20). By inspecting the discovered architectures, we find that the operation primitives, the weight quantization range, the capacity of the model, and the connection pattern all influence the fault resilience of NN models.

architecture, fault model, neural network, (16 more...)

2003.10375

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.48)
Semiconductors & Electronics (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BusTime: Which is the Right Prediction Model for My Bus Arrival Time?

Liu, Dairui, Sun, Jingxiang, Wang, Shen

With the rise of big data technologies, many smart transportation applications have been rapidly developed in recent years, including bus arrival time prediction. This type of application helps passengers plan trips more efficiently, without wasting unpredictable amounts of waiting time at bus stops. Many studies focus on improving the prediction accuracy of various machine learning and statistical models, while much less work demonstrates their applicability to deployment and use in realistic urban settings. This paper tries to fill this gap by proposing a general and practical evaluation framework for analysing various widely used prediction models (i.e., delay, k-nearest-neighbour, kernel regression, additive model, and recurrent neural network using long short-term memory) for bus arrival time. In particular, this framework contains a raw bus GPS data pre-processing method that needs far fewer input data points while still maintaining satisfactory prediction results. This pre-processing method enables various models to predict arrival time at bus stops only, by using a KD-tree based nearest point search method. Based on this framework, using raw bus GPS datasets at different scales from the city of Dublin, Ireland, we also present preliminary results for city managers by analysing the practical strengths and weaknesses in both training and predicting stages of commonly used prediction models.
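
The KD-tree based nearest point search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sample coordinates, and the use of planar (already projected) x/y coordinates with Euclidean distance are all assumptions made for illustration.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a 2-D KD-tree; each node is a plain dict."""
    if not points:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    """Return the stored point closest to target (Euclidean distance)."""
    if node is None:
        return best
    point = node["point"]
    if best is None or math.dist(point, target) < math.dist(best, target):
        best = point
    axis = node["axis"]
    diff = target[axis] - point[axis]
    # Search the side of the splitting plane that contains the target first.
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Visit the other side only if the splitting plane is closer than the best match.
    if abs(diff) < math.dist(best, target):
        best = nearest(far, target, best)
    return best

# Hypothetical projected GPS samples of one bus trip and one bus stop position.
GPS_POINTS = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
BUS_STOP = (9.0, 2.0)

tree = build_kdtree(GPS_POINTS)
closest_sample = nearest(tree, BUS_STOP)
print(closest_sample)
```

In a pre-processing step of this kind, each bus stop would be matched to its closest GPS sample, so that only samples near stops need to be passed on to the prediction models. Raw latitude/longitude pairs would first need a suitable projection or a great-circle distance.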

bus trip, prediction, prediction model, (15 more...)

2003.10373

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.25)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Passenger (1.00)
Consumer Products & Services > Travel (1.00)
Transportation > Ground > Road (0.90)
Transportation > Infrastructure & Services (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gergatsouli, Evangelia, Lucier, Brendan, Tzamos, Christos

Black-box Methods for Restoring Monotonicity

In many practical applications, heuristic or approximation algorithms are used to efficiently solve the task at hand. However, their solutions frequently do not satisfy natural monotonicity properties of optimal solutions. In this work we develop algorithms that are able to restore monotonicity in the parameters of interest. Specifically, given oracle access to a (possibly non-monotone) multi-dimensional real-valued function $f$, we provide an algorithm that restores monotonicity while degrading the expected value of the function by at most $\varepsilon$. The number of queries required is at most logarithmic in $1/\varepsilon$ and exponential in the number of parameters. We also give a lower bound showing that this exponential dependence is necessary. Finally, we obtain improved query complexity bounds for restoring the weaker property of $k$-marginal monotonicity. Under this property, every $k$-dimensional projection of the function $f$ is required to be monotone. The query complexity we obtain scales exponentially only with $k$.

algorithm, monotone, monotonicity, (14 more...)

2003.09554

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Transportation > Air (0.41)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Zhang, Zhongxia, Wu, Meng

Predicting Real-Time Locational Marginal Prices: A GAN-Based Video Prediction Approach

In this paper, we propose an unsupervised data-driven approach to predict real-time locational marginal prices (RTLMPs). The proposed approach is built upon a general data structure for organizing system-wide heterogeneous market data streams into the format of market data images and videos. Leveraging this general data structure, the system-wide RTLMP prediction problem is formulated as a video prediction problem. A video prediction model based on generative adversarial networks (GANs) is proposed to learn the spatio-temporal correlations among historical RTLMPs and predict system-wide RTLMPs for the next hour. An autoregressive moving average (ARMA) calibration method is adopted to improve the prediction accuracy. The proposed RTLMP prediction method takes public market data as inputs, without requiring any confidential information on system topology, model parameters, or market operating details. Case studies using public market data from ISO New England (ISO-NE) and Southwest Power Pool (SPP) demonstrate that the proposed method is able to learn spatio-temporal correlations among RTLMPs and perform accurate RTLMP prediction.

market data, rtlmp, video, (13 more...)

2003.09527

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.40)

Industry:

Energy > Power Industry (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Event-Based Control for Online Training of Neural Networks

Zhao, Zilong, Cerf, Sophie, Robu, Bogdan, Marchand, Nicolas

Convolutional Neural Networks (CNNs) have become the most widely used method for image classification tasks. During training, the learning rate and the gradient are two key factors to tune for influencing the convergence speed of the model. Usual learning rate strategies are time-based, i.e., monotonic decay over time. Recent state-of-the-art techniques focus on adaptive gradient algorithms, e.g., Adam and its variants. In this paper we consider an online learning scenario and we propose two Event-Based control loops to adjust the learning rate of a classical algorithm, E (Exponential)/PD (Proportional Derivative)-Control. The first Event-Based control loop will be implemented to prevent a sudden drop of the learning rate when the model is approaching the optimum. The second Event-Based control loop will decide, based on the learning speed, when to switch to the next data batch. Experimental evaluation is provided using two state-of-the-art machine learning image datasets (CIFAR-10 and CIFAR-100). Results show that the Event-Based E/PD is better than the original algorithm (higher final accuracy, lower final loss value), and that the Double-Event-Based E/PD can accelerate the training process, saving up to 67% of training time compared to state-of-the-art algorithms and even resulting in better performance.

algorithm, data batch, epoch, (15 more...)

doi: 10.1109/LCSYS.2020.2981984

2003.09503

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
North America > United States > Nevada (0.04)
(5 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Zheng, Liyuan, Shi, Yuanyuan, Ratliff, Lillian J., Zhang, Baosen

Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks

This paper focuses on finding reinforcement learning policies for control systems with hard state and action constraints. Despite its success in many domains, reinforcement learning is challenging to apply to problems with hard constraints, especially if both the state variables and actions are constrained. Previous works seeking to ensure constraint satisfaction, or safety, have focused on adding a projection step to a learned policy. Yet, this approach requires solving an optimization problem at every policy execution step, which can lead to significant computational costs. To tackle this problem, this paper proposes a new approach, termed Vertex Networks (VNs), with guarantees on safety during exploration and on learned control policies by incorporating the safety constraints into the policy network architecture. Leveraging the geometric property that all points within a convex set can be represented as the convex combination of its vertices, the proposed algorithm first learns the convex combination weights and then uses these weights along with the pre-calculated vertices to output an action. The output action is guaranteed to be safe by construction. Numerical examples illustrate that the proposed VN algorithm outperforms vanilla reinforcement learning in a variety of benchmark control tasks.
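
The geometric idea in the abstract (an action formed as a convex combination of pre-calculated vertices is safe by construction) can be sketched in a few lines. This is an illustrative sketch, not the paper's network: using a softmax to turn raw scores into convex weights is an assumption, and the box-shaped action set, the score values, and all names are hypothetical.

```python
import math

def softmax(logits):
    """Map raw scores to non-negative weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def vertex_action(logits, vertices):
    """Return the convex combination of the vertices weighted by softmax(logits)."""
    weights = softmax(logits)
    dim = len(vertices[0])
    return [sum(w * v[d] for w, v in zip(weights, vertices)) for d in range(dim)]

# Hypothetical safe action set: the box [-1, 1] x [0, 2], given by its four vertices.
VERTICES = [(-1.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-1.0, 2.0)]

# Stand-in for the output of a policy network's final layer (one score per vertex).
raw_scores = [0.3, -1.2, 2.0, 0.5]

action = vertex_action(raw_scores, VERTICES)
print(action)
```

Because the weights are non-negative and sum to one, the resulting action lies inside the convex hull of the vertices for any score vector, so no projection step is needed at execution time.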

constraint, polytope, vertex, (17 more...)

2003.09488

Country: Asia > Middle East > Jordan (0.04)

Genre: Workflow (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Li, Qiaomei, Cummings, Rachel, Mintz, Yonatan

Locally Interpretable Predictions of Parkinson's Disease Progression

In precision medicine, machine learning techniques have been commonly proposed to aid physicians in early screening of chronic diseases. Many of these diseases become more difficult to treat as they progress, so accurate early screening is critical to ensure resources are directed towards the most effective treatment plan [Pagan, 2012]. Since the final treatment decision must inevitably be made by a doctor, these screening procedures should be interpretable such that a clinician can explain the decision-making process to patients for informed consent. However, the types of models that achieve the highest level of accuracy given early screening data tend to be extremely complex, meaning that even machine learning experts have difficulties explaining why certain predictions are made, leading many to describe them as "black box" [Breiman, 2001]. In this paper, we bridge this gap by providing a novel approach for explaining black box model predictions which can give high fidelity explanations with lower model complexity. In particular, we focus on early screening of Parkinson's Disease (PD). PD is a complicated neurodegenerative disorder that affects the central nervous system and specifically the motor control of individuals [mjf, 2019]. This disorder is estimated to affect 930,000 individuals in the US by 2020, and is more prevalent in the geriatric population, affecting more than 1% of the population over the age of 60 and 5% of the population over age 85 [Findley, 2007, Kowal et al., 2013, Rossi et al., 2018]. These statistics and other recent studies on Parkinson's epidemiology indicate that as the population ages, the prevalence of PD is expected to grow to over 1.2 million by 2030 in the US alone, increasing the total economic burden of the disorder to approximately $26 billion USD [Kowal et al., 2013, Rossi et al., 2018].

explainer, local explainer, prediction, (14 more...)

2003.09466

Country:

Europe > United Kingdom > England (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Weighted Meta-Learning

Cai, Diana, Sheth, Rishit, Mackey, Lester, Fusi, Nicolo

Meta-learning leverages related source tasks to learn an initialization that can be quickly fine-tuned to a target task with limited labeled examples. However, many popular meta-learning algorithms, such as model-agnostic meta-learning (MAML), only assume access to the target samples for fine-tuning. In this work, we provide a general framework for meta-learning based on weighting the loss of different source tasks, where the weights are allowed to depend on the target samples. In this general setting, we provide upper bounds on the distance between the weighted empirical risk of the source tasks and the expected target risk in terms of an integral probability metric (IPM) and Rademacher complexity, which apply to a number of meta-learning settings including MAML and a weighted MAML variant. We then develop a learning algorithm based on minimizing the error bound with respect to an empirical IPM, including a weighted MAML algorithm, $\alpha$-MAML. Finally, we demonstrate empirically on several regression problems that our weighted meta-learning algorithm is able to find better initializations than uniformly-weighted meta-learning algorithms, such as MAML.
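
One way to write down the weighting idea described above is as a weighted empirical risk over source tasks. The notation here is an assumption for illustration and is not taken from the paper: $J$ source tasks, a per-task empirical risk $\hat{R}_j(\theta)$, and weights $\alpha$ constrained to the probability simplex.

```latex
% Assumed notation: J source tasks, empirical risk \hat{R}_j of task j,
% weights \alpha on the probability simplex (they may depend on target samples).
\hat{R}_{\alpha}(\theta) = \sum_{j=1}^{J} \alpha_j \, \hat{R}_j(\theta),
\qquad \alpha_j \ge 0, \quad \sum_{j=1}^{J} \alpha_j = 1 .
```

Under this notation, setting $\alpha_j = 1/J$ for every task gives the uniformly-weighted case that the abstract compares against.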

initialization, source task, target task, (16 more...)

2003.09465

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Augustin, Maximilian, Meinke, Alexander, Hein, Matthias

Adversarial Robustness on In- and Out-Distribution Improves Explainability

Neural networks have led to major improvements in image classification but suffer from non-robustness to adversarial changes, unreliable uncertainty estimates on out-distribution samples, and inscrutable black-box decisions. In this work we propose RATIO, a training procedure for Robustness via Adversarial Training on In- and Out-distribution, which leads to robust models with reliable and robust confidence estimates on the out-distribution. RATIO has similar generative properties to adversarial training, so that visual counterfactuals produce class-specific features. While adversarial training comes at the price of lower clean accuracy, RATIO achieves state-of-the-art $l_2$-adversarial robustness on CIFAR10 and maintains better clean accuracy.

plane, ship, truck, (12 more...)

2003.09461

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)