Rethinking Curriculum Learning with Incremental Labels and Adaptive Compensation
Ganesh, Madan Ravi, Corso, Jason J.
Like humans, deep networks learn better when samples are organized and introduced in a meaningful order or curriculum (Weinshall et al., 2018). While conventional approaches to curriculum learning emphasize the difficulty of samples as the core incremental strategy, they force networks to learn from small subsets of data while introducing pre-computation overheads. In this work, we propose Learning with Incremental Labels and Adaptive Compensation (LILAC), which takes a novel approach to curriculum learning. LILAC emphasizes incrementally learning labels instead of incrementally learning difficult samples. It works in two distinct phases. First, in the incremental label introduction phase, we recursively reveal ground-truth labels in small installments while using a fake label for the remaining data. Second, in the adaptive compensation phase, we compensate for failed predictions by adaptively altering the target vector to a smoother distribution. We evaluate LILAC against the closest comparable methods in batch and curriculum learning and label smoothing, across three standard image benchmarks: CIFAR-10, CIFAR-100, and STL-10. We show that our method outperforms batch learning, with higher mean recognition accuracy as well as lower standard deviation in performance, consistently across all benchmarks. We further extend LILAC to show the highest performance on CIFAR-10 for methods using simple data augmentation while exhibiting label-order invariance among other properties.
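The two phases described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `fake_label` index, and the exact smoothing rule (mass `epsilon` spread uniformly over the wrong classes) are assumptions made for illustration only.

```python
def incremental_targets(labels, revealed_classes, fake_label):
    """Phase 1 (sketch): keep labels of revealed classes and map every
    other sample to a single fake label. `fake_label` is a hypothetical
    extra class index, not a value taken from the paper."""
    return [y if y in revealed_classes else fake_label for y in labels]


def compensated_target(label, num_classes, was_correct, epsilon=0.1):
    """Phase 2 (sketch): one-hot target for samples that were predicted
    correctly, a smoother target for failed predictions. The smoothing
    rule used here is an assumption."""
    if was_correct:
        return [1.0 if k == label else 0.0 for k in range(num_classes)]
    off_value = epsilon / (num_classes - 1)
    return [1.0 - epsilon if k == label else off_value
            for k in range(num_classes)]


# Classes 0 and 1 are revealed; classes 2 and 3 share the fake label 4.
print(incremental_targets([0, 1, 2, 3], {0, 1}, fake_label=4))
# A failed prediction on class 0 receives a smoothed target vector.
print(compensated_target(0, num_classes=3, was_correct=False, epsilon=0.2))
```

In a training loop, the set of revealed classes would grow over time and `was_correct` would come from the network's own earlier prediction for each sample.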
Perspectives and Ethics of the Autonomous Artificial Thinking Systems
Assessing the feasibility of autonomous artificial thinking systems requires comparing the way human beings acquire information and develop thought with the current capacities of autonomous information systems. Our model uses four hierarchies: the hierarchy of information systems, the cognitive hierarchy, the linguistic hierarchy, and the digital informative hierarchy that combines artificial intelligence, the power of computer models, methods and tools to develop autonomous information systems. The question of whether an autonomous system can provide a form of artificial thought arises together with its ethical consequences for social life and the perspective of transhumanism.
Explaining the Explainer: A First Theoretical Analysis of LIME
Garreau, Damien, von Luxburg, Ulrike
Machine learning is used more and more often for sensitive applications, sometimes replacing humans in critical decision-making processes. As such, interpretability of these algorithms is a pressing need. One popular algorithm to provide interpretability is LIME (Local Interpretable Model-Agnostic Explanation). In this paper, we provide the first theoretical analysis of LIME. We derive closed-form expressions for the coefficients of the interpretable model when the function to explain is linear. The good news is that these coefficients are proportional to the gradient of the function to explain: LIME indeed discovers meaningful features. However, our analysis also reveals that poor choices of parameters can lead LIME to miss important features.
Machine Learning for Performance-Aware Virtual Network Function Placement
Manias, Dimitrios Michael, Jammal, Manar, Hawilo, Hassan, Shami, Abdallah, Heidari, Parisa, Larabi, Adel, Brunner, Richard
With the growing demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while simultaneously improving network performance and addressing the increased connectivity demand. Although Network Function Virtualization (NFV) has been identified as a solution, several challenges must be addressed to ensure its feasibility. In this paper, we address the Virtual Network Function (VNF) placement problem by developing a machine learning decision tree model that learns from the effective placement of the various VNF instances forming a Service Function Chain (SFC). The model takes several performance-related features from the network as an input and selects the placement of the various VNF instances on network servers with the objective of minimizing the delay between dependent VNF instances. The benefits of using machine learning are realized by moving away from a complex mathematical modelling of the system and towards a data-based understanding of the system. Using the Evolved Packet Core (EPC) as a use case, we evaluate our model on different data center networks and compare it to the BACON algorithm in terms of the delay between interconnected components and the total delay across the SFC. Furthermore, a time complexity analysis is performed to show the effectiveness of the model in NFV applications.
A machine learning approach to investigate regulatory control circuits in bacterial metabolic pathways
Bardozzo, Francesco, Lio', Pietro, Tagliaferri, Roberto
In this work a machine learning approach for identifying the multi-omics metabolic regulatory control circuits inside the pathways is described. Therefore, the identification of bacterial metabolic pathways that are more regulated than others in terms of their multi-omics follows from the analysis of these circuits. This is a consequence of the alternation of the omic values of codon usage and protein abundance along the circuits. In this work, E. coli's glycolysis and its multi-omic circuit features are shown as an example. 1 Background In the bacterial metabolic pathways, it is possible to identify different small circuits that lead from one intermediate compound to another. Each bacterial pathway could be considered as a highly specific directed graph that presents more than one multi-omic circuit (MOC).
Inference for linear forms of eigenvectors under minimal eigenvalue separation: Asymmetry and heteroscedasticity
Cheng, Chen, Wei, Yuting, Chen, Yuxin
A fundamental task that spans numerous applications is inference and uncertainty quantification for linear functionals of the eigenvectors of an unknown low-rank matrix. We prove that this task can be accomplished in a setting where the true matrix is symmetric and the additive noise matrix contains independent (and non-symmetric) entries. Specifically, we develop algorithms that produce confidence intervals for linear forms of individual eigenvectors, based on eigen-decomposition of the asymmetric data matrix followed by a careful de-biasing scheme. The proposed procedures and the accompanying theory enjoy several important features: (1) distribution-free (i.e. prior knowledge about the noise distributions is not needed); (2) adaptive to heteroscedastic noise; (3) statistically optimal under Gaussian noise. Along the way, we establish procedures to construct optimal confidence intervals for the eigenvalues of interest. All this happens under minimal eigenvalue separation, a condition that goes far beyond what generic matrix perturbation theory has to offer. Our studies fall under the category of "fine-grained" functional inference in low-complexity models.
For2For: Learning to forecast from forecasts
January 15, 2020 Abstract This paper presents a time series forecasting framework which combines standard forecasting methods and a machine learning model. The inputs to the machine learning model are not lagged values or regular time series features, but instead forecasts produced by standard methods. The machine learning model can be either a convolutional neural network model or a recurrent neural network model. The intuition behind this approach is that forecasts of a time series are themselves good features characterizing the series, especially when the modelling purpose is forecasting. It can also be viewed as a weighted ensemble method. Tested on the M4 competition dataset, this approach outperforms all submissions for quarterly series, and is more accurate than all but the winning algorithm for monthly series. 1 Introduction The competitiveness of neural network (NN) models and other machine learning (ML) models for time series forecasting compared to statistical models has long been questioned by practitioners [1] [2]. Although in the field of time series forecasting, there is a plethora of literature presenting complex novel models, in practice the performance of ML models is often below expectation [3].
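The idea of feeding forecasts, rather than lagged values, into a second model can be sketched as follows. This is only an illustration: the three base methods and the fixed weights are assumptions, and the weighted sum stands in for the convolutional or recurrent network that the abstract describes, reflecting the remark that the approach can be viewed as a weighted ensemble.

```python
def naive_forecast(series):
    """Repeat the last observed value."""
    return series[-1]


def mean_forecast(series):
    """Forecast the historical mean."""
    return sum(series) / len(series)


def drift_forecast(series):
    """Extend the straight line through the first and last observations."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + slope


def forecast_features(series):
    """Feature vector made of one-step-ahead forecasts from base methods."""
    return [naive_forecast(series), mean_forecast(series),
            drift_forecast(series)]


def combine(features, weights):
    """Stand-in for the learned model: a plain weighted sum of forecasts."""
    return sum(w * f for w, f in zip(weights, features))


history = [1.0, 2.0, 3.0, 4.0]
features = forecast_features(history)
print(features)
print(combine(features, [0.5, 0.0, 0.5]))
```

In the framework described above, the weights would not be fixed by hand; a neural network would learn how to map the base forecasts to the final forecast.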
Faster Transformer Decoding: N-gram Masked Self-Attention
Chelba, Ciprian, Chen, Mia, Bapna, Ankur, Shazeer, Noam
Motivated by the fact that most of the information relevant to the prediction of target tokens is drawn from the source sentence $S=s_1, \ldots, s_S$, we propose truncating the target-side window used for computing self-attention by making an $N$-gram assumption. Experiments on WMT EnDe and EnFr data sets show that the $N$-gram masked self-attention model loses very little in BLEU score for $N$ values in the range $4, \ldots, 8$, depending on the task.
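A truncated target-side attention window can be expressed as a banded causal mask. The sketch below is an assumption about the window convention (each position sees itself and the preceding `n - 1` positions); the abstract does not fix this detail, and the function name is hypothetical.

```python
def ngram_causal_mask(length, n):
    """Return a length x length boolean mask where mask[i][j] is True
    when target position i may attend to target position j.

    Position i sees only itself and the previous n - 1 positions
    (assumed convention), and never any future position."""
    return [[(j <= i) and (i - j < n) for j in range(length)]
            for i in range(length)]


# With n = 2, every row has at most two True entries, ending at the diagonal.
for row in ngram_causal_mask(4, 2):
    print(["x" if allowed else "." for allowed in row])
```

When `n` is at least the sequence length, the mask reduces to the usual full causal (lower-triangular) mask.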
Unifying and generalizing models of neural dynamics during decision-making
Zoltowski, David M., Pillow, Jonathan W., Linderman, Scott W.
An open question in systems and computational neuroscience is how neural circuits accumulate evidence towards a decision. Fitting models of decision-making theory to neural activity helps answer this question, but current approaches limit the number of these models that we can fit to neural data. Here we propose a unifying framework for modeling neural activity during decision-making tasks. The framework includes the canonical drift-diffusion model and enables extensions such as multi-dimensional accumulators, variable and collapsing boundaries, and discrete jumps. Our framework is based on constraining the parameters of recurrent state-space models, for which we introduce a scalable variational Laplace-EM inference algorithm. We applied the modeling approach to spiking responses recorded from monkey parietal cortex during two decision-making tasks. We found that a two-dimensional accumulator better captured the trial-averaged responses of a set of parietal neurons than a single accumulator model. Next, we identified a variable lower boundary in the responses of an LIP neuron during a random dot motion task.
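For readers unfamiliar with the canonical drift-diffusion model mentioned above, a minimal one-dimensional simulation looks roughly like this. It is a generic textbook-style sketch with symmetric fixed boundaries, not the recurrent state-space framework or the inference algorithm of the paper; all parameter names and defaults are assumptions.

```python
import math
import random


def simulate_ddm(drift, noise, bound, dt=0.01, max_steps=10_000, seed=0):
    """Simulate one trial of a one-dimensional drift-diffusion process.

    The accumulator starts at 0 and evolves by Euler-Maruyama steps
    until it crosses +bound or -bound. Returns (choice, time), where
    choice is 1 (upper bound), -1 (lower bound), or 0 if no bound was
    reached within max_steps."""
    rng = random.Random(seed)
    x = 0.0
    for step in range(1, max_steps + 1):
        x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if x >= bound:
            return 1, step * dt
        if x <= -bound:
            return -1, step * dt
    return 0, max_steps * dt


choice, decision_time = simulate_ddm(drift=0.5, noise=1.0, bound=1.0)
print(choice, decision_time)
```

The extensions listed in the abstract, such as collapsing boundaries or multi-dimensional accumulators, would replace the constant `bound` and the scalar state `x` respectively.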
Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings
Shi, C., Zhang, S., Lu, W., Song, R.
Reinforcement learning is a general technique that allows an agent to learn an optimal policy and interact with an environment in sequential decision making problems. The goodness of a policy is measured by its value function starting from some initial state. The focus of this paper is to construct confidence intervals (CIs) for a policy's value in infinite horizon settings where the number of decision points diverges to infinity. We propose to model the action-value state function (Q-function) associated with a policy based on the series/sieve method to derive its confidence interval. When the target policy depends on the observed data as well, we propose a SequentiAl Value Evaluation (SAVE) method to recursively update the estimated policy and its value estimator. As long as either the number of trajectories or the number of decision points diverges to infinity, we show that the proposed CI achieves nominal coverage even in cases where the optimal policy is not unique. Simulation studies are conducted to back up our theoretical findings. We apply the proposed method to a dataset from mobile health studies and find that reinforcement learning algorithms could help improve patients' health status.