Oceania
Implicit Langevin Algorithms for Sampling From Log-concave Densities
Hodgkinson, Liam, Salomone, Robert, Roosta, Fred
For sampling from a log-concave density, we study implicit integrators resulting from $\theta$-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for $ \theta \in [0,1] $ and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular, for $\theta\ge1/2$, we prove geometric ergodicity and stability of the resulting methods for all step sizes. We show that obtaining subsequent samples amounts to solving a strongly-convex optimization problem, which is readily achievable using one of numerous existing methods. Numerical examples supporting our theoretical analysis are also presented.
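As an illustration of the implicit step described in this abstract, the sketch below writes the $\theta$-method update for a target density proportional to $\exp(-f(x))$ as $x_{k+1} = x_k - h[\theta \nabla f(x_{k+1}) + (1-\theta) \nabla f(x_k)] + \sqrt{2h}\,\xi_k$, which is the optimality condition of minimizing $\theta h f(x) + \frac{1}{2}\|x - v\|^2$ with $v = x_k - h(1-\theta)\nabla f(x_k) + \sqrt{2h}\,\xi_k$. The quadratic potential $f(x) = \frac{1}{2} x^\top A x$, the function names, and the parameter values are assumptions chosen for the example, not taken from the paper; with this choice the optimization reduces to a linear solve.

```python
import numpy as np

def theta_langevin_step(x, h, theta, A, xi):
    """One theta-method step for a target proportional to exp(-0.5 * x^T A x).

    Hypothetical helper for illustration; A is assumed symmetric positive definite.
    """
    d = x.shape[0]
    # Explicit part of the update plus the injected Gaussian noise.
    v = x - h * (1.0 - theta) * (A @ x) + np.sqrt(2.0 * h) * xi
    # Implicit part: x_new minimizes theta*h*f(x) + 0.5*||x - v||^2.
    # For quadratic f the optimality condition (I + theta*h*A) x_new = v is linear.
    return np.linalg.solve(np.eye(d) + theta * h * A, v)

def sample_chain(n_steps, h, theta, A, seed=0):
    """Run the chain from the origin and return all iterates."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x = np.zeros(d)
    chain = np.empty((n_steps, d))
    for k in range(n_steps):
        x = theta_langevin_step(x, h, theta, A, rng.standard_normal(d))
        chain[k] = x
    return chain
```

Setting `theta=0` recovers the explicit Euler-Maruyama step. For a non-quadratic log-concave target, the linear solve would be replaced by a convex optimizer applied to the same objective.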
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
Mayer, Ruben, Jacobsen, Hans-Arno
Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-art results in various domains such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the availability of vast amounts of training data. To keep improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation of challenges, techniques and tools for scalable DL on distributed infrastructures. This incorporates infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future trends in DL systems that deserve further research.
How GMU students' eating habits changed when delivery robots invaded their campus
In the first days after a fleet of 25 delivery robots descended on George Mason University's campus in January, school officials could only speculate about the machines' long-term impact. The Igloo cooler-sized robots from the Bay Area start-up Starship Technologies -- which were designed to deliver food on demand across campus -- appeared to elicit curious glances and numerous photos, but not much else. It was clear, officials said at the time, that more time and more data would be necessary to understand whether the robots would actually change the campus culture or become a forgettable novelty. Today, some of that data emerged for the first time. In the two months since the robots arrived at the Fairfax, Va.-based school, an extra 1,500 breakfast orders have been delivered autonomously, according to Starship Technologies and Sodexo, a company that manages food services for GMU on contract and works closely with the robots.
On the Functional Equivalence of TSK Fuzzy Systems to Neural Networks, Mixture of Experts, CART, and Stacking Ensemble Regression
Wu, Dongrui, Lin, Chin-Teng, Huang, Jian, Zeng, Zhigang
Fuzzy systems have achieved great success in numerous applications. However, there are still many challenges in designing an optimal fuzzy system, e.g., how to efficiently train its parameters, how to improve its performance without adding too many parameters, how to balance the trade-off between cooperation and competition among the rules, how to overcome the curse of dimensionality, etc. The literature has shown that by making appropriate connections between fuzzy systems and other machine learning approaches, good practices from other domains may be used to improve fuzzy systems, and vice versa. This paper gives an overview of the functional equivalence between Takagi-Sugeno-Kang fuzzy systems and four classic machine learning approaches -- neural networks, mixture of experts, classification and regression trees, and stacking ensemble regression -- for regression problems. We also point out some promising new research directions, inspired by the functional equivalence, that could lead to solutions to the aforementioned problems. To our knowledge, this is so far the most comprehensive overview of the connections between fuzzy systems and other popular machine learning approaches, and we hope it will stimulate more hybridization between different machine learning algorithms.
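To make the structure of a Takagi-Sugeno-Kang system concrete, here is a minimal sketch of a first-order TSK regressor. It assumes Gaussian membership functions, product inference, and linear rule consequents, which is a common textbook configuration and not necessarily the exact one analyzed in the paper; the function and parameter names are hypothetical. The normalized firing levels play the same role as the gating weights in a mixture of experts, which hints at the kind of equivalence the abstract refers to.

```python
import numpy as np

def tsk_predict(x, centers, sigmas, coeffs, biases):
    """First-order TSK output for one input x of shape (d,).

    centers, sigmas, coeffs have shape (n_rules, d); biases has shape (n_rules,).
    All names are illustrative assumptions.
    """
    # Log firing level of each rule: product of Gaussian memberships over the inputs.
    log_firing = -0.5 * np.sum(((x - centers) / sigmas) ** 2, axis=1)
    # Normalize the firing levels (shifted by the max for numerical stability).
    weights = np.exp(log_firing - log_firing.max())
    weights /= weights.sum()
    # Each rule's consequent is a linear function of the input.
    rule_outputs = coeffs @ x + biases
    # The system output is the firing-level-weighted average of the rule outputs.
    return float(weights @ rule_outputs), weights
```

With a single rule the output is just that rule's linear consequent; with several rules, inputs close to a rule's centers are dominated by that rule's consequent.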
Deep recommender engine based on efficient product embeddings neural pipeline
Piciu, Laurentiu, Damian, Andrei, Tapus, Nicolae, Simion-Constantinescu, Andrei, Dumitrescu, Bogdan
Predictive analytics systems are currently one of the most important areas of research and development within the Artificial Intelligence domain and particularly in Machine Learning. One of the "holy grails" of predictive analytics is the research and development of the "perfect" recommendation system. In our paper we propose an advanced pipeline model for the multi-task objective of determining product complementarity, similarity and sales prediction using deep neural models applied to big-data sequential transaction systems. Our highly parallelized hybrid pipeline consists of both unsupervised and supervised models, used for the objectives of generating semantic product embeddings and predicting sales, respectively. Our experimentation and benchmarking were done using a very large Big Data stream from a pharma-industry retailer.
Facebook says its artificial intelligence systems failed to detect New Zealand shooting video
Facebook said on Wednesday night that its artificial intelligence systems failed to automatically detect the New Zealand mosque shooting video. A senior executive at the social media giant responded in a blog post to criticism that it didn't act quickly enough to take down the gunman's livestream video of his attack in Christchurch that left 50 people dead, allowing it to spread rapidly online. Facebook's vice president of integrity, Guy Rosen, said "this particular video did not trigger our automatic detection systems." "AI has made massive progress over the years and in many areas, which has enabled us to proactively detect the vast majority of the content we remove," Rosen said. One reason the video was missed is that artificial intelligence systems are trained with large volumes of similar content, but in this case there was not enough because such attacks are rare.
Symbolic Regression Methods for Reinforcement Learning
Kubalík, Jiří, Žegklitz, Jan, Derner, Erik, Babuška, Robert
Reinforcement learning algorithms can be used to optimally solve dynamic decision-making and control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering no insight into the mappings learned, and they require significant trial-and-error tuning of their meta-parameters. In this paper, we propose a new approach to constructing smooth value functions by means of symbolic regression. We introduce three off-line methods for finding value functions based on a state transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions not only yield well-performing policies, but are also compact, human-readable and mathematically tractable. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with alternative approaches using neural networks shows that our method constructs well-performing value functions with substantially fewer parameters.
Facebook vows to improve AI detection of terrorist videos
Facebook rushed to pull down the New Zealand mass shooter's video from its platform, but it didn't start doing so until after the live broadcast was done. In a new post, Facebook VP of Integrity Guy Rosen discussed the company's successes and shortcomings in addressing the situation, as well as its plans to prevent videos like that from spreading on the social network in the future. He explained that while the platform's AI can quickly detect videos containing suicidal or harmful acts, the shooter's stream didn't trigger it. To be able to train the matching AI to detect that specific type of content, the platform needs large volumes of training data. As Facebook explains, something like that is difficult to obtain as "these events are thankfully rare."
Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning
Zhang, Wen, Paudel, Bibek, Wang, Liang, Chen, Jiaoyan, Zhu, Hai, Zhang, Wei, Bernstein, Abraham, Chen, Huajun
Reasoning is essential for the development of large knowledge graphs, especially for completion, which aims to infer new triples based on existing ones. Both rules and embeddings can be used for knowledge graph reasoning, and each has its own advantages and difficulties. Rule-based reasoning is accurate and explainable, but rule learning by searching over the graph suffers from inefficiency due to the huge search space. Embedding-based reasoning is more scalable and efficient as the reasoning is conducted via computation between embeddings, but it has difficulty learning good representations for sparse entities because a good embedding relies heavily on data richness. Based on this observation, in this paper we explore how embedding and rule learning can be combined so that the advantages of each compensate for the difficulties of the other. We propose a novel framework, IterE, that iteratively learns embeddings and rules: rules are learned from embeddings with a proper pruning strategy, and embeddings are learned from existing triples and new triples inferred by rules. Evaluations of the embedding quality of IterE show that rules help improve the quality of sparse entity embeddings and their link prediction results. We also evaluate the efficiency of rule learning and the quality of rules from IterE compared with AMIE+, showing that IterE is capable of generating high-quality rules more efficiently. Experiments show that embeddings and rules benefit each other during learning and prediction when they are learned iteratively.
Biasing MCTS with Features for General Games
Soemers, Dennis J. N. J., Piette, Éric, Browne, Cameron
This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
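As a rough illustration of what a linear function approximator over action features can look like, the sketch below scores each legal move by a dot product between a weight vector and that move's feature vector, then turns the scores into a softmax distribution. The abstract does not specify how these probabilities enter the tree search, so the softmax form, the binary feature encoding, and all names here are assumptions made for the example.

```python
import numpy as np

def softmax_policy(features, weights):
    """Move probabilities from linear feature scores.

    features: (n_moves, n_features) array, one row per legal move
              (for example, binary indicators of which local patterns match).
    weights:  (n_features,) array of learnt feature weights.
    Both are hypothetical inputs for illustration.
    """
    # Linear score per move.
    logits = features @ weights
    # Subtract the max before exponentiating for numerical stability.
    logits = logits - logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()
```

Because each weight is tied to one feature, a large positive weight can be read directly as "moves matching this pattern are preferred", which is the interpretability advantage the abstract mentions.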