Search
Solving a few AI problems with Python: Part 1
In this blog post we discuss a few problems in artificial intelligence and their Python implementations. The problems discussed here appeared as programming assignments in the edX course CS50's Introduction to Artificial Intelligence with Python (HarvardX: CS50 AI). The problem statements are taken from the course itself. Write a program that determines how many "degrees of separation" apart two actors are. According to the Six Degrees of Kevin Bacon game, anyone in the Hollywood film industry can be connected to Kevin Bacon within six steps, where each step consists of finding a film that two actors both starred in.
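The degrees-of-separation task above can be read as a shortest-path problem on a graph whose nodes are actors and whose edges join actors who share a film. The following is a minimal sketch using breadth-first search, not the course's reference solution; the function name `degrees_of_separation`, the `films` mapping, and the toy data are all made up for illustration.

```python
from collections import deque


def degrees_of_separation(films, source, target):
    """Return the minimum number of shared-film steps between two actors.

    `films` maps a film title to the set of actors in it (a hypothetical
    input format). Returns None if either actor is unknown or if no chain
    of shared films connects them.
    """
    # Build an actor -> co-stars adjacency map from the film casts.
    neighbors = {}
    for cast in films.values():
        for actor in cast:
            neighbors.setdefault(actor, set()).update(
                other for other in cast if other != actor
            )

    if source not in neighbors or target not in neighbors:
        return None
    if source == target:
        return 0

    # Breadth-first search visits actors in order of increasing distance,
    # so the first time the target is reached the step count is minimal.
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        actor, steps = frontier.popleft()
        for co_star in neighbors[actor]:
            if co_star == target:
                return steps + 1
            if co_star not in visited:
                visited.add(co_star)
                frontier.append((co_star, steps + 1))
    return None


# Toy, made-up data purely for illustration.
FILMS = {
    "Film A": {"Actor 1", "Actor 2"},
    "Film B": {"Actor 2", "Actor 3"},
    "Film C": {"Actor 3", "Actor 4"},
    "Film D": {"Actor 5"},
}
```

With this toy data, "Actor 1" and "Actor 4" are three steps apart (via Films A, B, and C), while "Actor 5" shares no film with anyone and is therefore unreachable.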
A Statistical Perspective on Coreset Density Estimation
Turner, Paxton, Liu, Jingbo, Rigollet, Philippe
Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of Hölder-smooth densities.
Learning Data Augmentation with Online Bilevel Optimization for Image Classification
Mounsaveng, Saypraseuth, Laradji, Issam, Ayed, Ismail Ben, Vazquez, David, Pedersoli, Marco
Data augmentation is a key practice in machine learning for improving generalization performance. However, finding the best data augmentation hyperparameters requires domain knowledge or a computationally demanding search. We address this issue by proposing an efficient approach to automatically train a network that learns an effective distribution of transformations to improve its generalization. Using bilevel optimization, we directly optimize the data augmentation parameters using a validation set. This framework can be used as a general solution to learn the optimal data augmentation jointly with an end task model like a classifier. Results show that our joint training method produces an image classification accuracy that is comparable to or better than carefully hand-crafted data augmentation. Yet, it does not need an expensive external validation loop on the data augmentation hyperparameters.
Discrete solution pools and noise-contrastive estimation for predict-and-optimize
Mulamba, Maxime, Mandi, Jayanta, Diligenti, Michelangelo, Lombardi, Michele, Bucarey, Victor, Guns, Tias
Numerous real-life decision-making processes involve solving a combinatorial optimization problem with uncertain input that can be estimated from historic data. There is a growing interest in decision-focused learning methods, where the loss function used for learning to predict the uncertain input uses the outcome of solving the combinatorial problem over a set of predictions. Different surrogate loss functions have been identified, often using a continuous approximation of the combinatorial problem. However, a key bottleneck is that to compute the loss, one has to solve the combinatorial optimisation problem for each training instance in each epoch, which is computationally expensive even in the case of continuous approximations. We propose a different solver-agnostic method for decision-focused learning, namely by considering a pool of feasible solutions as a discrete approximation of the full combinatorial problem. Solving is now trivial through a single pass over the solution pool. We design several variants of a noise-contrastive loss over the solution pool, which we substantiate theoretically and empirically. Furthermore, we show that by dynamically re-solving only a fraction of the training instances each epoch, our method performs on par with the state of the art, whilst drastically reducing the time spent solving, hence increasing the feasibility of predict-and-optimize for larger problems.
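The "single pass over the solution pool" mentioned in the abstract above can be illustrated with a few lines of code. This is only a sketch under stated assumptions, not the authors' implementation: it assumes a linear objective that is minimized, and the names `best_in_pool` and `POOL` and the example numbers are hypothetical.

```python
def best_in_pool(predicted_costs, pool):
    """Return the pooled solution with the lowest predicted linear cost.

    Assumes (for illustration) that the objective is sum(c_i * x_i) and
    that it is minimized. The pool stands in for the full combinatorial
    problem, so "solving" is one linear scan instead of a solver call.
    """
    def objective(solution):
        return sum(c * x for c, x in zip(predicted_costs, solution))

    return min(pool, key=objective)


# Hypothetical pool of feasible 0/1 solutions (each selects two of four items).
POOL = [(1, 0, 1, 0), (0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]

best = best_in_pool([4.0, 1.0, 2.0, 5.0], POOL)
```

Here the four pooled solutions have predicted costs 6, 3, 5, and 7, so the scan selects `(0, 1, 1, 0)`. The result is optimal only with respect to the pool, which is why the abstract describes the pool as a discrete approximation of the full problem.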
Multi-Agent Active Search using Realistic Depth-Aware Noise Model
Ghods, Ramina, Durkin, William J., Schneider, Jeff
The search for objects of interest in an unknown environment by making data-collection decisions (i.e., active search or active sensing) has robotics applications in many fields, including the search and rescue of human survivors following disasters, detecting gas leaks or locating and preventing animal poachers. Existing algorithms often prioritize the location accuracy of objects of interest while other practical issues such as the reliability of object detection as a function of distance and lines of sight remain largely ignored. An additional challenge is that in many active search scenarios, communication infrastructure may be damaged, unreliable, or unestablished, making centralized control of multiple search agents impractical. We present an algorithm called Noise-Aware Thompson Sampling (NATS) that addresses these issues for multiple ground-based robot agents performing active search considering two sources of sensory information from monocular optical imagery and sonar tracking. NATS utilizes communications between robot agents in a decentralized manner that is robust to intermittent loss of communication links. Additionally, it takes into account object detection uncertainty from depth as well as environmental occlusions. Using simulation results, we show that NATS significantly outperforms existing methods such as information-greedy policies or exhaustive search. We demonstrate the real-world viability of NATS using a photo-realistic environment created in the Unreal Engine 4 game development platform with the AirSim plugin.
Solving the Steiner Tree Problem with few Terminals
Fichte, Johannes K., Hecher, Markus, Schidler, Andre
The Steiner tree problem is a well-known problem in network design, routing, and VLSI design. Given a graph, edge costs, and a set of dedicated vertices (terminals), the Steiner tree problem asks to output a sub-graph that connects all terminals at minimum cost. A state-of-the-art algorithm to solve the Steiner tree problem by means of dynamic programming is the Dijkstra-Steiner algorithm. The algorithm builds a Steiner tree of the entire instance by systematically searching for smaller instances, based on subsets of the terminals, and combining Steiner trees for these smaller instances. The search heavily relies on a guiding heuristic function in order to prune the search space. However, to ensure correctness, this algorithm allows only for limited heuristic functions, namely, those that satisfy a so-called consistency condition. In this paper, we enhance the Dijkstra-Steiner algorithm and establish a revisited algorithm, called DS*. The DS* algorithm allows for arbitrary lower bounds as heuristics relaxing the previous condition on the heuristic function. Notably, we can now use linear programming based lower bounds. Further, we capture new requirements for a heuristic function in a condition, which we call admissibility. We show that admissibility is indeed weaker than consistency and establish correctness of the DS* algorithm when using an admissible heuristic function. We implement DS* and combine it with modern preprocessing, resulting in an open-source solver (DS* Solve). Finally, we compare its performance on standard benchmarks and observe a competitive behavior.
Adaptive Linear Span Network for Object Skeleton Detection
Liu, Chang, Tian, Yunjie, Jiao, Jianbin, Ye, Qixiang
Conventional networks for object skeleton detection are usually hand-crafted. Although effective, they require intensive prior knowledge to configure representative features for objects at different scale granularities. In this paper, we propose the adaptive linear span network (AdaLSN), driven by neural architecture search (NAS), to automatically configure and integrate scale-aware features for object skeleton detection. AdaLSN is formulated with the theory of linear span, which provides one of the earliest explanations for multi-scale deep feature fusion. AdaLSN is materialized by defining a mixed unit-pyramid search space, which goes beyond many existing search spaces using unit-level or pyramid-level features. Within the mixed space, we apply genetic architecture search to jointly optimize unit-level operations and pyramid-level connections for adaptive feature space expansion. AdaLSN substantiates its versatility by achieving a significantly better accuracy-latency trade-off than state-of-the-art methods. It also demonstrates general applicability to image-to-mask tasks such as edge detection and road extraction. Code is available at https://github.com/sunsmarterjie/SDL-Skeleton.
Simulated annealing - Wikipedia
Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. It is often used when the search space is discrete (e.g., the traveling salesman problem). For problems where finding an approximate global optimum is more important than finding a precise local optimum in a fixed amount of time, simulated annealing may be preferable to exact algorithms such as gradient descent or branch and bound. The name of the algorithm comes from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects.
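As a rough illustration of the technique described above, the following sketch minimizes a function over a discrete search space. It assumes a geometric cooling schedule and the common Metropolis acceptance rule; the function names, the default parameters, and the bumpy toy cost function are all choices made for this example, not part of the source text.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, t_start=10.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Approximately minimize `cost`, returning (best_state, best_cost).

    `neighbor(state, rng)` must return a random nearby state. The cooling
    schedule (multiply the temperature by `cooling`) is an assumption.
    """
    rng = random.Random(seed)
    current = best = start
    current_cost = best_cost = cost(start)
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept a worse move with
            # probability exp(-delta / T), which shrinks as T cools.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def bumpy_cost(x):
    # Toy objective on the integers 0..99 with many local minima.
    return (x - 70) ** 2 / 100 + 3 * math.sin(x)


def step_neighbor(x, rng):
    # Move one step left or right, clamped to the search space 0..99.
    return min(99, max(0, x + rng.choice((-1, 1))))


best_x, best_value = simulated_annealing(bumpy_cost, step_neighbor, start=5)
```

At high temperature the search accepts many uphill moves and can escape local minima; as the temperature falls it behaves more like greedy descent. The sketch tracks the best state seen, so the returned cost is never worse than the starting cost, but nothing guarantees it is the global optimum.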
Beyond Pointwise Submodularity: Non-Monotone Adaptive Submodular Maximization in Linear Time
In this paper, we study the non-monotone adaptive submodular maximization problem subject to a cardinality constraint. We first revisit the adaptive random greedy algorithm proposed by Gotovos et al. (2015), who show that this algorithm achieves a $1/e$ approximation ratio if the objective function is adaptive submodular and pointwise submodular. It is not clear whether the same guarantee holds under adaptive submodularity alone (without resorting to pointwise submodularity). Our first contribution is to show that the adaptive random greedy algorithm achieves a $1/e$ approximation ratio under adaptive submodularity. One limitation of the adaptive random greedy algorithm is that it requires $O(n\times k)$ value oracle queries, where $n$ is the size of the ground set and $k$ is the cardinality constraint. Our second contribution is to develop the first linear-time algorithm for the non-monotone adaptive submodular maximization problem. Our algorithm achieves a $1/e-\epsilon$ approximation ratio (this bound is improved to $1-1/e-\epsilon$ for the monotone case), using only $O(n\epsilon^{-2}\log \epsilon^{-1})$ value oracle queries. Notably, $O(n\epsilon^{-2}\log \epsilon^{-1})$ is independent of the cardinality constraint. For the monotone case, we propose a faster algorithm that achieves a $1-1/e-\epsilon$ approximation ratio in expectation with $O(n \log \frac{1}{\epsilon})$ value oracle queries. We also generalize our study by considering a partition matroid constraint, and develop a linear-time algorithm for monotone and fully adaptive submodular functions.
Obstacles in Fully Automatic Program Repair: A survey
Mousavi, S. Amirhossein, Babani, Donya Azizi, Flammini, Francesco
The current article is an interdisciplinary attempt to decipher automatic program repair processes. The review follows diffraction, a method typical of the human sciences. We attempt to spot a gap in the literature on self-healing and self-repair operations and further investigate the approaches that would enable us to tackle the problems we face. In conclusion, we suggest a shift in the current approach to automatic program repair operations in order to attain our goals. The emphasis of this review is on achieving full automation. Several obstacles are briefly mentioned in the current essay, but the main shortcoming covered is the overfitting obstacle, which is investigated in the line of work related to full automation of the repair process.