Goto

Collaborating Authors

Results


MS-DARTS: Mean-Shift Based Differentiable Architecture Search

arXiv.org Artificial Intelligence

Differentiable Architecture Search (DARTS) is an effective continuous relaxation-based network architecture search (NAS) method with low search cost. It has attracted significant attentions in Auto-ML research and becomes one of the most useful paradigms in NAS. Although DARTS can produce superior efficiency over traditional NAS approaches with better control of complex parameters, oftentimes it suffers from stabilization issues in producing deteriorating architectures when discretizing the continuous architecture. We observed considerable loss of validity causing dramatic decline in performance at this final discretization step of DARTS. To address this issue, we propose a Mean-Shift based DARTS (MS-DARTS) to improve stability based on sampling and perturbation. Our approach can improve bot the stability and accuracy of DARTS, by smoothing the loss landscape and sampling architecture parameters within a suitable bandwidth. We investigate the convergence of our mean-shift approach, together with the effects of bandwidth selection that affects stability and accuracy. Evaluations performed on CIFAR-10, CIFAR-100, and ImageNet show that MS-DARTS archives higher performance over other state-of-the-art NAS methods with reduced search cost.


Towards Personalized and Human-in-the-Loop Document Summarization

arXiv.org Artificial Intelligence

The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing capacity to properly process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges, by: i)enabling automatic intelligent feature engineering, ii) enabling flexible and interactive summarisation, iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.


MOFit: A Framework to reduce Obesity using Machine learning and IoT

arXiv.org Artificial Intelligence

From the past few years, due to advancements in technologies, the sedentary living style in urban areas is at its peak. This results in individuals getting a victim of obesity at an early age. There are various health impacts of obesity like Diabetes, Heart disease, Blood pressure problems, and many more. Machine learning from the past few years is showing its implications in all expertise like forecasting, healthcare, medical imaging, sentiment analysis, etc. In this work, we aim to provide a framework that uses machine learning algorithms namely, Random Forest, Decision Tree, XGBoost, Extra Trees, and KNN to train models that would help predict obesity levels (Classification), Bodyweight, and fat percentage levels (Regression) using various parameters. We also applied and compared various hyperparameter optimization (HPO) algorithms such as Genetic algorithm, Random Search, Grid Search, Optuna to further improve the accuracy of the models. The website framework contains various other features like making customizable Diet plans, workout plans, and a dashboard to track the progress. The framework is built using the Python Flask. Furthermore, a weighing scale using the Internet of Things (IoT) is also integrated into the framework to track calories and macronutrients from food intake.


RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving

arXiv.org Artificial Intelligence

Predictor-based algorithms have achieved remarkable performance in the Neural Architecture Search (NAS) tasks. However, these methods suffer from high computation costs, as training the performance predictor usually requires training and evaluating hundreds of architectures from scratch. Previous works along this line mainly focus on reducing the number of architectures required to fit the predictor. In this work, we tackle this challenge from a different perspective - improve search efficiency by cutting down the computation budget of architecture training. We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget. To effectively leverage the non-uniform supervision signals produced by NOSH, we formulate predictor-based architecture search as learning to rank with pairwise comparisons. The resulting method - RANK-NOSH, reduces the search budget by ~5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.


A New Constructive Heuristic driven by Machine Learning for the Traveling Salesman Problem

arXiv.org Artificial Intelligence

Recent systems applying Machine Learning (ML) to solve the Traveling Salesman Problem (TSP) exhibit issues when they try to scale up to real case scenarios with several hundred vertices. The use of Candidate Lists (CLs) has been brought up to cope with the issues. The procedure allows to restrict the search space during solution creation, consequently reducing the solver computational burden. So far, ML were engaged to create CLs and values on the edges of these CLs expressing ML preferences at solution insertion. Although promising, these systems do not clearly restrict what the ML learns and does to create solutions, bringing with them some generalization issues. Therefore, motivated by exploratory and statistical studies, in this work we instead use a machine learning model to confirm the addition in the solution just for high probable edges. CLs of the high probable edge are employed as input, and the ML is in charge of distinguishing cases where such edges are in the optimal solution from those where they are not. . This strategy enables a better generalization and creates an efficient balance between machine learning and searching techniques. Our ML-Constructive heuristic is trained on small instances. Then, it is able to produce solutions, without losing quality, to large problems as well. We compare our results with classic constructive heuristics, showing good performances for TSPLIB instances up to 1748 cities. Although our heuristic exhibits an expensive constant time operation, we proved that the computational complexity in worst-case scenario, for the solution construction after training, is $O(n^2 \log n^2)$, being $n$ the number of vertices in the TSP instance.


AIRCHITECT: Learning Custom Architecture Design and Mapping Space

arXiv.org Artificial Intelligence

Design space exploration is an important but costly step involved in the design/deployment of custom architectures to squeeze out maximum possible performance and energy efficiency. Conventionally, optimizations require iterative sampling of the design space using simulation or heuristic tools. In this paper we investigate the possibility of learning the optimization task using machine learning and hence using the learnt model to predict optimal parameters for the design and mapping space of custom architectures, bypassing any exploration step. We use three case studies involving the optimal array design, SRAM buffer sizing, mapping, and schedule determination for systolic-array-based custom architecture design and mapping space. Within the purview of these case studies, we show that it is possible to capture the design space and train a model to "generalize" prediction the optimal design and mapping parameters when queried with workload and design constraints. We perform systematic design-aware and statistical analysis of the optimization space for our case studies and highlight the patterns in the design space. We formulate the architecture design and mapping as a machine learning problem that allows us to leverage existing ML models for training and inference. We design and train a custom network architecture called AIRCHITECT, which is capable of learning the architecture design space with as high as 94.3% test accuracy and predicting optimal configurations which achieve on average (GeoMean) of 99.9% the best possible performance on a test dataset with $10^5$ GEMM workloads.


The application of artificial intelligence in software engineering: a review challenging conventional wisdom

arXiv.org Artificial Intelligence

The field of artificial intelligence (AI) is witnessing a recent upsurge in research, tools development, and deployment of applications. Multiple software companies are shifting their focus to developing intelligent systems; and many others are deploying AI paradigms to their existing processes. In parallel, the academic research community is injecting AI paradigms to provide solutions to traditional engineering problems. Similarly, AI has evidently been proved useful to software engineering (SE). When one observes the SE phases (requirements, design, development, testing, release, and maintenance), it becomes clear that multiple AI paradigms (such as neural networks, machine learning, knowledge-based systems, natural language processing) could be applied to improve the process and eliminate many of the major challenges that the SE field has been facing. This survey chapter is a review of the most commonplace methods of AI applied to SE. The review covers methods between years 1975-2017, for the requirements phase, 46 major AI-driven methods are found, 19 for design, 15 for development, 68 for testing, and 15 for release and maintenance. Furthermore, the purpose of this chapter is threefold; firstly, to answer the following questions: is there sufficient intelligence in the SE lifecycle? What does applying AI to SE entail? Secondly, to measure, formulize, and evaluate the overlap of SE phases and AI disciplines. Lastly, this chapter aims to provide serious questions to challenging the current conventional wisdom (i.e., status quo) of the state-of-the-art, craft a call for action, and to redefine the path forward.


Time-based Dynamic Controllability of Disjunctive Temporal Networks with Uncertainty: A Tree Search Approach with Graph Neural Network Guidance

arXiv.org Artificial Intelligence

Scheduling in the presence of uncertainty is an area of interest in artificial intelligence due to the large number of applications. We study the problem of dynamic controllability (DC) of disjunctive temporal networks with uncertainty (DTNU), which seeks a strategy to satisfy all constraints in response to uncontrollable action durations. We introduce a more restricted, stronger form of controllability than DC for DTNUs, time-based dynamic controllability (TDC), and present a tree search approach to determine whether or not a DTNU is TDC. Moreover, we leverage the learning capability of a message passing neural network (MPNN) as a heuristic for tree search guidance. Finally, we conduct experiments for which the tree search shows superior results to state-of-the-art timed-game automata (TGA) based approaches. We observe that using an MPNN for tree search guidance leads to a significant increase in solving performance and scalability to harder DTNU problems.


Machine Learning Constructives and Local Searches for the Travelling Salesman Problem

arXiv.org Artificial Intelligence

The Travelling Salesman Problem (TSP) is one of the most investigated problems in the Combinatorial Optimization (CO) field. This is partly due to the fact that it belongs to the set of NP-Hard problems, which makes it particularly challenging. Moreover, the many practical problems that can be reduced to this - such as in Ratnesh et al. [10] where models of the TSP are presented to be used in the manufacture of microchips - make it even more attractive. At the same time, the full potentials of Machine Learning (ML) and Deep Learning (DL) techniques are becoming increasingly recognized in the CO field [2]. Mele et al. [17] recently introduced ML-Constructive, a promising constructive approach that computes fast solutions in two separate phases.


Learning off-road maneuver plans for autonomous vehicles

arXiv.org Artificial Intelligence

This thesis explores the benefits machine learning algorithms can bring to online planning and scheduling for autonomous vehicles in off-road situations. Mainly, we focus on typical problems of interest which include computing itineraries that meet certain objectives, as well as computing scheduling strategies to execute synchronized maneuvers with other vehicles. We present a range of learning-based heuristics to assist different itinerary planners. We show that these heuristics allow a significant increase in performance for optimal planners. Furthermore, in the case of approximate planning, we show that not only does the running time decrease, the quality of the itinerary found also becomes almost always better. Finally, in order to synthesize strategies to execute synchronized maneuvers, we propose a novel type of scheduling controllability and a learning-assisted algorithm. The proposed framework achieves significant improvement on known benchmarks in this controllability type over the performance of state-of-the-art works in a related controllability type. Moreover, it is able to find strategies on complex scheduling problems for which previous works fail to do so.