Goto

Collaborating Authors

 Search


Learning Augmented, Multi-Robot Long-Horizon Navigation in Partially Mapped Environments

arXiv.org Artificial Intelligence

We present a novel approach for efficient and reliable goal-directed long-horizon navigation for a multi-robot team in a structured, unknown environment by predicting statistics of unknown space. Building on recent work in learning-augmented model based planning under uncertainty, we introduce a high-level state and action abstraction that lets us approximate the challenging Dec-POMDP into a tractable stochastic MDP. Our Multi-Robot Learning over Subgoals Planner (MR-LSP) guides agents towards coordinated exploration of regions more likely to reach the unseen goal. We demonstrate improvement in cost against other multi-robot strategies; in simulated office-like environments, we show that our approach saves 13.29% (2 robot) and 4.6% (3 robot) average cost versus standard non-learned optimistic planning and a learning-informed baseline.


Hybrid ACO-CI Algorithm for Beam Design problems

arXiv.org Artificial Intelligence

A range of complicated real-world problems have inspired the development of several optimization methods. Here, a novel hybrid version of the Ant colony optimization (ACO) method is developed using the sample space reduction technique of the Cohort Intelligence (CI) Algorithm. The algorithm is developed, and accuracy is tested by solving 35 standard benchmark test functions. Furthermore, the constrained version of the algorithm is used to solve two mechanical design problems involving stepped cantilever beams and I-section beams. The effectiveness of the proposed technique of solution is evaluated relative to contemporary algorithmic approaches that are already in use. The results show that our proposed hybrid ACO-CI algorithm will take lesser number of iterations to produce the desired output which means lesser computational time. For the minimization of weight of stepped cantilever beam and deflection in I-section beam a proposed hybrid ACO-CI algorithm yielded best results when compared to other existing algorithms. The proposed work could be investigate for variegated real world applications encompassing domains of engineering, combinatorial and health care problems.


ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

arXiv.org Artificial Intelligence

As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low-overhead autotuning framework to autotune performance and energy for various hybrid MPI/OpenMP scientific applications at large scales and to explore the tradeoffs between application runtime and power/energy for energy efficient application execution, then use this framework to autotune four ECP proxy applications -- XSBench, AMG, SWFFT, and SW4lite. Our approach uses Bayesian optimization with a Random Forest surrogate model to effectively search parameter spaces with up to 6 million different configurations on two large-scale production systems, Theta at Argonne National Laboratory and Summit at Oak Ridge National Laboratory. The experimental results show that our autotuning framework at large scales has low overhead and achieves good scalability. Using the proposed autotuning framework to identify the best configurations, we achieve up to 91.59% performance improvement, up to 21.2% energy savings, and up to 37.84% EDP improvement on up to 4,096 nodes.


HyT-NAS: Hybrid Transformers Neural Architecture Search for Edge Devices

arXiv.org Artificial Intelligence

Vision Transformers have enabled recent attention-based Deep Learning (DL) architectures to achieve remarkable results in Computer Vision (CV) tasks. However, due to the extensive computational resources required, these architectures are rarely implemented on resource-constrained platforms. Current research investigates hybrid handcrafted convolution-based and attention-based models for CV tasks such as image classification and object detection. In this paper, we propose HyT-NAS, an efficient Hardware-aware Neural Architecture Search (HW-NAS) including hybrid architectures targeting vision tasks on tiny devices. HyT-NAS improves state-of-the-art HW-NAS by enriching the search space and enhancing the search strategy as well as the performance predictors. Our experiments show that HyT-NAS achieves a similar hypervolume with less than ~5x training evaluations. Our resulting architecture outperforms MLPerf MobileNetV1 by 6.3% accuracy improvement with 3.5x less number of parameters on Visual Wake Words.


Large-scale Training Data Search for Object Re-identification

arXiv.org Artificial Intelligence

We consider a scenario where we have access to the target domain, but cannot afford on-the-fly training data annotation, and instead would like to construct an alternative training set from a large-scale data pool such that a competitive model can be obtained. We propose a search and pruning (SnP) solution to this training data search problem, tailored to object re-identification (re-ID), an application aiming to match the same object captured by different cameras. Specifically, the search stage identifies and merges clusters of source identities which exhibit similar distributions with the target domain. The second stage, subject to a budget, then selects identities and their images from the Stage I output, to control the size of the resulting training set for efficient training. The two steps provide us with training sets 80\% smaller than the source pool while achieving a similar or even higher re-ID accuracy. These training sets are also shown to be superior to a few existing search methods such as random sampling and greedy sampling under the same budget on training data size. If we release the budget, training sets resulting from the first stage alone allow even higher re-ID accuracy. We provide interesting discussions on the specificity of our method to the re-ID problem and particularly its role in bridging the re-ID domain gap. The code is available at https://github.com/yorkeyao/SnP.


DisWOT: Student Architecture Search for Distillation WithOut Training

arXiv.org Artificial Intelligence

Knowledge distillation (KD) is an effective training strategy to improve the lightweight student models under the guidance of cumbersome teachers. However, the large architecture difference across the teacher-student pairs limits the distillation gains. In contrast to previous adaptive distillation methods to reduce the teacher-student gap, we explore a novel training-free framework to search for the best student architectures for a given teacher. Our work first empirically show that the optimal model under vanilla training cannot be the winner in distillation. Secondly, we find that the similarity of feature semantics and sample relations between random-initialized teacher-student networks have good correlations with final distillation performances. Thus, we efficiently measure similarity matrixs conditioned on the semantic activation maps to select the optimal student via an evolutionary algorithm without any training. In this way, our student architecture search for Distillation WithOut Training (DisWOT) significantly improves the performance of the model in the distillation stage with at least 180$\times$ training acceleration. Additionally, we extend similarity metrics in DisWOT as new distillers and KD-based zero-proxies. Our experiments on CIFAR, ImageNet and NAS-Bench-201 demonstrate that our technique achieves state-of-the-art results on different search spaces. Our project and code are available at https://lilujunai.github.io/DisWOT-CVPR2023/.


Artificial intelligence approaches for materials-by-design of energetic materials: state-of-the-art, challenges, and future directions

arXiv.org Artificial Intelligence

Energetic materials (EM) cover a wide spectrum of propellants, pyrotechnics, and explosives and are key components in military applications for propulsion and munition systems and in civilian applications such as construction and mining [1]. Heterogenous/composite EMs have complex microstructures which significantly influence--along with chemistry--the property and performance of these materials [2-8]. There is increasing research interest in controlling the microstructure of EM, to engineer their properties and performance for targeted functional specificity [9-10]. EMs are typically solid-solid composites of organic energetic crystals (commonly CHNO compounds), inclusions (i.e., metals, nanoparticles), and plastic binders. The CHNO materials are commonly categorized based on how sensitive they are to an external load/mechanical insult. They can range f rom'insensitive' (such as TATB - based EMs [11]) to'highly sensitive' (PETN-based EMs [12-13]) with others such as HMX, CL-20, and RDX ranging in between [14]. The sensitivity is closely connected with the molecular structure of these species of EMs within the CHNO family. However, when they are formed into propellants and explosives, the sensitivity is also impacted by the physical structure, composition, and formulation of the material mixtures, as reviewed by Handley et al. [1]. In other words, the design of a mixture and its microstructure can define the overall properties and performance characteristics of formed EM, thus opening the possibility of systematic methods to engineer materials by their design.


Sampling-Based Trajectory (re)planning for Differentially Flat Systems: Application to a 3D Gantry Crane

arXiv.org Artificial Intelligence

In this paper, a sampling-based trajectory planning algorithm for a laboratory-scale 3D gantry crane in an environment with static obstacles and subject to bounds on the velocity and acceleration of the gantry crane system is presented. The focus is on developing a fast motion planning algorithm for differentially flat systems, where intermediate results can be stored and reused for further tasks, such as replanning. The proposed approach is based on the informed optimal rapidly exploring random tree algorithm (informed RRT*), which is utilized to build trajectory trees that are reused for replanning when the start and/or target states change. In contrast to state-of-the-art approaches, the proposed motion planning algorithm incorporates a linear quadratic minimum time (LQTM) local planner. Thus, dynamic properties such as time optimality and the smoothness of the trajectory are directly considered in the proposed algorithm. Moreover, by integrating the branch-and-bound method to perform the pruning process on the trajectory tree, the proposed algorithm can eliminate points in the tree that do not contribute to finding better solutions. This helps to curb memory consumption and reduce the computational complexity during motion (re)planning. Simulation results for a validated mathematical model of a 3D gantry crane show the feasibility of the proposed approach.


Common Subexpression-based Compression and Multiplication of Sparse Constant Matrices

arXiv.org Artificial Intelligence

In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied on sparse constant matrices to deploy the models on low-cost embedded devices. However, the state-of-the-art CSE elimination methods do not scale well for handling large matrices. They reach hours for extracting CSEs in a $200 \times 200$ matrix while their matrix multiplication algorithms execute longer than the conventional matrix multiplication methods. Besides, there exist no compression methods for matrices utilizing CSEs. As a remedy to this problem, a random search-based algorithm is proposed in this paper to extract CSEs in the column pairs of a constant matrix. It produces an adder tree for a $1000 \times 1000$ matrix in a minute. To compress the adder tree, this paper presents a compression format by extending the Compressed Sparse Row (CSR) to include CSEs. While compression rates of more than $50\%$ can be achieved compared to the original CSR format, simulations for a single-core embedded system show that the matrix multiplication execution time can be reduced by $20\%$.


Causality-based Counterfactual Explanation for Classification Models

arXiv.org Artificial Intelligence

Counterfactual explanation is one branch of interpretable machine learning that produces a perturbation sample to change the model's original decision. The generated samples can act as a recommendation for end-users to achieve their desired outputs. Most of the current counterfactual explanation approaches are the gradient-based method, which can only optimize the differentiable loss functions with continuous variables. Accordingly, the gradient-free methods are proposed to handle the categorical variables, which however have several major limitations: 1) causal relationships among features are typically ignored when generating the counterfactuals, possibly resulting in impractical guidelines for decision-makers; 2) the counterfactual explanation algorithm requires a great deal of effort into parameter tuning for dertermining the optimal weight for each loss functions which must be conducted repeatedly for different datasets and settings. In this work, to address the above limitations, we propose a prototype-based counterfactual explanation framework (ProCE). ProCE is capable of preserving the causal relationship underlying the features of the counterfactual data. In addition, we design a novel gradient-free optimization based on the multi-objective genetic algorithm that generates the counterfactual explanations for the mixed-type of continuous and categorical features. Numerical experiments demonstrate that our method compares favorably with state-of-the-art methods and therefore is applicable to existing prediction models. All the source codes and data are available at \url{https://github.com/tridungduong16/multiobj-scm-cf}.