Lin, Peng
Local Search for Integer Quadratic Programming
He, Xiang, Lin, Peng, Cai, Shaowei
Integer Quadratic Programming (IQP) is an important problem in operations research. Local search is a powerful method for solving hard problems, but the research on local search algorithms for IQP solving is still on its early stage. This paper develops an efficient local search solver for solving general IQP, called LS-IQCQP. We propose four new local search operators for IQP that can handle quadratic terms in the objective function, constraints or both. Furthermore, a two-mode local search algorithm is introduced, utilizing newly designed scoring functions to enhance the search process. Experiments are conducted on standard IQP benchmarks QPLIB and MINLPLIB, comparing LS-IQCQP with several state-of-the-art IQP solvers. Experimental results demonstrate that LS-IQCQP is competitive with the most powerful commercial solver Gurobi and outperforms other state-of-the-art solvers. Moreover, LS-IQCQP has established 6 new records for QPLIB and MINLPLIB open instances.
Dynamic neural network with memristive CIM and CAM for 2D and 3D vision
Zhang, Yue, Zhang, Woyu, Wang, Shaocong, Lin, Ning, Yu, Yifei, He, Yangu, Wang, Bo, Jiang, Hao, Lin, Peng, Xu, Xiaoxin, Qi, Xiaojuan, Wang, Zhongrui, Zhang, Xumeng, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming
The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network (DNN) using memristor. The network associates incoming data with the past experience stored as semantic vectors. The network and the semantic memory are physically implemented on noise-robust ternary memristor-based Computing-In-Memory (CIM) and Content-Addressable Memory (CAM) circuits, respectively. We validate our co-designs, using a 40nm memristor macro, on ResNet and PointNet++ for classifying images and 3D points from the MNIST and ModelNet datasets, which not only achieves accuracy on par with software but also a 48.1% and 15.9% reduction in computational budget. Moreover, it delivers a 77.6% and 93.3% reduction in energy consumption.
Stacking Factorizing Partitioned Expressions in Hybrid Bayesian Network Models
Lin, Peng, Neil, Martin, Fenton, Norman
Hybrid Bayesian networks (HBN) contain complex conditional probabilistic distributions (CPD) specified as partitioned expressions over discrete and continuous variables. The size of these CPDs grows exponentially with the number of parent nodes when using discrete inference, resulting in significant inefficiency. Normally, an effective way to reduce the CPD size is to use a binary factorization (BF) algorithm to decompose the statistical or arithmetic functions in the CPD by factorizing the number of connected parent nodes to sets of size two. However, the BF algorithm was not designed to handle partitioned expressions. Hence, we propose a new algorithm called stacking factorization (SF) to decompose the partitioned expressions. The SF algorithm creates intermediate nodes to incrementally reconstruct the densities in the original partitioned expression, allowing no more than two continuous parent nodes to be connected to each child node in the resulting HBN. SF can be either used independently or combined with the BF algorithm. We show that the SF+BF algorithm significantly reduces the CPD size and contributes to lowering the tree-width of a model, thus improving efficiency.
Random resistive memory-based deep extreme point learning machine for unified visual processing
Wang, Shaocong, Gao, Yizhao, Li, Yi, Zhang, Woyu, Yu, Yifei, Wang, Bo, Lin, Ning, Chen, Hegan, Zhang, Yue, Jiang, Yang, Wang, Dingchen, Chen, Jia, Dai, Peng, Jiang, Hao, Lin, Peng, Zhang, Xumeng, Qi, Xiaojuan, Xu, Xiaoxin, So, Hayden, Wang, Zhongrui, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming
Visual sensors, including 3D LiDAR, neuromorphic DVS sensors, and conventional frame cameras, are increasingly integrated into edge-side intelligent machines. Realizing intensive multi-sensory data analysis directly on edge intelligent machines is crucial for numerous emerging edge applications, such as augmented and virtual reality and unmanned aerial vehicles, which necessitates unified data representation, unprecedented hardware energy efficiency and rapid model training. However, multi-sensory data are intrinsically heterogeneous, causing significant complexity in the system development for edge-side intelligent machines. In addition, the performance of conventional digital hardware is limited by the physically separated processing and memory units, known as the von Neumann bottleneck, and the physical limit of transistor scaling, which contributes to the slowdown of Moore's law. These limitations are further intensified by the tedious training of models with ever-increasing sizes. We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM), that offers efficient unified point set analysis. We show the system's versatility across various data modalities and two different learning tasks. Compared to a conventional digital hardware-based system, our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems. Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.
Pruning random resistive memory for optimizing analogue AI
Li, Yi, Wang, Songqi, Zhao, Yaping, Wang, Shaocong, Zhang, Woyu, He, Yangu, Lin, Ning, Cui, Binbin, Chen, Xi, Zhang, Shiming, Jiang, Hao, Lin, Peng, Zhang, Xumeng, Qi, Xiaojuan, Wang, Zhongrui, Xu, Xiaoxin, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming
The rapid advancement of artificial intelligence (AI) has been marked by the large language models exhibiting human-like intelligence. However, these models also present unprecedented challenges to energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing and exploits emerging analogue electronic devices, such as resistive memory, which features in-memory computing, high scalability, and nonvolatility. However, analogue computing still faces the same challenges as before: programming nonidealities and expensive programming due to the underlying devices physics. Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network. Software-wise, the topology of a randomly weighted neural network is optimized by pruning connections rather than precisely tuning resistive memory weights. Hardware-wise, we reveal the physical origin of the programming stochasticity using transmission electron microscopy, which is leveraged for large-scale and low-cost implementation of an overparameterized random neural network containing high-performance sub-networks. We implemented the co-design on a 40nm 256K resistive memory macro, observing 17.3% and 19.9% accuracy improvements in image and audio classification on FashionMNIST and Spoken digits datasets, as well as 9.8% (2%) improvement in PR (ROC) in image segmentation on DRIVE datasets, respectively. This is accompanied by 82.1%, 51.2%, and 99.8% improvement in energy efficiency thanks to analogue in-memory computing. By embracing the intrinsic stochasticity and in-memory computing, this work may solve the biggest obstacle of analogue computing systems and thus unleash their immense potential for next-generation AI hardware.
New Characterizations and Efficient Local Search for General Integer Linear Programming
Lin, Peng, Cai, Shaowei, Zou, Mengchuan, Lin, Jinkun
Integer linear programming (ILP) models a wide range of practical combinatorial optimization problems and has significant impacts in industry and management sectors. This work proposes new characterizations of ILP with the concept of boundary solutions. Motivated by the new characterizations, we develop an efficient local search solver, which is the first local search solver for general ILP validated on a large heterogeneous problem dataset. We propose a new local search framework that switches between three modes, namely Search, Improve, and Restore modes. We design tailored operators adapted to different modes, thus improving the quality of the current solution according to different situations. For the Search and Restore modes, we propose an operator named tight move, which adaptively modifies variables' values, trying to make some constraint tight. For the Improve mode, an efficient operator lift move is proposed to improve the quality of the objective function while maintaining feasibility. Putting these together, we develop a local search solver for integer linear programming called Local-ILP. Experiments conducted on the MIPLIB dataset show the effectiveness of our solver in solving large-scale hard integer linear programming problems within a reasonably short time. Local-ILP is competitive and complementary to the state-of-the-art commercial solver Gurobi and significantly outperforms the state-of-the-art non-commercial solver SCIP. Moreover, our solver establishes new records for 6 MIPLIB open instances. The theoretical analysis of our algorithm is also presented, which shows our algorithm could avoid visiting unnecessary regions and also maintain good connectivity of targeted solutions.
FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases
Fu, Chong, Zhang, Xuhong, Ji, Shouling, Wang, Ting, Lin, Peng, Feng, Yanghe, Yin, Jianwei
Trojan attack on deep neural networks, also known as backdoor attack, is a typical threat to artificial intelligence. A trojaned neural network behaves normally with clean inputs. However, if the input contains a particular trigger, the trojaned model will have attacker-chosen abnormal behavior. Although many backdoor detection methods exist, most of them assume that the defender has access to a set of clean validation samples or samples with the trigger, which may not hold in some crucial real-world cases, e.g., the case where the defender is the maintainer of model-sharing platforms. Thus, in this paper, we propose FreeEagle, the first data-free backdoor detection method that can effectively detect complex backdoor attacks on deep neural networks, without relying on the access to any clean samples or samples with the trigger. The evaluation results on diverse datasets and model architectures show that FreeEagle is effective against various complex backdoor attacks, even outperforming some state-of-the-art non-data-free backdoor detection methods.
Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction
Lin, Peng ( Capital University of Economics and Business) | Neil, Martin | Fenton, Norman
Performing efficient inference on high dimensional discrete Bayesian Networks (BNs) is challenging. When using exact inference methods the space complexity can grow exponentially with the tree-width, thus making computation intractable. This paper presents a general purpose approximate inference algorithm, based on a new region belief approximation method, called Triplet Region Construction (TRC). TRC reduces the cluster space complexity for factorized models from worst-case exponential to polynomial by performing graph factorization and producing clusters of limited size. Unlike previous generations of region-based algorithms, TRC is guaranteed to converge and effectively addresses the region choice problem that bedevils other region-based algorithms used for BN inference. Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms.
Infinite Hidden Semi-Markov Modulated Interaction Point Process
zhang, matt, Lin, Peng, Lin, Peng, Guo, Ting, Wang, Yang, Wang, Yang, Chen, Fang
The correlation between events is ubiquitous and important for temporal events modelling. In many cases, the correlation exists between not only events' emitted observations, but also their arrival times. State space models (e.g., hidden Markov model) and stochastic interaction point process models (e.g., Hawkes process) have been studied extensively yet separately for the two types of correlations in the past. In this paper, we propose a Bayesian nonparametric approach that considers both types of correlations via unifying and generalizing hidden semi-Markov model and interaction point process model. The proposed approach can simultaneously model both the observations and arrival times of temporal events, and determine the number of latent states from data. A Metropolis-within-particle-Gibbs sampler with ancestor resampling is developed for efficient posterior inference. The approach is tested on both synthetic and real-world data with promising outcomes.
Interaction Point Processes via Infinite Branching Model
Lin, Peng (NICTA and the University of New South Wales) | Zhang, Bang (NICTA ) | Guo, Ting (NICTA) | Wang, Yang (NICTA) | Chen, Fang (NICTA)
Many natural and social phenomena can be modeled by interaction point processes (IPPs) (Diggle et al. 1994), stochastic point processes considering the interaction between points. In this paper, we propose the infinite branching model (IBM), a Bayesian statistical model that can generalize and extend some popular IPPs, e.g., Hawkes process (Hawkes 1971; Hawkes and Oakes 1974). It treats IPP as a mixture of basis point processes with the aid of a distance dependent prior over branching structure that describes the relationship between points. The IBM can estimate point event intensity, interaction mechanism and branching structure simultaneously. A generic Metropolis-within-Gibbs sampling method is also developed for model parameter inference. The experiments on synthetic and real-world data demonstrate the superiority of the IBM.