Collaborating Authors

Zhong, Kai


APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly enhanced Information Retrieval (IR) across various modules, such as reranking. Despite impressive performance, current zero-shot relevance ranking with LLMs heavily relies on human prompt engineering. Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly reranking, underexplored. Directly applying current prompt engineering algorithms to relevance ranking is challenging due to the integration of query and long passage pairs in the input, where the ranking complexity surpasses classification tasks. To reduce human effort and unlock the potential of prompt optimization in reranking, we introduce a novel automatic prompt engineering algorithm named APEER. APEER iteratively generates refined prompts through feedback and preference optimization. Extensive experiments with four LLMs and ten datasets demonstrate the substantial performance improvement of APEER over existing state-of-the-art (SoTA) manual prompts. Furthermore, we find that the prompts generated by APEER exhibit better transferability across diverse tasks and LLMs. Code is available at https://github.com/jincan333/APEER.
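As a concrete illustration of the feedback-and-refinement idea, here is a minimal sketch of an APEER-style loop; the helper names (`llm`, `score_prompt`) and the exact feedback wording are hypothetical stand-ins, not the authors' implementation.

```python
# Hypothetical sketch of iterative prompt refinement with feedback and a
# simple best/worst preference contrast; `llm` and `score_prompt` are
# illustrative stand-ins (score_prompt might compute nDCG@10 on a dev set).
from typing import Callable, List, Tuple

def refine_prompt(seed_prompt: str,
                  llm: Callable[[str], str],
                  score_prompt: Callable[[str], float],
                  n_iters: int = 5) -> str:
    pool: List[Tuple[str, float]] = [(seed_prompt, score_prompt(seed_prompt))]
    for _ in range(n_iters):
        best = max(pool, key=lambda p: p[1])
        worst = min(pool, key=lambda p: p[1])
        # Feedback step: critique the current best reranking prompt.
        feedback = llm(f"Critique this passage-reranking prompt:\n{best[0]}")
        # Preference step: rewrite using the critique and a good/bad contrast.
        candidate = llm(
            "Improve the reranking prompt below.\n"
            f"Prompt: {best[0]}\nCritique: {feedback}\n"
            f"A weaker prompt, for contrast: {worst[0]}"
        )
        pool.append((candidate, score_prompt(candidate)))
    return max(pool, key=lambda p: p[1])[0]
```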


FOSS: A Self-Learned Doctor for Query Optimizer

arXiv.org Artificial Intelligence

Various works have utilized deep reinforcement learning (DRL) to address the query optimization problem in database systems. They either learn to construct plans from scratch in a bottom-up manner or guide the plan-generation behavior of a traditional optimizer using hints. While these methods have achieved some success, they suffer from either low training efficiency or a limited plan search space. To address these challenges, we introduce FOSS, a novel DRL-based framework for query optimization. FOSS initiates optimization from the original plan generated by a traditional optimizer and incrementally refines suboptimal nodes of the plan through a sequence of actions. Additionally, we devise an asymmetric advantage model to evaluate the relative advantage between two plans, and we integrate it with a traditional optimizer to form a simulated environment. Leveraging this simulated environment, FOSS can bootstrap itself to rapidly generate a large number of high-quality simulated experiences, from which it learns and improves its optimization capability. We evaluate the performance of FOSS on the Join Order Benchmark, TPC-DS, and Stack Overflow. The experimental results demonstrate that FOSS outperforms state-of-the-art methods in both latency and optimization time. Compared to PostgreSQL, FOSS achieves savings ranging from 15% to 83% in total latency across different benchmarks.
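For intuition, the following is a minimal sketch of the plan-refinement loop described above; `candidate_actions` and `advantage` are hypothetical interfaces standing in for FOSS's action space and asymmetric advantage model.

```python
# Illustrative refinement loop: start from the traditional optimizer's plan
# and greedily apply single-node edits while the advantage model predicts
# an improvement. (Sketch only; FOSS learns this policy with DRL.)
def refine_plan(plan, candidate_actions, advantage, max_steps=10):
    for _ in range(max_steps):
        # e.g., join-order or physical-operator rewrites of suboptimal nodes
        candidates = candidate_actions(plan)
        if not candidates:
            break
        best = max(candidates, key=lambda p: advantage(p, plan))
        if advantage(best, plan) <= 0:  # no candidate is predicted to beat the plan
            break
        plan = best
    return plan
```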


BoolNet: Minimizing The Energy Consumption of Binary Neural Networks

arXiv.org Artificial Intelligence

Recent works on Binary Neural Networks (BNNs) have made promising progress in narrowing the accuracy gap between BNNs and their 32-bit counterparts. However, the accuracy gains are often based on specialized model designs using additional 32-bit components. Furthermore, almost all previous BNNs use 32-bit precision for the feature maps and the shortcuts enclosing the corresponding binary convolution blocks, which helps to maintain accuracy effectively but is not friendly to hardware accelerators with limited memory, energy, and computing resources. Thus, we raise the following question: how can accuracy and energy consumption be balanced in a BNN design? We extensively study this fundamental problem and propose a novel BNN architecture without most of the commonly used 32-bit components: \textit{BoolNet}. Experimental results on ImageNet demonstrate that BoolNet achieves a 4.6x energy reduction coupled with 1.2\% higher accuracy than the commonly used BNN architecture Bi-RealNet. Code and trained models are available at: https://github.com/hpi-xnor/BoolNet.
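To make the 1-bit building block concrete, below is a short PyTorch sketch of standard BNN binarization (a sign activation trained with a straight-through estimator); this illustrates the general technique, not BoolNet's specific block design.

```python
# Binarize activations/weights to {-1, +1} in the forward pass; in the
# backward pass, use a straight-through estimator that clips gradients
# where |x| > 1. (Generic BNN sketch, not BoolNet's exact architecture.)
import torch
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # note: sign(0) = 0; real BNNs usually map 0 to +1

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

def binary_conv2d(x, weight, stride=1, padding=1):
    """1-bit convolution: binarize both activations and weights, then convolve."""
    return F.conv2d(BinarizeSTE.apply(x), BinarizeSTE.apply(weight),
                    stride=stride, padding=padding)
```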


Machine Learning for Electronic Design Automation: A Survey

arXiv.org Artificial Intelligence

In recent years, with the development of semiconductor technology, the scale of integrated circuits (ICs) has grown exponentially, challenging the scalability and reliability of the circuit design flow. EDA algorithms and software are therefore required to be more effective and efficient in dealing with extremely large search spaces at low latency. Machine learning (ML) now plays an important role in many areas and has been widely applied in diverse scenarios. ML methods, including both traditional and deep learning algorithms, achieve impressive performance in solving classification, detection, and design space exploration problems. Additionally, ML methods show great potential to generate high-quality solutions for many NP-complete (NPC) problems, which are common in the EDA field, whereas traditional methods consume enormous time and resources to solve them. Traditional methods usually solve every problem from scratch, without accumulating knowledge; ML algorithms instead focus on extracting high-level features or patterns that can be reused in related or similar situations, avoiding repeated complicated analysis. Applying machine learning methods is therefore a promising direction for accelerating the solution of EDA problems.


Towards Lower Bit Multiplication for Convolutional Neural Network Training

arXiv.org Machine Learning

Convolutional Neural Networks (CNNs) have been widely used in many fields. However, the training process consumes substantial energy and time, most of which is spent on convolution operations. In this paper, we propose a fixed-point training framework to reduce the data bit-width for convolution multiplications. Firstly, we propose two constrained group-wise scaling methods that can be implemented at low hardware cost. Secondly, to overcome the challenge of trading off overflow against rounding error, a shiftable fixed-point data format is used in this framework. Finally, we propose a double-width deployment technique to boost inference performance with the same bit-width hardware multiplier. The experimental results show that the input data of convolution in the training process can be quantized to 2 bits for the CIFAR-10 dataset and 6 bits for the ImageNet dataset, with negligible accuracy degradation. Furthermore, our fixed-point training framework has the potential to save at least 75% of the computation energy in the training process.
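As a rough illustration of group-wise scaling with shift-friendly scales, here is a NumPy sketch; the grouping and the power-of-two scale choice are simplifying assumptions, not the paper's exact format.

```python
# Quantize each group (e.g., each output channel) to signed fixed point with a
# per-group power-of-two scale, so rescaling is a cheap bit shift in hardware.
# (Illustrative sketch; the paper's constrained scaling and shiftable format differ.)
import numpy as np

def quantize_groupwise(x: np.ndarray, bits: int, group_axis: int = 0) -> np.ndarray:
    qmax = 2 ** (bits - 1) - 1
    reduce_axes = tuple(i for i in range(x.ndim) if i != group_axis)
    absmax = np.max(np.abs(x), axis=reduce_axes, keepdims=True)
    shift = np.ceil(np.log2(absmax / qmax + 1e-12))  # power-of-two exponent
    scale = 2.0 ** shift
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values used in the convolution
```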


Mixed Linear Regression with Multiple Components

Neural Information Processing Systems

In this paper, we study the mixed linear regression (MLR) problem, where the goal is to recover multiple underlying linear models from their unlabeled linear measurements. We propose a non-convex objective function which we show is {\em locally strongly convex} in the neighborhood of the ground truth. We use a tensor method for initialization so that the initial models lie in the region of local strong convexity, and we then employ general convex optimization algorithms to minimize the objective function. To the best of our knowledge, our approach provides the first exact recovery guarantees for the MLR problem with $K \geq 2$ components.
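One natural non-convex objective with the properties described above couples each sample to all $K$ components through a product of squared residuals, so the loss vanishes whenever some component explains the sample; the display below is an illustration consistent with the abstract, not necessarily the paper's exact formulation.

```latex
% Product-form MLR objective: sample i incurs zero loss if any one of the
% K linear models fits it exactly (illustrative form).
\[
  \min_{w_1,\dots,w_K}\;
  f(w_1,\dots,w_K) \;=\; \sum_{i=1}^{n} \prod_{k=1}^{K}
  \bigl(y_i - \langle w_k, x_i \rangle\bigr)^2 .
\]
```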


Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators

Neural Information Processing Systems

We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute. A particular instance of interest is the L1-regularized MLE for learning Conditional Random Fields (CRFs), a popular class of statistical models for various structured prediction problems such as sequence labeling, alignment, and classification with a label taxonomy. L1-regularized MLEs for CRFs are particularly expensive to optimize, since computing the gradient values requires an expensive inference step. In this work, we propose a carefully constructed proximal quasi-Newton algorithm for such computationally intensive M-estimation problems, in which we employ an aggressive active-set selection technique. As a key contribution, we show that our proximal quasi-Newton algorithm is provably super-linearly convergent even in the absence of strong convexity, by leveraging a restricted variant of strong convexity.
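Two standard ingredients of such methods are the L1 proximal (soft-thresholding) operator and an active-set rule that restricts the expensive inner solve to a few coordinates; the NumPy sketch below shows generic versions of both, not the paper's exact algorithm.

```python
# Generic building blocks for proximal methods on L1-regularized objectives.
import numpy as np

def soft_threshold(z: np.ndarray, tau: float) -> np.ndarray:
    """Proximal operator of tau * ||.||_1: shrink each coordinate toward zero."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def active_set(w: np.ndarray, grad: np.ndarray, lam: float) -> np.ndarray:
    """Coordinates that are nonzero or violate the L1 optimality condition;
    only these need to enter the inner quadratic subproblem."""
    return np.flatnonzero((w != 0) | (np.abs(grad) > lam))
```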


A Modular Deep Learning Approach for Extreme Multi-label Text Classification

arXiv.org Machine Learning

Extreme multi-label classification (XMC) aims to assign to an instance the most relevant subset of labels from a colossal label set. Because modern applications lead to massive label sets, the scalability of XMC has attracted much recent attention from both academia and industry. In this paper, we establish a three-stage framework to solve XMC efficiently, which includes 1) indexing the labels, 2) matching the instance to the relevant indices, and 3) ranking the labels from the relevant indices. This framework unifies many existing XMC approaches. Based on it, we propose a modular deep learning approach, SLINMER: Semantic Label Indexing, Neural Matching, and Efficient Ranking. The label indexing stage of SLINMER can adopt different semantic label representations, leading to different configurations of SLINMER. Empirically, we demonstrate that several individual configurations of SLINMER outperform state-of-the-art XMC approaches on several benchmark datasets, and that ensembling those configurations achieves even better results. In particular, on a Wiki dataset with around 0.5 million labels, precision@1 increases from 61% to 67%.
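The three-stage flow can be summarized in a few lines; `cluster_of`, `match_model`, and `rank_model` below are hypothetical stand-ins for the indexing, matching, and ranking components.

```python
# Toy indexing -> matching -> ranking pipeline for extreme multi-label
# classification (illustrative; SLINMER's actual components are learned models).
import numpy as np

def xmc_predict(x, cluster_of, match_model, rank_model,
                top_clusters=5, top_labels=10):
    cluster_scores = match_model(x)                  # stage 2: score label clusters
    shortlist = set(np.argsort(-cluster_scores)[:top_clusters])
    # Stage 1 happened offline: cluster_of[label] -> semantic cluster index.
    candidates = [lab for lab in range(len(cluster_of))
                  if cluster_of[lab] in shortlist]
    label_scores = rank_model(x, candidates)         # stage 3: rank shortlist only
    order = np.argsort(-label_scores)[:top_labels]
    return [candidates[i] for i in order]
```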


MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization

Neural Information Processing Systems

We consider a generalization of mixed regression where the response is an additive combination of several mixture components. Standard mixed regression is the special case where each response is generated from exactly one component. Typical approaches to the mixture regression problem employ local search methods such as Expectation Maximization (EM) that are prone to spurious local optima. On the other hand, a number of recent theoretically motivated \emph{tensor-based methods} either have high sample complexity or require knowledge of the input distribution, which is not available in most practical situations. In this work, we study a novel convex estimator, \emph{MixLasso}, for estimating generalized mixed regression, based on an atomic norm specifically constructed to regularize the number of mixture components. Our estimator admits a risk bound that trades off prediction accuracy against model sparsity without imposing stringent assumptions on the input/output distribution, and it can be easily adapted to the case of non-linear functions. In numerical experiments on mixtures of linear as well as nonlinear regressions, the proposed method yields high-quality solutions in a wider range of settings than existing approaches.
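Schematically, an atomic-norm-regularized estimator of this kind has the following shape, where the atoms correspond to single mixture components so the norm penalizes how many are used; the display below is the general template, not the paper's precise construction.

```latex
% Atomic-norm-regularized least squares over an atomic set A whose elements
% represent single mixture components (schematic form).
\[
  \min_{F}\; \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - F(x_i)\bigr)^2
  + \lambda\,\|F\|_{\mathcal{A}},
  \qquad
  \|F\|_{\mathcal{A}} := \inf\Bigl\{\textstyle\sum_j c_j \;:\;
  F=\sum_j c_j a_j,\ a_j\in\mathcal{A},\ c_j \ge 0\Bigr\}.
\]
```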