AITopics | group number

Collaborating Authors

group number

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Nonlinearity of Layer Normalization

Ni, Yunhao, Guo, Yuxin, Jia, Junlong, Huang, Lei

arXiv.org Artificial IntelligenceJun-3-2024

Layer normalization (LN) is a ubiquitous technique in deep learning but our theoretical understanding to it remains elusive. This paper investigates a new theoretical direction for LN, regarding to its nonlinearity and representation capacity. We investigate the representation capacity of a network with layerwise composition of linear and LN transformations, referred to as LN-Net. We theoretically show that, given $m$ samples with any label assignment, an LN-Net with only 3 neurons in each layer and $O(m)$ LN layers can correctly classify them. We further show the lower bound of the VC dimension of an LN-Net. The nonlinearity of LN can be amplified by group partition, which is also theoretically demonstrated with mild assumption and empirically supported by our experiments. Based on our analyses, we consider to design neural architecture by exploiting and amplifying the nonlinearity of LN, and the effectiveness is supported by our experiments.

ln-net, nonlinearity, ssr, (14 more...)

arXiv.org Artificial Intelligence

2406.01255

Country:

Europe > Austria > Vienna (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning

Yang, Je, Kim, JaeUk, Kim, Joo-Young

arXiv.org Artificial IntelligenceOct-29-2022

Abstract--Multi-agent reinforcement learning (MARL) is a powerful technology to construct interactive artificial intelligent systems in various applications such as multi-robot control and self-driving cars. Unlike supervised model or single-agent reinforcement learning, which actively exploits network pruning, it is obscure that how pruning will work in multi-agent reinforcement learning with its cooperative and interactive characteristics. MARL, which are 7.13 higher and 12.43 more energy efficient Most importantly, the accelerator shows speedup up to 12.52 for MARL requires up to 942.9 GFLOPS for effective realtime In addition, as the MARL system is I. Current CPU and GPU-based systems cannot learning, known for solving long-term decision-making problems meet the above requirements due to the lack of computing effectively. It aims to train the action policy, which is units, high power consumption or low utilization for small about how an agent should take actions based on the feedback batch sizes. Instead, FPGA is emerging as a new solution for from the given environment to maximize cumulative rewards. For example, Recently, deep reinforcement learning (DRL) that utilizes a the Xilinx U280 acceleration card provides robust computing deep neural network (DNN) as an action policy has been proposed potential through 9,024 DSPs over 41MB of on-chip BRAM [1]-[4]. Although DRL stands out in various domains while showing less power consumption than GPU. In addition, such as industrial control and robotics [5]-[7], all of them the reconfigurability of FPGA allows the optimization of are limited to a single agent. Other significant applications irregular data access and parallelism with customized compact have started to employ interaction between multiple agents, for data format, where these hardware overhead occurs in network instance, analysis of language communication and the network pruning to handle computation-bound applications. Hence, extending DRL to have In this paper, we propose a FPGA-based acceleration system many agents is critical for developing intelligent systems named LearningGroup, to yield high performance for where agents can interact with each other or even with people.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2210.16624

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.66)
Information Technology > Robotics & Automation (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Associative Memory Based Experience Replay for Deep Reinforcement Learning

Li, Mengyuan, Kazemi, Arman, Laguna, Ann Franchesca, Hu, X. Sharon

arXiv.org Artificial IntelligenceJul-15-2022

Experience replay is an essential component in deep reinforcement learning (DRL), which stores the experiences and generates experiences for the agent to learn in real time. Recently, prioritized experience replay (PER) has been proven to be powerful and widely deployed in DRL agents. However, implementing PER on traditional CPU or GPU architectures incurs significant latency overhead due to its frequent and irregular memory accesses. This paper proposes a hardware-software co-design approach to design an associative memory (AM) based PER, AMPER, with an AM-friendly priority sampling operation. AMPER replaces the widely-used time-costly tree-traversal-based priority sampling in PER while preserving the learning performance. Further, we design an in-memory computing hardware architecture based on AM to support AMPER by leveraging parallel in-memory search operations. AMPER shows comparable learning performance while achieving 55x to 270x latency improvement when running on the proposed hardware compared to the state-of-the-art PER running on GPU.

amper, opération, priority value, (16 more...)

arXiv.org Artificial Intelligence

2207.07791

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
Europe (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Automatic Mixed-Precision Quantization Search of BERT

Zhao, Changsheng, Hua, Ting, Shen, Yilin, Lou, Qian, Jin, Hongxia

arXiv.org Artificial IntelligenceDec-30-2021

Pre-trained language models such as BERT have shown remarkable effectiveness in various natural language processing tasks. However, these models usually contain millions of parameters, which prevents them from practical deployment on resource-constrained devices. Knowledge distillation, Weight pruning, and Quantization are known to be the main directions in model compression. However, compact models obtained through knowledge distillation may suffer from significant accuracy drop even for a relatively small compression ratio. On the other hand, there are only a few quantization attempts that are specifically designed for natural language processing tasks. They suffer from a small compression ratio or a large error rate since manual setting on hyper-parameters is required and fine-grained subgroup-wise quantization is not supported. In this paper, we proposed an automatic mixed-precision quantization framework designed for BERT that can simultaneously conduct quantization and pruning in a subgroup-wise level. Specifically, our proposed method leverages Differentiable Neural Architecture Search to assign scale and precision for parameters in each sub-group automatically, and at the same time pruning out redundant groups of parameters. Extensive evaluations on BERT downstream tasks reveal that our proposed method outperforms baselines by providing the same performance with much smaller model size. We also show the feasibility of obtaining the extremely light-weight model by combining our solution with orthogonal methods such as DistilBERT.

assignment, preprint arxiv, quantization, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.24963/ijcai.2021/472

2112.14938

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Group Whitening: Balancing Learning Efficiency and Representational Capacity

Huang, Lei, Liu, Li, Zhu, Fan, Shao, Ling

arXiv.org Machine LearningSep-28-2020

Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. The merits of BN in improving model's learning efficiency can be further amplified by applying whitening, while its drawbacks in estimating population statistics for inference can be avoided through group normalization (GN). This paper proposes group whitening (GW), which elaborately exploits the advantages of the whitening operation and avoids the disadvantages of normalization within mini-batches. Specifically, GW divides the neurons of a sample into groups for standardization, like GN, and then further decorrelates the groups. In addition, we quantitatively analyze the constraint imposed by normalization, and show how the batch size (group number) affects the performance of batch (group) normalized networks, from the perspective of model's representational capacity. This analysis provides theoretical guidance for applying GW in practice. Finally, we apply the proposed GW to ResNet and ResNeXt architectures and conduct experiments on the ImageNet and COCO benchmarks. Results show that GW consistently improves the performance of different architectures, with absolute gains of $1.02\%$ $\sim$ $1.49\%$ in top-1 accuracy on ImageNet and $1.82\%$ $\sim$ $3.21\%$ in bounding box AP on COCO.

machine learning, natural language, normalization, (21 more...)

arXiv.org Machine Learning

2009.13333

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Self-Tuning Spectral Clustering

Zelnik-manor, Lihi, Perona, Pietro

Neural Information Processing SystemsDec-31-2005

We study a number of open issues in spectral clustering: (i) Selecting the appropriate scale of analysis, (ii) Handling multi-scale data, (iii) Clustering with irregular background clutter, and, (iv) Finding automatically the number of groups. We first propose that a'local' scale should be used to compute the affinity between each pair of points. This local scaling leads to better clustering especially when the data includes multiple scales and when the clusters are placed within a cluttered background. We further suggest exploiting the structure of the eigenvectors to infer automatically the number of groups. This leads to a new algorithm in which the final randomly initialized k-means stage is eliminated.

eigenvalue, eigenvector, group number, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Self-Tuning Spectral Clustering

Zelnik-manor, Lihi, Perona, Pietro

Neural Information Processing SystemsDec-31-2005

eigenvalue, eigenvector, group number, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Self-Tuning Spectral Clustering

Zelnik-manor, Lihi, Perona, Pietro

Neural Information Processing SystemsDec-31-2005

We study a number of open issues in spectral clustering: (i) Selecting the appropriate scale of analysis, (ii) Handling multi-scale data, (iii) Clustering withirregular background clutter, and, (iv) Finding automatically the number of groups. We first propose that a'local' scale should be used to compute the affinity between each pair of points. This local scaling leads to better clustering especially when the data includes multiple scales and when the clusters are placed within a cluttered background. We further suggest exploiting the structure of the eigenvectors to infer automatically the number of groups. This leads to a new algorithm in which the final randomly initialized k-means stage is eliminated.

eigenvalue, eigenvector, group number, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback