AITopics | bitvector

bitvector

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Firas Abuzaid, Joseph K. Bradley, Feynman T. Liang, Andrew Feng, Lee Yang, Matei Zaharia, Ameet S. Talwalkar

Neural Information Processing Systems | Nov-21-2025, 08:26:21 GMT

Neural Information Processing Systems http://nips.cc/

communication cost, yggdrasil, node, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

SBVR: Summation of BitVector Representation for Efficient LLM Quantization

Bang, Wonjun, Park, Jongseok, Yu, Hongseung, Bin, Kyungmin, Lee, Kyunghan

arXiv.org Artificial Intelligence | Sep-24-2025

With the advent of large language models (LLMs), numerous Post-Training Quantization (PTQ) strategies have been proposed to alleviate deployment barriers created by their enormous parameter counts. Quantization achieves compression by limiting the number of representable points in the data. Therefore, the key to achieving efficient quantization is selecting the optimal combination of representation points, or codes, for the given data. Existing PTQ solutions adopt two major approaches to this problem: Round-To-Nearest (RTN)-based methods and codebook-based methods. RTN-based methods map LLM weights onto uniformly distributed integer grids, failing to account for the Gaussian-like distribution of LLM weights. Codebook-based methods mitigate this issue by constructing distribution-aware codebooks; however, they suffer from random and strided memory access patterns, resulting in degraded inference speed that is exacerbated by the limited size of GPU L1 cache. To overcome these limitations, we propose a novel LLM quantization method, SBVR (Summation of BitVector Representation), that enables Gaussian-like code representation in a hardware-friendly manner for fast inference. SBVR maps weight values to non-uniform representation points whose distribution follows the actual distribution of LLM weights, enabling more accurate compression. Additionally, we design a custom CUDA kernel that allows matrix-vector multiplication directly in the SBVR format without decompression, thereby enabling high-performance execution of SBVR-compressed models. Our evaluations of SBVR on various models demonstrate state-of-the-art perplexity and accuracy benchmark performance while delivering a 2.21x to 3.04x end-to-end token-generation speedup over naive FP16 models in the 4-bit quantization regime.
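The contrast the abstract draws between uniform RTN grids and distribution-aware codebooks can be illustrated with a small sketch. This is not the SBVR format or its CUDA kernel, whose details the abstract does not give; it only compares a uniform 4-bit grid against a codebook refined with Lloyd's algorithm (one-dimensional k-means), which is an assumed stand-in for "distribution-aware". The helper names and the synthetic Gaussian weights are hypothetical.

```python
import random


def uniform_grid(weights, bits=4):
    """Uniformly spaced codes spanning [min, max], as in an RTN-style grid."""
    lo, hi = min(weights), max(weights)
    n = 1 << bits
    step = (hi - lo) / (n - 1)  # assumes the weights are not all identical
    return [lo + i * step for i in range(n)]


def quantize(weights, codes):
    """Replace every weight with its nearest code."""
    return [min(codes, key=lambda c: abs(c - w)) for w in weights]


def lloyd_codebook(weights, codes, iters=10):
    """Refine a codebook with Lloyd's algorithm (assumed stand-in, not SBVR)."""
    codes = list(codes)
    for _ in range(iters):
        sums = [0.0] * len(codes)
        counts = [0] * len(codes)
        for w in weights:
            j = min(range(len(codes)), key=lambda i: abs(codes[i] - w))
            sums[j] += w
            counts[j] += 1
        # Move each code to the mean of its cluster; keep empty clusters as-is.
        codes = [s / c if c else old for s, c, old in zip(sums, counts, codes)]
    return codes


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # synthetic "Gaussian-like" weights

grid = uniform_grid(weights)
rtn_err = mse(weights, quantize(weights, grid))
codebook = lloyd_codebook(weights, grid)
cb_err = mse(weights, quantize(weights, codebook))
print(f"uniform grid MSE: {rtn_err:.4f}  refined codebook MSE: {cb_err:.4f}")
```

Because the refinement starts from the uniform grid and each Lloyd step cannot increase the squared error, the refined codebook is never worse than the grid on this data. The arbitrary code values are also what forces a table lookup at inference time, which is the memory-access cost the abstract attributes to codebook methods.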

large language model, machine learning, quantization, (18 more...)

arXiv.org Artificial Intelligence

2509.18172

Country:

North America > United States > Minnesota (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Synthetic Programming Elicitation and Repair for Text-to-Code in Very Low-Resource Programming Languages

Mora, Federico, Wong, Justin, Lepe, Haley, Bhatia, Sahil, Elmaaroufi, Karim, Varghese, George, Gonzalez, Joseph E., Polgreen, Elizabeth, Seshia, Sanjit A.

arXiv.org Artificial Intelligence | Jun-29-2024

Recent advances in large language models (LLMs) for code applications have demonstrated remarkable zero-shot fluency and instruction following on challenging code-related tasks ranging from test case generation to self-repair. Unsurprisingly, however, models struggle to compose syntactically valid programs in programming languages unrepresented in pre-training, referred to as very low-resource Programming Languages (VLPLs). VLPLs appear in crucial settings, including domain-specific languages for internal tools and tool-chains for legacy languages. Inspired by an HCI technique called natural program elicitation, we propose designing an intermediate language that LLMs "naturally" know how to use and which can be automatically compiled to a target VLPL. When LLMs generate code that lies outside of this intermediate language, we use compiler techniques to repair the code into programs in the intermediate language. Overall, we introduce synthetic programming elicitation and compilation (SPEAC), an approach that enables LLMs to generate syntactically valid code even for VLPLs. We empirically evaluate the performance of SPEAC in a case study and find that, compared to existing retrieval and fine-tuning baselines, SPEAC produces syntactically correct programs significantly more frequently without sacrificing semantic correctness.

arxiv preprint arxiv, llm, programming language, (15 more...)

arXiv.org Artificial Intelligence

2406.03636

Country:

Oceania > Australia (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(5 more...)

Genre:

Research Report (0.82)
Workflow (0.68)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Neural Information Processing Systems | Mar-12-2024, 15:28:30 GMT

Deep distributed decision trees and tree ensembles have grown in importance due to the need to model increasingly large datasets.

artificial intelligence, yggdrasil, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning

Yang, Je, Kim, JaeUk, Kim, Joo-Young

arXiv.org Artificial Intelligence | Oct-29-2022

Abstract: Multi-agent reinforcement learning (MARL) is a powerful technology to construct interactive artificial intelligent systems in various applications such as multi-robot control and self-driving cars. Unlike supervised models or single-agent reinforcement learning, which actively exploit network pruning, it is unclear how pruning will work in multi-agent reinforcement learning with its cooperative and interactive characteristics. [...] MARL, which are 7.13 higher and 12.43 more energy efficient [...] Most importantly, the accelerator shows speedup up to 12.52 for [...]

[...] learning, known for solving long-term decision-making problems effectively. It aims to train the action policy, which is about how an agent should take actions based on the feedback from the given environment to maximize cumulative rewards. Recently, deep reinforcement learning (DRL) that utilizes a deep neural network (DNN) as an action policy has been proposed [1]-[4]. Although DRL stands out in various domains such as industrial control and robotics [5]-[7], all of them are limited to a single agent. Other significant applications have started to employ interaction between multiple agents, for instance, analysis of language communication and the network [...] Hence, extending DRL to have many agents is critical for developing intelligent systems where agents can interact with each other or even with people.

MARL requires up to 942.9 GFLOPS for effective real-time [...] In addition, as the MARL system is [...] Current CPU and GPU-based systems cannot meet the above requirements due to the lack of computing units, high power consumption, or low utilization for small batch sizes. Instead, FPGA is emerging as a new solution for [...] For example, the Xilinx U280 acceleration card provides robust computing potential through 9,024 DSPs over 41MB of on-chip BRAM while showing less power consumption than GPU. In addition, the reconfigurability of FPGA allows the optimization of irregular data access and parallelism with customized compact data format, where these hardware overhead occurs in network pruning to handle computation-bound applications. In this paper, we propose an FPGA-based acceleration system named LearningGroup, to yield high performance for [...]

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2210.16624

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.66)
Information Technology > Robotics & Automation (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rethink Decision Tree Traversal

Zhang, Jinxiong

arXiv.org Artificial Intelligence | Oct-6-2022

QuickScorer [12] and RapidScorer [21] are based on bit-vectors of the false nodes and were proposed to speed up additive ensembles of regression trees in learning to rank. Inspired by [12], more works, such as [2; 11; 13; 15], focus on the application and acceleration of additive tree models, while we pay attention to the theory of algorithms, especially the representation of binary decision trees in the language of matrix computation. Based on the so-called Tree Supervision Loss, a hierarchical classifier is built from the weights of the softmax layer in convolutional neural networks in [18]. In [20; 19], tree regularization is used to enhance the interpretability of deep neural networks. A generalized tree representation termed TART, based on a transition matrix, is shown in [22].
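The "bit-vectors of the false nodes" idea can be sketched for a single tree as follows. This is a minimal illustration in the spirit of QuickScorer, not the paper's matrix formulation; the tree shape, thresholds, and leaf values are hypothetical. Leaves are numbered left to right, and each internal node carries a mask that clears the leaves of its left subtree. Every node is tested without following any path, the masks of all nodes whose test is false are ANDed together, and the leftmost surviving leaf is the exit leaf.

```python
# Hypothetical depth-2 tree with four leaves; bit i of a mask stands for leaf i.
# A node test is "x[feature] <= threshold"; when it is false, the leaves in the
# node's left subtree become unreachable, so its mask has zeros at those leaves.
NODES = [
    (0, 5.0, 0b1100),  # root: false removes leaves 0 and 1
    (1, 2.0, 0b1110),  # left child: false removes leaf 0
    (1, 7.0, 0b1011),  # right child: false removes leaf 2
]
LEAF_VALUES = [10.0, 20.0, 30.0, 40.0]


def score_bitvector(x):
    """Evaluate every node test and AND together the masks of the false nodes."""
    alive = (1 << len(LEAF_VALUES)) - 1
    for feature, threshold, mask in NODES:
        if not x[feature] <= threshold:
            alive &= mask
    # The exit leaf is the leftmost surviving leaf, i.e. the lowest set bit.
    leaf = (alive & -alive).bit_length() - 1
    return LEAF_VALUES[leaf]


def score_traversal(x):
    """Ordinary root-to-leaf traversal of the same tree, for comparison."""
    if x[0] <= 5.0:
        return LEAF_VALUES[0] if x[1] <= 2.0 else LEAF_VALUES[1]
    return LEAF_VALUES[2] if x[1] <= 7.0 else LEAF_VALUES[3]


print(score_bitvector((6, 8)), score_traversal((6, 8)))
```

The node loop has no data-dependent branching between nodes, which is what makes the scheme attractive for scoring many trees in an ensemble.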

artificial intelligence, bitvector, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2209.04825

Country: Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Abuzaid, Firas, Bradley, Joseph K., Liang, Feynman T., Feng, Andrew, Yang, Lee, Zaharia, Matei, Talwalkar, Ameet S.

Neural Information Processing Systems | Dec-31-2016

Deep distributed decision trees and tree ensembles have grown in importance due to the need to model increasingly large datasets. However, PLANET, the standard distributed tree learning algorithm implemented in systems such as XGBoost and Spark MLlib, scales poorly as data dimensionality and tree depths grow. We present Yggdrasil, a new distributed tree learning method that outperforms existing methods by up to 24x. Unlike PLANET, Yggdrasil is based on vertical partitioning of the data (i.e., partitioning by feature), along with a set of optimized data structures to reduce the CPU and communication costs of training. Yggdrasil (1) trains directly on compressed data for compressible features and labels; (2) introduces efficient data structures for training on uncompressed data; and (3) minimizes communication between nodes by using sparse bitvectors. Moreover, while PLANET approximates split points through feature binning, Yggdrasil does not require binning, and we analytically characterize the impact of this approximation. We evaluate Yggdrasil against the MNIST 8M dataset and a high-dimensional dataset at Yahoo; for both, Yggdrasil is faster by up to an order of magnitude.
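The combination of vertical partitioning and bitvector communication can be sketched in a single process. This is an assumption-laden illustration, not Yggdrasil's implementation. Each dictionary entry stands in for a worker that owns one feature column, labels are assumed to be available to every worker, the split criterion (weighted Gini impurity on binary labels) is chosen only for brevity, and a plain Python integer is used as a dense bitmask where the paper uses sparse bitvectors. All names and data are hypothetical.

```python
def best_split(column, labels):
    """Exact best threshold for one feature column, with no binning.

    Returns (weighted_gini, threshold); labels are assumed to be 0 or 1.
    """
    n = len(column)
    order = sorted(range(n), key=column.__getitem__)
    total_pos = sum(labels)
    left_pos = 0
    best = (float("inf"), None)
    for k in range(1, n):
        prev, cur = order[k - 1], order[k]
        left_pos += labels[prev]
        if column[prev] == column[cur]:
            continue  # cannot split between equal values
        p_left = left_pos / k
        p_right = (total_pos - left_pos) / (n - k)
        gini_left = 2 * p_left * (1 - p_left)
        gini_right = 2 * p_right * (1 - p_right)
        score = (k * gini_left + (n - k) * gini_right) / n
        if score < best[0]:
            best = (score, (column[prev] + column[cur]) / 2)
    return best


def split_bitvector(column, threshold):
    """One bit per row: bit r is set when row r goes to the left child."""
    bits = 0
    for row, value in enumerate(column):
        if value <= threshold:
            bits |= 1 << row
    return bits


# Hypothetical data: two "workers", each holding a single feature column.
columns = {"f0": [1, 2, 3, 4, 5, 6], "f1": [5, 1, 4, 2, 6, 3]}
labels = [0, 0, 0, 1, 1, 1]

# Each worker proposes its best local split; only small summaries are compared.
proposals = {name: best_split(col, labels) for name, col in columns.items()}
winner = min(proposals, key=lambda name: proposals[name][0])
threshold = proposals[winner][1]

# Only the winning worker can evaluate the split, so it shares one bit per row.
bits = split_bitvector(columns[winner], threshold)
left_rows = [r for r in range(len(labels)) if bits >> r & 1]
print(winner, threshold, bin(bits), left_rows)
```

Every other worker can partition its own column using the shared bitvector alone, so the feature values themselves are not exchanged between workers.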

artificial intelligence, machine learning, yggdrasil, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

GPU Exploration of Two-Player Games with Perfect Hash Functions

Edelkamp, Stefan (University of Bremen) | Sulewski, Damian (University of Bremen) | Yücel, Cengizhan (Dortmund University of Technology)

AAAI Conferences | Aug-25-2010

In this paper we improve solving two-player games by computing the game-theoretical value of every reachable state. A graphics processing unit located on the graphics card is used as a co-processor to accelerate the solution process. We exploit perfect hash functions to store the game states efficiently in memory and to transfer their ordinal representation between the host and the graphics card. As an application we validate Gasser's results that Nine-Men-Morris is a draw on a personal computer. Moreover, our solution is strong, while for the opening phase Gasser only provided a weak solution.

coefficient, hash function, perfect hash function, (14 more...)

AAAI Conferences

Third Annual Symposium on Combinatorial Search

Country:

North America > Canada > Alberta (0.14)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Europe > Germany > Bremen > Bremen (0.04)

Industry: Leisure & Entertainment > Games (0.35)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Perfect Hashing for State Space Exploration on the GPU

Edelkamp, Stefan (TZI, Universität Bremen) | Sulewski, Damian (TZI, Universität Bremen) | Yücel, Cengizhan (Technische Universität Dortmund)

AAAI Conferences | May-1-2010

This paper exploits the parallel computing power of graphics cards to accelerate state space search. We illustrate that modern graphics processing units (GPUs) have the potential to speed up breadth-first search significantly. For a bitvector representation of the search frontier, GPU algorithms with one and two bits per state are presented. Efficient perfect hash functions and their inverses are explored in order to achieve enhanced compression. We report maximal speed-ups of up to a factor of 27 with respect to single-core CPU computation.
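A CPU-only sketch can show how a perfect hash and a bitvector frontier fit together in breadth-first search. It is not the paper's GPU algorithm or its one-bit and two-bit encodings. The state space (permutations under adjacent swaps), the lexicographic rank and unrank functions used as the perfect hash and its inverse, and the use of two separate bitvectors (visited and frontier) are all assumptions chosen for brevity.

```python
from math import factorial


def rank(perm):
    """Perfect hash: lexicographic index of a permutation of 0..n-1."""
    n = len(perm)
    items = sorted(perm)
    r = 0
    for i, p in enumerate(perm):
        idx = items.index(p)
        r += idx * factorial(n - 1 - i)
        items.pop(idx)
    return r


def unrank(r, n):
    """Inverse of rank: rebuild the permutation from its index."""
    items = list(range(n))
    perm = []
    for i in range(n):
        idx, r = divmod(r, factorial(n - 1 - i))
        perm.append(items.pop(idx))
    return tuple(perm)


def successors(perm):
    """Hypothetical move set: swap any two adjacent elements."""
    for i in range(len(perm) - 1):
        q = list(perm)
        q[i], q[i + 1] = q[i + 1], q[i]
        yield tuple(q)


def get_bit(bits, i):
    return bits[i >> 3] >> (i & 7) & 1


def set_bit(bits, i):
    bits[i >> 3] |= 1 << (i & 7)


def bfs_layers(n):
    """Breadth-first search storing only bits, indexed by the perfect hash."""
    size = factorial(n)
    nbytes = (size + 7) // 8
    visited = bytearray(nbytes)
    frontier = bytearray(nbytes)
    start = rank(tuple(range(n)))
    set_bit(visited, start)
    set_bit(frontier, start)
    layers = []
    while True:
        count = 0
        next_frontier = bytearray(nbytes)
        for i in range(size):  # scan the frontier bitvector
            if get_bit(frontier, i):
                count += 1
                for succ in successors(unrank(i, n)):
                    j = rank(succ)
                    if not get_bit(visited, j):
                        set_bit(visited, j)
                        set_bit(next_frontier, j)
        if count == 0:
            return layers
        layers.append(count)
        frontier = next_frontier


print(bfs_layers(4))
```

No state is ever stored explicitly: the search keeps two bits per reachable index and recovers states on demand through `unrank`, which is why an efficient inverse of the hash matters.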

algorithm, hash function, perfect hash function, (15 more...)

AAAI Conferences

Twentieth International Conference on Automated Planning and Scheduling

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
