AITopics | large-scale training

Collaborating Authors

large-scale training

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantisation.

Neural Information Processing SystemsDec-26-2025, 15:40:45 GMT

Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods often assume weights are treated solely based on value, neglecting the unique role of weight position. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster center and to what degree based on their individual position-specific learned uncertainty distributions. We introduce a new initialization setting and a regularization term, enabling the training of BNNs with complex dataset-model combinations. Leveraging the flexibility of weight values from probability distributions, we enhance noise resilience and compressibility. Our iterative clustering procedure demonstrates superior compressibility and higher accuracy compared to state-of-the-art methods on both ResNet models and the more complex transformer-based architectures. In particular, our method outperforms the state-of-the-art quantization method top-1 accuracy by 1.6\% on ImageNet using DeiT-Tiny, with its 5 million+ weights now represented by only 296 unique values.

large-scale training, neural network weight uncertainty, probabilistic weight fixing, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts

Lee, Sang-Woo, Yang, Sohee, Kwak, Donghyun, Siegel, Noah Y.

arXiv.org Artificial IntelligenceJul-28-2025

Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performance, and reinforcement learning has also been reported to improve future forecasting. Additionally, the unprecedented success of recent reasoning models and Deep Research-style models suggests that technology capable of greatly improving forecasting performance has been developed. Therefore, based on these positive recent trends, we argue that the time is ripe for research on large-scale training of superforecaster-level event forecasting LLMs. We discuss two key research directions: training methods and data acquisition. For training, we first introduce three difficulties of LLM-based event forecasting training: noisiness-sparsity, knowledge cut-off, and simple reward structure problems. Then, we present related ideas to mitigate these problems: hypothetical event Bayesian networks, utilizing poorly-recalled and counterfactual events, and auxiliary reward signals. For data, we propose aggressive use of market, public, and crawling datasets to enable large-scale training and evaluation. Finally, we explain how these technical advances could enable AI to provide predictive intelligence to society in broader areas. This position paper presents promising specific paths and considerations for getting closer to superforecaster-level AI technology, aiming to call for researchers' interest in these directions.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.19477

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.67)

Industry:

Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
Leisure & Entertainment (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantisation.

Neural Information Processing SystemsJan-19-2025, 20:44:47 GMT

Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods often assume weights are treated solely based on value, neglecting the unique role of weight position. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster center and to what degree based on their individual position-specific learned uncertainty distributions. We introduce a new initialization setting and a regularization term, enabling the training of BNNs with complex dataset-model combinations. Leveraging the flexibility of weight values from probability distributions, we enhance noise resilience and compressibility.

artificial intelligence, machine learning, neural network weight uncertainty, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

A Semantic Partitioning Method for Large-Scale Training of Knowledge Graph Embeddings

Bai, Yuhe

arXiv.org Artificial IntelligenceJan-8-2025

In recent years, knowledge graph embeddings have achieved great success. Many methods have been proposed and achieved state-of-the-art results in various tasks. However, most of the current methods present one or more of the following problems: (i) They only consider fact triplets, while ignoring the ontology information of knowledge graphs. (ii) The obtained embeddings do not contain much semantic information. Therefore, using these embeddings for semantic tasks is problematic. (iii) They do not enable large-scale training. In this paper, we propose a new algorithm that incorporates the ontology of knowledge graphs and partitions the knowledge graph based on classes to include more semantic information for parallel training of large-scale knowledge graph embeddings. Our preliminary results show that our algorithm performs well on several popular benchmarks.

artificial intelligence, knowledge graph, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3543873.358753

2501.04613

Country:

Europe (0.28)
North America > United States (0.16)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization

Subia-Waud, Christopher, Dasmahapatra, Srinandan

arXiv.org Artificial IntelligenceOct-3-2023

Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods for weight-sharing quantization often make assumptions about the treatment of weights based on value alone that neglect the unique role weight position plays. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster centre and to what degree based on their individual position-specific learned uncertainty distributions. We introduce a new initialisation setting and a regularisation term which allow for the training of BNNs under complex dataset-model combinations. By leveraging the flexibility of weight values captured through a probability distribution, we enhance noise resilience and downstream compressibility. Our iterative clustering procedure demonstrates superior compressibility and higher accuracy compared to state-of-the-art methods on both ResNet models and the more complex transformer-based architectures. In particular, our method outperforms the state-of-the-art quantization method top-1 accuracy by 1.6% on ImageNet using DeiT-Tiny, with its 5 million+ weights now represented by only 296 unique values.

large-scale training, neural network weight uncertainty, probabilistic weight fixing, (1 more...)

arXiv.org Artificial Intelligence

2309.13575

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

CPU vs GPU and its use in Machine Learning

#artificialintelligenceFeb-10-2023, 23:15:12 GMT

Speed: GPUs have a high number of cores, which makes them well-suited for parallel processing tasks such as matrix operations. This makes them faster for certain types of machine learning tasks, such as training deep neural networks. Cost-effectiveness: Training large machine learning models can require a lot of computational resources, and using GPUs can be more cost-effective than using CPUs for these tasks, as they can process large amounts of data much faster. Large-scale training: Training deep neural networks requires a lot of data and computational power, which makes GPUs ideal for this type of work. By using GPUs, researchers and practitioners can train much larger and more complex models than they would be able to with CPUs alone.

deep neural network, gpus, machine learning, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback