AITopics | pmatrix

Collaborating Authors

pmatrix

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration

Gao, Junqi, Guo, Zhichang, Zhang, Dazhi, Li, Dong, Liu, Runze, Li, Pengfei, Tian, Kai, Qi, Biqing

arXiv.org Artificial IntelligenceOct-30-2025

Heterogeneous Large Language Model (LLM) fusion integrates the strengths of multiple source LLMs with different architectures into a target LLM with low computational overhead. While promising, existing methods suffer from two major limitations: 1) reliance on real data from limited domain for knowledge fusion, preventing the target LLM from fully acquiring knowledge across diverse domains, and 2) fixed data allocation proportions across domains, failing to dynamically adjust according to the target LLM's varying capabilities across domains, leading to a capability imbalance. To overcome these limitations, we propose Bohdi, a synthetic-data-only heterogeneous LLM fusion framework. Through the organization of knowledge domains into a hierarchical tree structure, Bohdi enables automatic domain exploration and multi-domain data generation through multi-model collaboration, thereby comprehensively extracting knowledge from source LLMs. By formalizing domain expansion and data sampling proportion allocation on the knowledge tree as a Hierarchical Multi-Armed Bandit problem, Bohdi leverages the designed DynaBranches mechanism to adaptively adjust sampling proportions based on the target LLM's performance feedback across domains. Integrated with our proposed Introspection-Rebirth (IR) mechanism, DynaBranches dynamically tracks capability shifts during target LLM's updates via Sliding Window Binomial Likelihood Ratio Testing (SWBLRT), further enhancing its online adaptation capability. Comparative experimental results on a comprehensive suite of benchmarks demonstrate that Bohdi significantly outperforms existing baselines on multiple target LLMs, exhibits higher data efficiency, and virtually eliminates the imbalance in the target LLM's capabilities. Our code is available at https://github.com/gjq100/Bohdi.git.

artificial intelligence, large language model, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.15721

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Law (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models

Xiong, Xuyuan, Han, Simeng, Zhou, Ziyue, Cohan, Arman

arXiv.org Artificial IntelligenceNov-1-2024

Large Language Models (LLMs) are commonly used to generate solutions for mathematical reasoning problems in the following formats: natural language, code, or a combination of both. In this paper, we explore fundamental questions related to solving mathematical reasoning problems using natural language and code with state-of-the-art LLMs, including GPT-4o-mini and LLama-3.1-8b-Turbo. Our findings show that LLMs are better at reasoning in natural language compared to code. Additionally, although natural language and code serve as complementary forms of reasoning, they can affect each other in a negative way in certain scenarios. These insights motivate our development of a new prompting method, INC-Math, which leverages an LLM to dynamically select the most appropriate reasoning form, resulting in improved performance over comparable baselines with GPT-4o-mini.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2409.19381

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

Huang, Yiming, Liu, Xiao, Gong, Yeyun, Gou, Zhibin, Shen, Yelong, Duan, Nan, Chen, Weizhu

arXiv.org Artificial IntelligenceMay-7-2024

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality and reasoning-focused training datasets. Addressing this challenge, we propose Key-Point-Driven Data Synthesis (KPDDS), a novel data synthesis framework that synthesizes question-answer pairs by leveraging key points and exemplar practices from authentic data sources. KPDDS ensures the generation of novel questions with rigorous quality control and substantial scalability. As a result, we present KPMath, an extensive synthetic dataset tailored for mathematical reasoning, comprising over 800K question-answer pairs. Utilizing KPMath and augmenting it with additional reasoning-intensive corpora, we create the comprehensive KPMath-Plus dataset. The Qwen1.5-72B model, fine-tuned on KPMath-Plus, achieves 87.0%

arxiv preprint arxiv, dataset, pmatrix, (14 more...)

arXiv.org Artificial Intelligence

2403.02333

Country: Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report (0.82)

Industry:

Education > Curriculum > Subject-Specific Education (0.48)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Learning From Scratch I: Computational Graphs - deep ideas

@machinelearnbotDec-28-2017, 23:05:30 GMT

This is part 1 of a series of tutorials, in which we develop the mathematical and algorithmic underpinnings of deep neural networks from scratch and implement our own neural network library in Python, mimicing the TensorFlow API. I do not assume that you have any preknowledge about machine learning or neural networks. However, you should have some preknowledge of calculus, linear algebra, fundamental algorithms and probability theory on an undergraduate level. If you get stuck at some point, please leave a comment. By the end of this text, you will have a deep understanding of the math behind neural networks and how deep learning libraries work under the hood.

artificial intelligence, machine learning, opération, (12 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

Neural Networks Tutorial - A Pathway to Deep Learning - Adventures in Machine Learning

#artificialintelligenceApr-20-2017, 23:08:45 GMT

Chances are, if you are searching for a tutorial on artificial neural networks (ANN) you already have some idea of what they are, and what they are capable of doing. But did you know that neural networks are the foundation of the new and exciting field of deep learning? Deep learning is the field of machine learning that is making many state-of-the-art advancements, from beating players at Go and Poker, to speeding up drug discovery and assisting self-driving cars. If these types of cutting edge applications excite you like they excite me, then you will be interesting in learning as much as you can about deep learning. However, that requires you to know quite a bit about how neural networks work. This tutorial article is designed to help you get up to speed in neural networks as quickly as possible. In this tutorial I'll be presenting some concepts, code and maths that will enable you to build and understand a simple neural network. Some tutorials focus only on the code and skip the maths – but this impedes understanding. I'll take things as slowly as possible, but it might help to brush up on your matrices and differentiation if you need to. The code will be in Python, so it will be beneficial if you have a basic understanding of how Python works. You'll pretty much get away with knowing about Python functions, loops and the basics of the numpy library. By the end of this neural networks tutorial you'll be able to build an ANN in Python that will correctly classify handwritten digits in images with a fair degree of accuracy. Once you're done with this tutorial, you can dive a little deeper with the following posts: All of the relevant code in this tutorial can be found here. Here's an outline of the tutorial, with links, so you can easily navigate to the parts you want: Artificial neural networks (ANNs) are software implementations of the neuronal structure of our brains. We don't need to talk about the complex biology of our brain structures, but suffice to say, the brain contains neurons which are kind of like organic switches. These can change their output state depending on the strength of their electrical or chemical input. The neural network in a person's brain is a hugely interconnected network of neurons, where the output of any given neuron may be the input to thousands of other neurons.

artificial intelligence, machine learning, neural network, (19 more...)

#artificialintelligence

Country: Africa > Nigeria (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.68)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback