AITopics | concatenate

Collaborating Authors

concatenate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

f8290ccc2905538be1a7f7914ccef629-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 22:33:03 GMT

dataset, representation, video, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Anhui Province > Hefei (0.04)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Numerical influence of ReLU'(0) on backpropagation: Supplementary Material

Neural Information Processing Systems | Feb-7-2026, 07:28:07 GMT

It can be inferred from Definition 1 that all elements in the definition of a ReLU network training problem are piecewise smooth, where each piece is an elementary log-exp function. We refer the reader to [30] for an introduction to piecewise smoothness and to [8] for recent use of such notions in the context of algorithmic differentiation. Let us first argue that the results of [8] apply to Definition 1. This is Theorem 2 for s ∈ [0, T]; note that a similar probabilistic argument was developed in [6]. Consider any fully connected ReLU network architecture of depth H, with the softmax function applied on the last layer.

artificial intelligence, experiment, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe > France > Occitanie > Haute-Garonne > Toulouse (0.06)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Route-and-Reason: Scaling Large Language Model Reasoning with Reinforced Model Router

Shao, Chenyang, Liu, Xinyang, Lin, Yutang, Xu, Fengli, Li, Yong

arXiv.org Artificial Intelligence | Dec-5-2025

Chain-of-thought has been proven essential for enhancing the complex reasoning abilities of Large Language Models (LLMs), but it also leads to high computational costs. Recent advances have explored the method to route queries among multiple models and proved it as a promising approach. However, previous works directly operate at the task level, i.e., assigning user queries to suitable LLMs, which does not allow hybrid LLMs to truly collaborate on finer-grained sub-tasks. Collaboration at the level of intermediate reasoning steps (thoughts) could enable more efficient coordination, but it also poses significant challenges for router scheduling, placing immense demands on the quality of task decomposition and the precision of the router. To address this, we propose R2-Reasoner, a novel framework centered around a Reinforced Model Router designed to efficiently scale LLM reasoning. This router orchestrates collaboration across nine heterogeneous models, whose parameter scales range from less than 1B to hundreds of billions, by first breaking down a complex query into subtasks with a decomposer, and then assigning each subtask to the optimal model with a subtask allocator, balancing performance with cost. Training this router involves a two-stage alternating process for the decomposer and the allocator, integrating supervised fine-tuning with reinforcement learning to enable effective self-supervised refinement. Extensive experiments across six challenging reasoning benchmarks demonstrate that R2-Reasoner reduces API costs by 84.46% compared with state-of-the-art baselines while maintaining competitive reasoning accuracy. Our framework paves the way for the development of more scalable and efficient reasoning systems. Our code is open-source at https://anonymous.4open.science/r/R2_Reasoner.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.05901

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

f8290ccc2905538be1a7f7914ccef629-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-19-2025, 20:57:39 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Anhui Province > Hefei (0.04)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.96)

Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

Zou, Bob Junyi, Tian, Lu

arXiv.org Machine Learning | May-27-2025

Hybrid neural ordinary differential equations (neural ODEs) integrate mechanistic models with neural ODEs, offering strong inductive bias and flexibility, and are particularly advantageous in data-scarce healthcare settings. However, excessive latent states and interactions from mechanistic models can lead to training inefficiency and over-fitting, limiting practical effectiveness of hybrid neural ODEs. In response, we propose a new hybrid pipeline for automatic state selection and structure optimization in mechanistic neural ODEs, combining domain-informed graph modifications with data-driven regularization to sparsify the model for improving predictive performance and stability while retaining mechanistic plausibility. Experiments on synthetic and real-world data show improved predictive performance and robustness with desired sparsity, establishing an effective solution for hybrid model reduction in healthcare applications.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Machine Learning

2505.18996

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)

Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning

Hernández-Gutiérrez, Sergio, Alakuijala, Minttu, Nikitin, Alexander V., Marttinen, Pekka

arXiv.org Artificial Intelligence | May-6-2025

Reasoning tasks are crucial in many domains, especially in science and engineering. Although large language models (LLMs) have made progress in reasoning tasks using techniques such as chain-of-thought and least-to-most prompting, these approaches still do not effectively scale to complex problems in either their performance or execution time. Moreover, they often require additional supervision for each new task, such as in-context examples. In this work, we introduce Recursive Decomposition with Dependencies (RDD), a scalable divide-and-conquer method for solving reasoning problems that requires less supervision than prior approaches. Our method can be directly applied to a new problem class even in the absence of any task-specific guidance. Furthermore, RDD supports sub-task dependencies, allowing for ordered execution of sub-tasks, as well as an error recovery mechanism that can correct mistakes made in previous steps. We evaluate our approach on two benchmarks with six difficulty levels each and in two in-context settings: one with task-specific examples and one without. Our results demonstrate that RDD outperforms other methods in a compute-matched setting as task complexity increases, while also being more computationally efficient.

concatenate, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2505.02576

Country:

Europe (0.68)
North America (0.68)
Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations

Feng, Zhangchi, Kuang, Dongdong, Wang, Zhongyuan, Nie, Zhijie, Zheng, Yaowei, Zhang, Richong

arXiv.org Artificial Intelligence | Oct-14-2024

This paper presents EasyRAG, a simple, lightweight, and efficient retrieval-augmented generation framework for automated network operations. Our framework has three advantages. The first is accurate question answering. We designed a straightforward RAG scheme based on (1) a specific data processing workflow, (2) dual-route sparse retrieval for coarse ranking, (3) LLM Reranker for reranking, and (4) LLM answer generation and optimization. This approach achieved first place in the GLM4 track in the preliminary round and second place in the GLM4 track in the semifinals. The second is simple deployment. Our method primarily consists of BM25 retrieval and BGE-reranker reranking, requiring no fine-tuning of any models, occupying minimal VRAM, easy to deploy, and highly scalable; we provide a flexible code library with various search and generation strategies, facilitating custom process implementation. The last one is efficient inference. We designed an efficient inference acceleration scheme for the entire coarse ranking, reranking, and generation process that significantly reduces the inference latency of RAG while maintaining a good level of accuracy; each acceleration scheme can be plug-and-play into any component of the RAG process, consistently enhancing the efficiency of the RAG system. Our code and data are released at https://github.com/BUAADreamer/EasyRAG.
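The coarse-ranking-then-reranking idea in the abstract above can be illustrated with a minimal sketch. It is not the EasyRAG code: it assumes a single BM25 route with a whitespace tokenizer, and the names `bm25_scores`, `retrieve_then_rerank`, and `overlap_reranker` are hypothetical. The reranker is a simple term-overlap stand-in for the LLM or BGE reranker the abstract mentions, and the documents are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a placeholder for a real tokenizer."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve_then_rerank(query, docs, rerank_fn, coarse_k=3, final_k=1):
    """Coarse-rank with BM25, then reorder the survivors with rerank_fn."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    coarse = order[:coarse_k]
    reranked = sorted(coarse, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:final_k]]


def overlap_reranker(query, doc):
    """Hypothetical stand-in for a learned reranker: query-term overlap ratio."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q)


# Invented toy corpus, for illustration only.
docs = [
    "restart the router to clear the routing table",
    "the firewall drops packets on port 22",
    "routing table overflow causes packet loss on the router",
    "weather report for tuesday",
]
scores = bm25_scores("router routing table overflow", docs)
top = retrieve_then_rerank("router routing table overflow", docs, overlap_reranker)
```

Splitting the work this way means the cheap lexical scorer sees every document, while the more expensive reranker only sees the `coarse_k` survivors.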

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.10315

Country:

Asia > Singapore (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification

Yan, Hui, Lei, Zhenchun, Liu, Changhong, Zhou, Yong

arXiv.org Artificial Intelligence | Jul-3-2024

With the development of deep learning, many different network architectures have been explored in speaker verification. However, most network architectures rely on a single deep learning architecture, and hybrid networks combining different architectures have been little studied in ASV tasks. In this paper, we propose the GMM-ResNext model for speaker verification. Conventional GMM does not consider the score distribution of each frame feature over all Gaussian components and ignores the relationship between neighboring speech frames. So, we extract the log Gaussian probability features based on the raw acoustic features and use a ResNext-based network as the backbone to extract the speaker embedding. GMM-ResNext combines generative and discriminative models to improve the generalization ability of deep learning models and allows one to more easily specify meaningful priors on model parameters. A two-path GMM-ResNext model based on two gender-related GMMs has also been proposed. The experimental results show that the proposed GMM-ResNext achieves relative improvements of 48.1% and 11.3% in EER compared with ResNet34 and ECAPA-TDNN on the VoxCeleb1-O test set.

architecture, gmm-resnext, speaker verification, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICASSP48485.2024.10447141

2407.03135

Country: Asia > China > Jiangxi Province > Nanchang (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
