AITopics | automix

ecda225cb187b40ea8edc1f46b03ffda-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 14:45:42 GMT

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
Asia > China (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
(2 more...)

Add feedback

AutoMix: Automatically Mixing Language Models

Neural Information Processing SystemsDec-27-2025, 12:21:35 GMT

Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM. Central to AutoMix are two key technical contributions. First, it has a few-shot self-verification mechanism, which estimates the reliability of its own outputs without requiring extensive training. Second, given that self-verification can be noisy, it employs a POMDP based router that can effectively select an appropriately sized model, based on answer confidence. Experiments across five language models and five challenging datasets show that Automix consistently surpasses strong baselines, reducing computational cost by over 50\% for comparable performance.

artificial intelligence, large language model, natural language, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)

Add feedback

AutoMix: Automatically Mixing Language Models

Neural Information Processing SystemsOct-11-2025, 00:46:54 GMT

Large language models (LLMs) are now available from cloud API providers in various sizes and configurations.

automix, dataset, language model, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
Asia > China (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Add feedback

AutoMix: Automatically Mixing Language Models

Neural Information Processing SystemsMay-27-2025, 20:35:12 GMT

Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM. Central to AutoMix are two key technical contributions. First, it has a few-shot self-verification mechanism, which estimates the reliability of its own outputs without requiring extensive training.

automatically mixing language model, automix, computational cost

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)

Add feedback

AutoMix: Automatically Mixing Language Models

Madaan, Aman, Aggarwal, Pranjal, Anand, Ankit, Potharaju, Srividya Pranavi, Mishra, Swaroop, Zhou, Pei, Gupta, Aditya, Rajagopal, Dheeraj, Kappaganthu, Karthik, Yang, Yiming, Upadhyay, Shyam, Mausam, null, Faruqui, Manaal

arXiv.org Artificial IntelligenceNov-15-2023

Large language models (LLMs) are now available in various sizes and configurations from cloud API providers. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM. Central to AutoMix is a few-shot self-verification mechanism, which estimates the reliability of its own outputs without requiring training. Given that verifications can be noisy, we employ a meta verifier in AutoMix to refine the accuracy of these assessments. Our experiments using LLAMA2-13/70B, on five context-grounded reasoning datasets demonstrate that AutoMix surpasses established baselines, improving the incremental benefit per cost by up to 89%. Our code and data are available at https://github.com/automix-llm/automix.

language model, llm, verifier, (14 more...)

arXiv.org Artificial Intelligence

2310.12963

Country:

North America > United States > California (0.14)
Asia > China (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Add feedback

AutoMix: Unveiling the Power of Mixup

#artificialintelligenceMar-25-2021, 00:15:55 GMT

Mixup-based data augmentation has achieved great success as regularizer for deep neural networks. However, existing mixup methods require explicitly designed mixup policies. In this paper, we present a flexible, general Automatic Mixup (AutoMix) framework which utilizes discriminative features to learn a sample mixing policy adaptively. We regard mixup as a pretext task and split it into two sub-problems: mixed samples generation and mixup classification. To this end, we design a lightweight mix block to generate synthetic samples based on feature maps and mix labels. Since the two sub-problems are in the nature of Expectation-Maximization (EM), we also propose a momentum training pipeline to optimize the mixup process and mixup classification process alternatively in an end-to-end fashion. Extensive experiments on six popular classification benchmarks show that AutoMix consistently outperforms other leading mixup methods and improves generalization abilities to downstream tasks. We hope AutoMix will motivate the community to rethink the role of mixup in representation learning. The code will be released soon.

automix, mixup, unveiling

#artificialintelligence

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Add feedback

AutoMix: Unveiling the Power of Mixup

Liu, Zicheng, Li, Siyuan, Wu, Di, Chen, Zhiyuan, Wu, Lirong, Guo, Jianzhu, Li, Stan Z.

arXiv.org Artificial IntelligenceMar-24-2021

Mixup-based data augmentation has achieved great success as regularizer for deep neural networks. However, existing mixup methods require explicitly designed mixup policies. In this paper, we present a flexible, general Automatic Mixup (AutoMix) framework which utilizes discriminative features to learn a sample mixing policy adaptively. We regard mixup as a pretext task and split it into two sub-problems: mixed samples generation and mixup classification. To this end, we design a lightweight mix block to generate synthetic samples based on feature maps and mix labels. Since the two sub-problems are in the nature of Expectation-Maximization (EM), we also propose a momentum training pipeline to optimize the mixup process and mixup classification process alternatively in an end-to-end fashion. Extensive experiments on six popular classification benchmarks show that AutoMix consistently outperforms other leading mixup methods and improves generalization abilities to downstream tasks. We hope AutoMix will motivate the community to rethink the role of mixup in representation learning. The code will be released soon.

automix, dataset, experiment, (14 more...)

arXiv.org Artificial Intelligence

2103.13027

Country: