AITopics | base hypothesis

SnapBoost: A Heterogeneous Boosting Machine

Neural Information Processing Systems | Feb-9-2026, 03:33:04 GMT

Moreover, both frameworks are homogeneous: the hypothesis class is fixed at each boosting iteration.

artificial intelligence, hypothesis, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.19)
North America > United States (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

SnapBoost: A Heterogeneous Boosting Machine Thomas Parnell

Neural Information Processing Systems | Oct-3-2025, 09:25:45 GMT

We note that while the subclasses used in practice (e.g., trees) may well be infinite beyond a simple ... Our proposed method for solving this optimization problem is presented in full in Algorithm 1. The supplemental material contains example code for Algorithm 1 that uses generic scikit-learn regressors.

hypothesis, iteration, snapboost, (16 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.17)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada (0.14)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)

Industry:

Information Technology (0.69)
Banking & Finance (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Lifelong Learning with Weighted Majority Votes

Neural Information Processing Systems | Mar-12-2024, 19:30:01 GMT

Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network.

artificial intelligence, hypothesis, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(2 more...)

Industry: Education > Educational Setting > Continuing Education (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data

Akhtar, Mubashara, Shankarampeta, Abhilash, Gupta, Vivek, Patil, Arpit, Cocarascu, Oana, Simperl, Elena

arXiv.org Artificial Intelligence | Nov-3-2023

Numbers are crucial for various real-world domains such as finance, economics, and science. Thus, understanding and reasoning with numbers are essential skills for language models to solve different tasks. While different numerical benchmarks have been introduced in recent years, they are limited to specific numerical aspects mostly. In this paper, we propose a hierarchical taxonomy for numerical reasoning skills with more than ten reasoning types across four levels: representation, number sense, manipulation, and complex reasoning. We conduct a comprehensive evaluation of state-of-the-art models to identify reasoning challenges specific to them. Henceforth, we develop a diverse set of numerical probes employing a semi-automated approach. We focus on the tabular Natural Language Inference (TNLI) task as a case study and measure models' performance shifts. Our results show that no model consistently excels across all numerical reasoning types. Among the probed models, FlanT5 (few-/zero-shot) and GPT-3.5 (few-shot) demonstrate strong overall numerical reasoning skills compared to other models. Label-flipping probes indicate that models often exploit dataset artifacts to predict the correct labels.

computational linguistic, linguistic, probe, (16 more...)

arXiv.org Artificial Intelligence

2311.02216

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > Dominican Republic (0.04)
(12 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Boosting Algorithms for Maximizing the Soft Margin

Neural Information Processing Systems | Apr-6-2023, 14:51:29 GMT

We present a novel boosting algorithm, called SoftBoost, designed for sets of binary labeled examples that are not necessarily separable by convex combinations of base hypotheses. Our algorithm achieves robustness by capping the distributions on the examples. Our update of the distribution is motivated by minimizing a relative entropy subject to the capping constraints and constraints on the edges of the obtained base hypotheses. The capping constraints imply a soft margin in the dual optimization problem. Our algorithm produces a convex combination of hypotheses whose soft margin is within δ of its maximum.

algorithm, maximizing, soft margin, (7 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.09)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

SnapBoost: A Heterogeneous Boosting Machine

Parnell, Thomas, Anghel, Andreea, Lazuka, Malgorzata, Ioannou, Nikolas, Kurella, Sebastian, Agarwal, Peshal, Papandreou, Nikolaos, Pozidis, Haralampos

arXiv.org Machine Learning | Sep-25-2020

Modern gradient boosting software frameworks, such as XGBoost and LightGBM, implement Newton descent in a functional space. At each boosting iteration, their goal is to find the base hypothesis, selected from some base hypothesis class, that is closest to the Newton descent direction in a Euclidean sense. Typically, the base hypothesis class is fixed to be all binary decision trees up to a given depth. In this work, we study a Heterogeneous Newton Boosting Machine (HNBM) in which the base hypothesis class may vary across boosting iterations. Specifically, at each boosting iteration, the base hypothesis class is chosen, from a fixed set of subclasses, by sampling from a probability distribution. We derive a global linear convergence rate for the HNBM under certain assumptions, and show that it agrees with existing rates for Newton's method when the Newton direction can be perfectly fitted by the base hypothesis at each boosting iteration. We then describe a particular realization of a HNBM, SnapBoost, that, at each boosting iteration, randomly selects between either a decision tree of variable depth or a linear regressor with random Fourier features. We describe how SnapBoost is implemented, with a focus on the training complexity. Finally, we present experimental results, using OpenML and Kaggle datasets, that show that SnapBoost is able to achieve better generalization loss than competing boosting frameworks, without taking significantly longer to tune.
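The heterogeneous scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' SnapBoost implementation. It assumes squared loss (so the Newton direction is simply the residual), one-dimensional inputs, and two toy subclasses, a depth-1 regression tree and a linear regressor, standing in for the variable-depth trees and random Fourier features named above. The names `fit_stump`, `fit_linear`, `heterogeneous_boost`, and the parameter `p_stump` are hypothetical.

```python
import random


def fit_stump(xs, targets):
    """Least-squares depth-1 regression tree on a single feature."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        if not right:  # splitting at the largest value leaves one side empty
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((t - lmean) ** 2 for t in left)
               + sum((t - rmean) ** 2 for t in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    if best is None:  # all inputs identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean


def fit_linear(xs, targets):
    """Ordinary least-squares line (slope and intercept) on a single feature."""
    n = len(xs)
    xbar = sum(xs) / n
    tbar = sum(targets) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxt = sum((x - xbar) * (t - tbar) for x, t in zip(xs, targets))
    slope = sxt / sxx if sxx > 0 else 0.0
    intercept = tbar - slope * xbar
    return lambda x: slope * x + intercept


def heterogeneous_boost(xs, ys, n_rounds=50, learning_rate=0.3,
                        p_stump=0.7, seed=0):
    """Boosting in which the base hypothesis class is sampled each round."""
    rng = random.Random(seed)
    ensemble = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        # Squared loss 0.5 * (f - y)^2: gradient f - y, Hessian 1,
        # so the Newton direction is the residual y - f.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Sample the subclass for this round from a fixed distribution.
        fit = fit_stump if rng.random() < p_stump else fit_linear
        h = fit(xs, residuals)
        ensemble.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]
    return lambda x: sum(learning_rate * h(x) for h in ensemble)


# Usage: a target with both a linear trend and a jump.
xs = [i / 10 for i in range(20)]
ys = [2 * x + (1.0 if x > 1.0 else 0.0) for x in xs]
model = heterogeneous_boost(xs, ys)
```

Fixing `p_stump` to 1.0 recovers a homogeneous booster that only ever fits stumps, which is the setting the abstract contrasts against.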

artificial intelligence, iteration, machine learning, (19 more...)

arXiv.org Machine Learning

2006.09745

Country:

Europe > Switzerland > Zürich > Zürich (0.16)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Optimal Minimal Margin Maximization with Boosting

Grønlund, Allan, Larsen, Kasper Green, Mathiasen, Alexander

arXiv.org Machine Learning | Jan-30-2019

Boosting algorithms produce a classifier by iteratively combining base hypotheses. It has been observed experimentally that the generalization error keeps improving even after achieving zero training error. One popular explanation attributes this to improvements in margins. A common goal in a long line of research is to maximize the smallest margin using as few base hypotheses as possible, culminating with the AdaBoostV algorithm by Rätsch and Warmuth [JMLR'04]. The AdaBoostV algorithm was later conjectured to yield an optimal trade-off between number of hypotheses trained and the minimal margin over all training points (Nie et al. [JMLR'13]). Our main contribution is a new algorithm refuting this conjecture. Furthermore, we prove a lower bound which implies that our new algorithm is optimal.
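The quantity at stake in this abstract, the minimal margin of a voting classifier, can be made concrete with a short sketch. It only evaluates the margin of a given weighted vote and is not the algorithm proposed in the paper. It assumes the usual convention that base hypotheses output labels in {-1, +1} and that margins are normalized by the total absolute weight. The function name `minimum_margin` and its argument layout are hypothetical.

```python
def minimum_margin(votes, labels, weights):
    """Smallest normalized margin of a weighted vote over the training set.

    votes[j][i] is the {-1, +1} prediction of base hypothesis j on example i,
    labels[i] is the true {-1, +1} label, weights[j] is the vote weight.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    margins = []
    for i, y in enumerate(labels):
        score = sum(w * h[i] for w, h in zip(weights, votes))
        # Positive margin: correctly classified; larger means more confident.
        margins.append(y * score / total)
    return min(margins)


# Usage: three hypotheses, each wrong on exactly one of three examples.
labels = [1, -1, 1]
votes = [
    [1, -1, -1],
    [1, 1, 1],
    [-1, -1, 1],
]
margin = minimum_margin(votes, labels, [1.0, 1.0, 1.0])
```

In this toy case every example is classified correctly by two of the three equally weighted hypotheses, so every margin, and hence the minimum, is 1/3.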

algorithm, classifier, hypothesis, (17 more...)

arXiv.org Machine Learning

1901.10789

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Lifelong Learning with Weighted Majority Votes

Pentina, Anastasia, Urner, Ruth

Neural Information Processing Systems | Dec-31-2016

Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network.

artificial intelligence, hypothesis, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > United States (0.28)

Genre: Instructional Material (0.64)

Industry: Education > Educational Setting > Continuing Education (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Functional Frank-Wolfe Boosting for General Loss Functions

Wang, Chu, Wang, Yingfei, E, Weinan, Schapire, Robert

arXiv.org Machine Learning | Oct-8-2015

Boosting is a generic learning method for classification and regression. Yet, as the number of base hypotheses becomes larger, boosting can lead to a deterioration of test performance. Overfitting is an important and ubiquitous phenomenon, especially in regression settings. To avoid overfitting, we consider using $l_1$ regularization. We propose a novel Frank-Wolfe type boosting algorithm (FWBoost) applied to general loss functions. By using exponential loss, the FWBoost algorithm can be rewritten as a variant of AdaBoost for binary classification. FWBoost algorithms have exactly the same form as existing boosting methods, in terms of making calls to a base learning algorithm with different weights update. This direct connection between boosting and Frank-Wolfe yields a new algorithm that is as practical as existing boosting methods but with new guarantees and rates of convergence. Experimental results show that the test performance of FWBoost is not degraded with larger rounds in boosting, which is consistent with the theoretical analysis.
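The link between $l_1$ regularization and Frank-Wolfe mentioned in this abstract can be sketched with the textbook Frank-Wolfe iteration over an $l_1$ ball. This is a generic illustration, not the FWBoost algorithm from the paper. It assumes squared loss, a small finite dictionary of base hypotheses given directly by their outputs on the training examples, and the standard step size 2/(t+2). The name `frank_wolfe_l1` and its parameters are hypothetical.

```python
def frank_wolfe_l1(hypothesis_outputs, targets, radius=1.0, n_rounds=200):
    """Minimize 0.5 * sum_i (f_i - y_i)^2 over f = sum_j a_j * h_j
    subject to sum_j |a_j| <= radius, using Frank-Wolfe.

    hypothesis_outputs[j][i] is the output of base hypothesis j on example i.
    Returns the coefficient vector a.
    """
    coeffs = [0.0] * len(hypothesis_outputs)
    preds = [0.0] * len(targets)
    for t in range(n_rounds):
        # Gradient of the loss with respect to the predictions.
        grad = [p - y for p, y in zip(preds, targets)]
        # Gradient with respect to coefficient j is <grad, h_j>.
        corr = [sum(g * v for g, v in zip(grad, h)) for h in hypothesis_outputs]
        # Linear minimization over the l1 ball picks a single signed vertex:
        # the base hypothesis most aligned with the negative gradient.
        j = max(range(len(corr)), key=lambda k: abs(corr[k]))
        sign = -1.0 if corr[j] > 0 else 1.0
        gamma = 2.0 / (t + 2.0)
        # Convex combination of the current iterate and the chosen vertex,
        # which keeps the coefficients inside the l1 ball.
        coeffs = [(1.0 - gamma) * a for a in coeffs]
        coeffs[j] += gamma * radius * sign
        preds = [(1.0 - gamma) * p + gamma * radius * sign * v
                 for p, v in zip(preds, hypothesis_outputs[j])]
    return coeffs


# Usage: targets equal 0.5 * h_0 - 0.3 * h_1, which lies inside the unit l1 ball.
hypotheses = [
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
]
targets = [0.2, 0.8, -0.8, -0.2]
coeffs = frank_wolfe_l1(hypotheses, targets)
```

Each round selects one base hypothesis and reweights the existing combination, which is the structural similarity to boosting that the abstract points to; the $l_1$ constraint is enforced by construction rather than by a penalty term.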

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1510.02558

Genre: Research Report > New Finding (0.48)

Technology: