A Boosting-Type Convergence Result for AdaBoost.MH with Factorized Multi-Class Classifiers
AdaBoost is a well-known boosting algorithm. Schapire and Singer propose an extension of AdaBoost, named AdaBoost.MH, for multi-class classification problems. Kégl shows empirically that AdaBoost.MH works better when the classical one-against-all base classifiers are replaced by factorized base classifiers consisting of a binary classifier and a vote (or code) vector. However, the factorization makes it considerably more difficult to prove convergence for the factorized version of AdaBoost.MH, and Kégl posed an open problem at COLT 2014 asking for such a convergence result. In this work, we resolve this open problem by presenting a convergence result for AdaBoost.MH with factorized multi-class classifiers.
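To make the factorization concrete, here is a minimal illustrative sketch (not the paper's implementation): a factorized base classifier pairs one binary classifier phi with a vote vector v, so the per-class output is simply v[l] * phi(x). All names and the toy data below are hypothetical.

```python
# Sketch of a factorized multi-class base classifier h_l(x) = v_l * phi(x):
# phi is a scalar binary classifier, v is a K-dimensional vote vector.

def make_factorized_classifier(phi, v):
    """phi: x -> {-1, +1}; v: list of K per-class votes in {-1, +1}."""
    def h(x):
        s = phi(x)                      # single binary decision
        return [v_l * s for v_l in v]   # broadcast to K class scores
    return h

# Toy example: phi fires on positive inputs; the vote vector favors class 0.
phi = lambda x: 1 if x > 0 else -1
h = make_factorized_classifier(phi, [1, -1, -1])
print(h(2.0))   # [1, -1, -1]
print(h(-3.0))  # [-1, 1, 1]
```

The point of the factorization is that learning splits into fitting one binary problem (phi) plus choosing how each class votes on its output (v), instead of training K independent one-against-all classifiers.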
Quality analysis and evaluation prediction of RAG retrieval based on machine learning algorithms
Zhang, Ruoxin, Wen, Zhizhao, Wang, Chao, Tang, Chenchen, Xu, Puyang, Jiang, Yifan
With the rapid evolution of large language models, retrieval-augmented generation (RAG) technology has been widely adopted for its ability to integrate external knowledge and improve output accuracy. However, system performance depends heavily on the quality of the retrieval module: if the retrieved results have low relevance to the user's needs or contain noisy information, the generated content will be distorted. To address the performance bottleneck of existing models on tabular features, this paper proposes an XGBoost machine learning regression model based on feature engineering and particle swarm optimization. Correlation analysis shows that answer_quality is positively correlated with doc_relevance (0.66), indicating that document relevance has a significant positive effect on answer quality and that improving it may enhance answer quality. Semantic similarity and redundancy are strongly negatively correlated with diversity (-0.89 and -0.88, respectively), indicating a trade-off: as the former two increase, diversity decreases significantly. Experiments comparing decision trees, AdaBoost, and other baselines show that the VMD-PSO-BiLSTM model is superior on all evaluation metrics, with significantly lower MSE, RMSE, MAE, and MAPE than the comparison models and a higher R², indicating better prediction accuracy, stability, and data interpretation ability. This work provides an effective path for optimizing retrieval quality and improving the generation quality of RAG systems, and has practical value for deploying the related technologies.
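The correlation analysis described above boils down to computing Pearson coefficients between feature columns. A hedged sketch with synthetic data (the real features such as doc_relevance and answer_quality would come from the RAG evaluation logs):

```python
# Pearson correlation between two feature columns (pure-Python sketch).
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic illustration: higher relevance tends to yield higher quality,
# mirroring the positive doc_relevance/answer_quality correlation reported.
doc_relevance  = [0.2, 0.4, 0.5, 0.7, 0.9]
answer_quality = [0.3, 0.35, 0.6, 0.65, 0.8]
print(round(pearson(doc_relevance, answer_quality), 2))
```

A coefficient near +1 here reproduces the qualitative finding; on real data the paper reports 0.66 for this pair.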
Tight Margin-Based Generalization Bounds for Voting Classifiers over Finite Hypothesis Sets
Larsen, Kasper Green, Schalburg, Natascha
Ensemble learning is a powerful machine learning tool: it enables us to transform weak learners (hypothesis classes that are barely better than random guessing) into learners with state-of-the-art performance. In essence, ensemble methods take a set of base classifiers, weigh those classifiers according to their performance on the training set, and obtain the final prediction by aggregating according to those weights. An important historical example is AdaBoost (Freund and Schapire [1997]), a type of voting classifier that builds the ensemble sequentially: new base classifiers are added to correct the mistakes of the current ensemble. AdaBoost was the first efficient and practical implementation of a boosting algorithm, and hence the relevance of ensemble learners is often attributed to it. Much theoretical research has been devoted to explaining the impressive practical performance of AdaBoost and other ensemble methods.
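The weigh-then-aggregate scheme described above can be sketched as a minimal AdaBoost on 1-D threshold stumps. This is an illustrative toy, not Freund and Schapire's full algorithm; the stump pool and data are made up.

```python
# Minimal AdaBoost sketch: sequentially add stumps, weight them by
# training error, and predict by a weighted majority vote.
import math

def stump(thresh, sign):
    return lambda x: sign if x > thresh else -sign

def adaboost(xs, ys, stumps, rounds):
    n = len(xs)
    w = [1.0 / n] * n                         # per-example weights
    ensemble = []                             # (alpha, stump) pairs
    for _ in range(rounds):
        # pick the stump with the lowest weighted training error
        errs = [sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y)
                for h in stumps]
        e, h = min(zip(errs, stumps), key=lambda p: p[0])
        e = max(e, 1e-10)                     # avoid log(0)
        alpha = 0.5 * math.log((1 - e) / e)   # classifier vote weight
        ensemble.append((alpha, h))
        # reweight: misclassified examples gain weight for the next round
        w = [wi * math.exp(-alpha * y * h(x))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    def predict(x):                           # weighted majority vote
        s = sum(a * h(x) for a, h in ensemble)
        return 1 if s >= 0 else -1
    return predict

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [-1, -1, -1, 1, 1, 1]
stumps = [stump(t + 0.5, 1) for t in range(5)]
clf = adaboost(xs, ys, stumps, rounds=3)
print([clf(x) for x in xs])  # [-1, -1, -1, 1, 1, 1]
```

The final prediction is exactly the voting-classifier form the bounds in this line of work analyze: a sign of an alpha-weighted sum of base-classifier outputs.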