 xgboost 1


Benchmarking Classical and Quantum Models for DeFi Yield Prediction on Curve Finance

arXiv.org Artificial Intelligence

The rise of decentralized finance (DeFi) has created a growing demand for accurate yield and performance forecasting to guide liquidity allocation strategies. In this study, we benchmark six models, XGBoost, Random Forest, LSTM, Transformer, quantum neural networks (QNN), and quantum support vector machines with quantum feature maps (QSVM-QNN), on one year of historical data from 28 Curve Finance pools. We evaluate model performance on test MAE, RMSE, and directional accuracy. Our results show that classical ensemble models, particularly XGBoost and Random Forest, consistently outperform both deep learning and quantum models. XGBoost achieves the highest directional accuracy (71.57%) with a test MAE of 1.80, while Random Forest attains the lowest test MAE of 1.77 and 71.36% accuracy. In contrast, quantum models underperform with directional accuracy below 50% and higher errors, highlighting current limitations in applying quantum machine learning to real-world DeFi time series data. This work offers a reproducible benchmark and practical insights into model suitability for DeFi applications, emphasizing the robustness of classical methods over emerging quantum approaches in this domain.
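The three evaluation metrics named above can be sketched in a few lines of plain Python. This is an illustration, not the paper's evaluation code: the abstract does not define directional accuracy, so the version here assumes it is the share of time steps where the predicted move (relative to the previous actual value) has the same direction as the actual move. The function names and sample values are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def directional_accuracy(y_true, y_pred):
    """Share of steps where prediction and outcome move the same way.

    Assumption (not from the source): a move is measured against the
    previous actual value, and "no change" counts as "not up".
    """
    hits = 0
    for prev, actual, pred in zip(y_true[:-1], y_true[1:], y_pred[1:]):
        actual_up = actual > prev
        pred_up = pred > prev
        hits += actual_up == pred_up
    return hits / (len(y_true) - 1)


# Made-up yield series, for illustration only.
y_true = [1.0, 2.0, 1.5, 3.0]
y_pred = [1.0, 1.8, 2.2, 2.5]

print(mae(y_true, y_pred))
print(rmse(y_true, y_pred))
print(directional_accuracy(y_true, y_pred))
```

A model can score well on MAE and RMSE while still missing the direction of most moves, which is presumably why the study reports directional accuracy alongside the two error metrics.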


Non-linear Phillips Curve for India: Evidence from Explainable Machine Learning

arXiv.org Artificial Intelligence

A foundational framework within the literature on inflation dynamics is the Phillips Curve (PC) model. The Phillips Curve posits a short-term trade-off between inflation and a measure of economic slack, typically proxied by the unemployment rate, such that higher inflation is associated with lower slack in the economy and vice versa. The earliest empirical validation of this relationship, based on wage inflation and the unemployment rate, was provided by Phillips (1958) for the United Kingdom. Since then, the Phillips Curve framework has undergone significant theoretical advancements, culminating in the development of the micro-founded New Keynesian Phillips Curve (NKPC) (Taylor, 1980; Calvo, 1983a; Gali and Gertler, 1999) as the workhorse model for inflation analysis. Despite its theoretical appeal, the practical application of the NKPC for inflation modelling and forecasting, particularly within central banks, has been fraught with challenges. These difficulties stem from structural breaks, state dependencies, and intrinsic nonlinearities in the relationship between inflation and its fundamental determinants, which undermine the model's empirical validity and predictive performance (see Cristini and Ferri, 2021).


Multiple Outputs -- xgboost 1.6.2 documentation

#artificialintelligence

Starting from version 1.6, XGBoost has experimental support for multi-output regression and multi-label classification in the Python package. Multi-label classification usually refers to targets that have multiple non-exclusive class labels. For instance, a movie can be classified as both sci-fi and comedy at the same time. For a detailed explanation of the terminology related to different multi-output models, please refer to the scikit-learn user guide. Internally, XGBoost builds one model for each target, similar to the sklearn meta estimators, with the added benefit of reusing data and other integrated features like SHAP.



The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction

arXiv.org Machine Learning

Accurate streamflow prediction relies largely on historical records of both meteorological data and streamflow measurements. For many regions around the world, however, such data are scarce or not available at all. To select an appropriate model for a region with a given amount of historical data, it is therefore indispensable to know a model's sensitivity to limited training data, both in terms of geographic diversity and the time span covered. In this study, we provide decision support for tree- and LSTM-based models. We feed the models meteorological measurements from the CAMELS dataset, and separately restrict the training period length and the number of basins used in training. Our findings show that tree-based models provide more accurate predictions on small datasets, while LSTMs are superior given sufficient training data. This is perhaps not surprising, as neural networks are known to be data-hungry; however, we are able to characterize each model's strengths under different conditions, including the "breakeven point" at which LSTMs begin to overtake tree-based models.
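The two restrictions described above (fewer basins, shorter training period) can be sketched as a simple filter over training records. This is a hypothetical illustration rather than the study's pipeline: the record layout, the function name `restrict_training_set`, the random choice of basins, and the decision to keep the earliest years are all assumptions.

```python
import random


def restrict_training_set(records, n_basins, n_years, seed=0):
    """Keep records from `n_basins` random basins and `n_years` years.

    `records` is a list of (basin_id, year, value) tuples; this layout
    is hypothetical and not the CAMELS file format.
    """
    basins = sorted({basin for basin, _, _ in records})
    rng = random.Random(seed)
    kept_basins = set(rng.sample(basins, n_basins))

    # Assumption: the training window starts at the first available year.
    first_year = min(year for _, year, _ in records)
    last_year = first_year + n_years - 1

    return [
        (basin, year, value)
        for basin, year, value in records
        if basin in kept_basins and first_year <= year <= last_year
    ]


# Toy data: 5 basins with 10 years each, one record per basin and year.
records = [
    (f"basin_{b}", 1990 + y, float(b + y))
    for b in range(5)
    for y in range(10)
]

subset = restrict_training_set(records, n_basins=2, n_years=3)
print(len(subset))
```

Sweeping `n_basins` and `n_years` over a grid and retraining each model on every subset is one way to locate the kind of breakeven point the abstract mentions.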