quantitative finance
Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis
Nishio, S., Nonaka, H., Tsuchiya, N., Migita, A., Banno, Y., Hayashi, T., Sakaji, H., Sakumoto, T., Watabe, K.
Machine learning is widely utilized across various industries. Identifying the appropriate machine learning models and datasets for specific tasks is crucial for the effective industrial application of machine learning. However, this requires expertise in both machine learning and the relevant domain, leading to a high learning cost. Therefore, research focused on extracting combinations of tasks, machine learning models, and datasets from academic papers is critically important, as it can facilitate the automatic recommendation of suitable methods. Conventional methods for extracting information from academic papers have been limited to identifying machine learning models and other entities as named entities. To address this issue, this study proposes a methodology for extracting tasks, machine learning methods, and dataset names from scientific papers and for analyzing the relationships among them using an LLM, an embedding model, and network clustering. When Llama3 is used, the proposed method's expression extraction performance achieves an F-score exceeding 0.8 across various categories, confirming its practical utility. Benchmarking results on financial domain papers demonstrate the effectiveness of this method and provide insights into the use of the latest datasets, including those related to ESG (Environmental, Social, and Governance) data.
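As a rough illustration of the evaluation metric mentioned above, the F-score of an extraction run can be computed from the sets of predicted and reference expressions. This is a minimal sketch assuming exact-match scoring; the function name `f_score`, the matching rule, and the example strings are hypothetical, and the paper's actual evaluation protocol may differ.

```python
def f_score(predicted, gold):
    """Exact-match precision, recall and F1 between two collections of strings."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical model names extracted from one paper versus a hand-labeled reference.
predicted_models = ["LSTM", "random forest", "BERT", "XGBoost"]
gold_models = ["LSTM", "random forest", "BERT", "SVM", "GRU"]

precision, recall, f1 = f_score(predicted_models, gold_models)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Here three of the four predicted names match the reference, so precision is 0.75 and recall is 0.60. Exact matching is the strictest choice; a partial-match rule would generally give higher scores.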
A time-stepping deep gradient flow method for option pricing in (rough) diffusion models
Papapantoleon, Antonis, Rou, Jasper
The option pricing partial differential equation is reformulated as an energy minimization problem, which is approximated in a time-stepping fashion by deep artificial neural networks. The proposed scheme respects the asymptotic behavior of option prices for large levels of moneyness, and adheres to a priori known bounds for option prices. The accuracy and efficiency of the proposed method are assessed in a series of numerical examples, with particular focus on the lifted Heston model. Stochastic volatility models have been popular in the mathematical finance literature because they make it possible to accurately model and reproduce the shape of implied volatility smiles for a single maturity. However, they require certain modifications, such as making the parameters time- or maturity-dependent, in order to reproduce a whole volatility surface; see e.g. the comprehensive books by Gatheral [25] or Bergomi [15]. The class of rough volatility models, in which the volatility process is driven by a fractional Brownian motion, offers an attractive alternative to classical volatility models, since these models can reproduce many stylized facts of asset and option prices with only a few (constant) parameters; see e.g. the seminal articles by Gatheral, Jaisson, and Rosenbaum [27] and Bayer, Friz, and Gatheral [9], and the recent volume by Bayer, Friz, Fukasawa, Gatheral, Jacquier, and Rosenbaum [13].
Finding Moving-Band Statistical Arbitrages via Convex-Concave Optimization
Johansson, Kasper, Schmelzer, Thomas, Boyd, Stephen
We propose a new method for finding statistical arbitrages that can contain more assets than just the traditional pair. We formulate the problem as seeking a portfolio with the highest volatility, subject to its price remaining in a band and a leverage limit. This optimization problem is not convex, but can be approximately solved using the convex-concave procedure, a specific sequential convex programming method. We show how the method generalizes to finding moving-band statistical arbitrages, where the price band midpoint varies over time.
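The convex-concave procedure named above can be sketched on a toy problem. The example below maximizes a portfolio variance x'Σx, which is convex in x and so makes the problem non-convex, by repeatedly linearizing the objective at the current point and maximizing the linearization over a convex set. As a loud simplification, the price band and leverage limit from the abstract are replaced by a hypothetical per-asset position limit |x_i| <= 1, chosen because the linearized subproblem then has a closed-form solution. The covariance matrix and starting point are made up for illustration, and this is not the authors' implementation.

```python
def quad_form(cov, x):
    """Portfolio variance x' * cov * x."""
    n = len(x)
    return sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))


def ccp_max_variance_box(cov, x0, max_iter=50):
    """Maximize x' cov x subject to |x_i| <= 1 by successive linearization.

    Each step maximizes the linearized objective grad' x over the box,
    whose solution is x_i = sign(grad_i); a zero gradient entry keeps x_i.
    """
    n = len(x0)
    x = list(x0)
    history = [quad_form(cov, x)]
    for _ in range(max_iter):
        grad = [2.0 * sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
        x_new = [1.0 if g > 0 else -1.0 if g < 0 else xi for g, xi in zip(grad, x)]
        if x_new == x:  # fixed point of the linearized subproblem
            break
        x = x_new
        history.append(quad_form(cov, x))
    return x, history


# Hypothetical covariance of two negatively correlated assets.
cov = [[1.0, -0.5], [-0.5, 1.0]]
weights, history = ccp_max_variance_box(cov, x0=[0.5, 0.2])
print(weights, history)
```

Because the objective is convex, each linearization is a global under-estimator, so the variance cannot decrease from one iterate to the next. The procedure finds a stationary point, not necessarily the global maximum. With the actual band and leverage constraints, each subproblem would be a convex program that needs a solver rather than a sign rule.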
Forecasting the movements of Bitcoin prices: an application of machine learning algorithms
Pabuccu, Hakan, Ongan, Serdar, Ongan, Ayse
Cryptocurrencies, such as Bitcoin, are among the most controversial and complex technological innovations in today's financial system. This study aims to forecast the movements of Bitcoin prices with a high degree of accuracy. To this end, four different Machine Learning (ML) algorithms are applied, namely, the Support Vector Machines (SVM), the Artificial Neural Network (ANN), the Naive Bayes (NB) and the Random Forest (RF), with the logistic regression (LR) as a benchmark model. In order to test these algorithms, a discrete dataset was created and used in addition to the existing continuous dataset. To evaluate algorithm performance, the F statistic, accuracy statistic, the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE) and the Root Absolute Error (RAE) metrics were used. The t test was used to compare the performances of the SVM, ANN, NB and RF with the performance of the LR. Empirical findings reveal that, in the continuous dataset, the RF has the highest forecasting performance and the NB the lowest. In the discrete dataset, by contrast, the ANN has the highest performance and the NB the lowest. Furthermore, the discrete dataset improves the overall forecasting performance of all algorithms (models) estimated.
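Three of the evaluation metrics listed above have standard textbook definitions and can be sketched in a few lines. The example assumes price movements are encoded as 1 (up) and 0 (down); the labels are hypothetical and are not taken from the study. RAE is omitted because the abstract does not define it.

```python
import math


def accuracy(actual, predicted):
    """Share of predictions that match the observed label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observation and prediction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical up (1) / down (0) movements over eight days.
actual = [1, 0, 1, 1, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(actual, predicted)
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
print(f"accuracy={acc:.2f} MAE={mae:.2f} RMSE={rmse:.2f}")
```

With 0/1 labels, MAE equals the misclassification rate (here 0.25) and RMSE is its square root (0.50), so on a discrete dataset these metrics carry the same information as accuracy.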
If there is one machine learning book you should read, it's this one!
Just getting started in machine learning? Wondering what material to pick from the flood of literature out there? I have just the right recommendation for you right here. Before we get started, some quick info about my background. I studied mechanical engineering a few years ago, then did a PhD in quantitative finance.
Data-driven Hedging of Stock Index Options via Deep Learning
Options hedging is an important problem in financial markets. The prevailing approach to hedging first assumes a parametric stochastic model for the dynamics of the underlying asset. The model is then calibrated to observed option prices from the market, based on which various sensitivities are computed and used to hedge the risk of options. Popular choices include local volatility models ([5]), stochastic volatility models ([15], [12], [8]), jump-diffusions and pure-jump processes ([4], [18], [20]). Despite the prevalence of the model-based approach, it is well understood that model risk can affect the hedging result significantly. Recently, a data-driven approach that doesn't rely on any stochastic model for the underlying asset has been proposed.
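To make the model-based workflow concrete, the sketch below computes one such sensitivity, the delta of a European call, under the Black-Scholes model. Black-Scholes is used here only because it is the simplest parametric model with a closed-form delta; it is not one of the models cited above, and the input values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_delta(spot, strike, rate, vol, maturity):
    """Black-Scholes delta of a European call on a non-dividend-paying asset."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (
        vol * math.sqrt(maturity)
    )
    return norm_cdf(d1)


# Hypothetical at-the-money call: 20% volatility, one year to expiry, zero rate.
delta = bs_call_delta(spot=100.0, strike=100.0, rate=0.0, vol=0.2, maturity=1.0)
print(f"delta={delta:.4f}")
```

A hedger who has sold this call would hold about 0.54 units of the underlying per option and rebalance as the delta changes. The hedge is only as good as the model's volatility assumption, which is the model risk the data-driven approach aims to avoid.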
Model and data lineage in machine learning experimentation
Modern quantitative finance is based around the approach of pattern recognition in historical data. This approach requires teams of scientists to work in a collaborative and regulated setting in order to develop models that can be used to make trading predictions. With the growing influence of this field, both participants and regulators are looking to put in place mechanisms to understand how and why models have been developed, for reasons such as regulatory compliance and model reproducibility. We refer to this tractability problem as lineage. The challenge of reproducibility and lineage in machine learning (ML) is three-fold: code lineage, data lineage, and model lineage.
Machine Learning in Finance: From Theory to Practice (ISBN 9783030410674)
Dixon, Matthew F., Halperin, Igor, Bilokon, Paul
This book introduces machine learning methods in finance. It presents a unified treatment of machine learning and various statistical and computational disciplines in quantitative finance, such as financial econometrics and discrete time stochastic control, with an emphasis on how theory and hypothesis tests inform the choice of algorithm for financial data modeling and decision making. With the trend towards increasing computational resources and larger datasets, machine learning has grown into an important skillset for the finance industry. This book is written for advanced graduate students and academics in financial econometrics, mathematical finance and applied statistics, in addition to quants and data scientists in the field of quantitative finance. Machine Learning in Finance: From Theory to Practice is divided into three parts, each part covering theory and applications.
Top 4 Books for AI Driven Investing
As AI and machine learning have regained popularity over the last two decades, so has interest in their application to financial prediction tasks. The two seem like a natural fit, as data generated by markets have been scrutinized by investors for over a century in hopes of forecasting their way to financial success. A casual survey of the associated literature reveals two broad approaches to the topic. In one corner sit the astute STEM practitioners, who view the task at hand as an engineering problem and prefer complex and novel architectures that minimize a nominated error metric. In the opposite corner resides the learned financial practitioner, who remains innately cognizant of the efficient market hypothesis (EMH) and the need for explainability, and therefore favors simpler models infused with domain insights.
Machine Learning: An Applied Mathematics Introduction (ISBN 9781916081604)
Wilmott, Paul
Paul Wilmott studied mathematics at St Catherine's College, Oxford, where he also received his D.Phil. He is the author of Paul Wilmott Introduces Quantitative Finance (Wiley 2007), Paul Wilmott On Quantitative Finance (Wiley 2006), Frequently Asked Questions in Quantitative Finance (Wiley 2009), The Money Formula (with David Orrell) (Wiley 2017) and other financial textbooks. He has written over 100 research articles on finance and mathematics. Paul Wilmott was a founding partner of the volatility arbitrage hedge fund Caissa Capital which managed $170 million. His responsibilities included forecasting, derivatives pricing, and risk management.