Mehta, Dhagash
Supervised Similarity for High-Yield Corporate Bonds with Quantum Cognition Machine Learning
Rosaler, Joshua, Candelori, Luca, Kirakosyan, Vahagn, Musaelian, Kharen, Samson, Ryan, Wells, Martin T., Mehta, Dhagash, Pasquali, Stefano
We investigate the application of quantum cognition machine learning (QCML), a novel paradigm for both supervised and unsupervised learning tasks rooted in the mathematical formalism of quantum theory, to distance metric learning in corporate bond markets. Compared to equities, corporate bonds are relatively illiquid, and both trade and quote data for these securities are sparse. A measure of distance/similarity among corporate bonds is therefore particularly useful for a variety of practical applications in the trading of illiquid bonds, including identifying similar tradable alternatives, pricing securities with relatively few recent quotes or trades, and explaining the predictions and performance of ML models based on their training data. Previous research has explored supervised similarity learning based on classical tree-based models in this context; here, we explore the application of the QCML paradigm to supervised distance metric learning in the same context, showing that it outperforms classical tree-based models in high-yield (HY) markets, while giving comparable or better performance (depending on the evaluation metric) in investment-grade (IG) markets.
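As an illustration of the kind of tree-based supervised similarity this work compares against (the QCML model itself is not sketched here), the following minimal Python example builds random-forest proximities and uses them to retrieve similar bonds; the features, target, and data are hypothetical stand-ins rather than the paper's setup.

```python
# Sketch of the classical tree-based baseline: supervised similarity for bonds
# via random-forest proximities. Data and feature names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                              # stand-in bond features (duration, OAS, ...)
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=500)    # stand-in target, e.g. spread

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
leaves = rf.apply(X)                                       # (n_bonds, n_trees) leaf indices

def most_similar(i, leaves, k=5):
    """Supervised similarity: fraction of trees in which two bonds share a leaf."""
    prox = np.mean(leaves == leaves[i], axis=1)            # proximity of every bond to bond i
    prox[i] = -np.inf                                      # exclude the query bond itself
    return np.argsort(-prox)[:k]

print(most_similar(0, leaves))                             # indices of the 5 most similar bonds
```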
A Comparative Study of DSPy Teleprompter Algorithms for Aligning Large Language Models Evaluation Metrics to Human Evaluation
Sarmah, Bhaskarjit, Dutta, Kriti, Grigoryan, Anna, Tiwari, Sachin, Pasquali, Stefano, Mehta, Dhagash
We argue that Declarative Self-improving Python (DSPy) optimizers offer a way to align large language model (LLM) prompts and their evaluations with human annotations. We present a comparative analysis of five teleprompter algorithms, namely Cooperative Prompt Optimization (COPRO), Multi-Stage Instruction Prompt Optimization (MIPRO), BootstrapFewShot, BootstrapFewShot with Optuna, and K-Nearest Neighbor Few Shot, within the DSPy framework with respect to their ability to align with human evaluations. As a concrete example, we focus on optimizing the prompt to align hallucination detection (using an LLM as a judge) with human-annotated ground-truth labels for a publicly available benchmark dataset. Our experiments demonstrate that optimized prompts can outperform various benchmark methods for detecting hallucination, and that certain teleprompters outperform the others, at least in these experiments.
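A hedged sketch of the kind of setup described above, using DSPy's BootstrapFewShot teleprompter to align an LLM-as-judge hallucination label with human annotations; the model name, dataset fields, and example rows are placeholders, and exact class or argument names may differ across DSPy versions.

```python
# Sketch only: aligning an LLM-as-judge hallucination detector to human labels
# with a DSPy teleprompter. API details may vary across DSPy versions.
import dspy
from dspy.teleprompt import BootstrapFewShot

# Configure the underlying LM (model name is a placeholder; an API key is assumed).
dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# LLM-as-judge module: given a context and an answer, label whether the answer hallucinates.
judge = dspy.ChainOfThought("context, answer -> hallucinated")

# Hypothetical human-annotated ground truth (stand-in for a benchmark dataset).
human_annotated_rows = [
    ("The report states revenue grew 5%.", "Revenue grew 15%.", "yes"),
    ("The report states revenue grew 5%.", "Revenue grew 5%.", "no"),
]
trainset = [
    dspy.Example(context=c, answer=a, hallucinated=label).with_inputs("context", "answer")
    for c, a, label in human_annotated_rows
]

# Alignment metric: agreement between the judge's label and the human label.
def agreement(example, prediction, trace=None):
    return example.hallucinated.strip().lower() == prediction.hallucinated.strip().lower()

teleprompter = BootstrapFewShot(metric=agreement, max_bootstrapped_demos=4)
compiled_judge = teleprompter.compile(judge, trainset=trainset)
```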
How to Choose a Threshold for an Evaluation Metric for Large Language Models
Sarmah, Bhaskarjit, Li, Mingshu, Lyu, Jingrao, Frank, Sebastian, Castellanos, Nathalia, Pasquali, Stefano, Mehta, Dhagash
To ensure that large language models (LLMs) behave reliably and to monitor them, various evaluation metrics have been proposed in the literature. However, there is little research prescribing a methodology for identifying a robust threshold on these metrics, even though an incorrect choice of threshold has serious implications when LLMs are deployed. Translating traditional model risk management (MRM) guidelines from regulated industries such as the financial industry, we propose a step-by-step recipe for picking a threshold for a given LLM evaluation metric. We emphasize that such a methodology should start by identifying the risks of the LLM application under consideration and the risk tolerance of its stakeholders. We then propose concrete and statistically rigorous procedures to determine a threshold for the given LLM evaluation metric using available ground-truth data. As a concrete example to demonstrate the proposed methodology at work, we apply it to the Faithfulness metric, as implemented in various publicly available libraries, using the publicly available HaluBench dataset. We also lay a foundation for creating systematic approaches to selecting thresholds, not only for LLMs but for any GenAI application.
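One illustrative, simplified step such a recipe might include is choosing the smallest metric threshold that meets a stakeholder-specified risk tolerance on held-out ground-truth data; the tolerance, labels, and scores below are assumptions for a sketch, not the paper's exact procedure.

```python
# Sketch: pick a threshold for an LLM evaluation metric (e.g., faithfulness scores)
# subject to a risk tolerance on false passes. Data and tolerance are hypothetical.
import numpy as np

def choose_threshold(scores, labels, max_false_pass_rate=0.05):
    """Return the smallest threshold whose false-pass rate (share of bad responses
    that still score at or above the threshold) meets the stated risk tolerance.
    scores: metric values in [0, 1]; labels: 1 = acceptable response, 0 = hallucinated."""
    bad_scores = scores[labels == 0]
    for t in np.sort(np.unique(scores)):
        if np.mean(bad_scores >= t) <= max_false_pass_rate:
            return t
    return 1.0  # no candidate meets the tolerance; fall back to the strictest cut

# Hypothetical ground-truth data standing in for a labeled benchmark.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=1000)
scores = np.clip(0.55 * labels + rng.normal(0.3, 0.15, size=1000), 0.0, 1.0)

print(choose_threshold(scores, labels, max_false_pass_rate=0.05))
```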
AI versus AI in Financial Crimes and Detection: GenAI Crime Waves to Co-Evolutionary AI
Kurshan, Eren, Mehta, Dhagash, Bruss, Bayan, Balch, Tucker
Adoption of AI by criminal entities across traditional and emerging financial crime paradigms has been a disturbing recent trend. Particularly concerning is the proliferation of generative AI, which has empowered criminal activities ranging from sophisticated phishing schemes to the creation of hard-to-detect deepfakes and advanced spoofing attacks on biometric authentication systems. The exploitation of AI for criminal purposes continues to escalate, presenting an unprecedented challenge. AI adoption is creating an increasingly complex landscape of fraud typologies intertwined with cybersecurity vulnerabilities. Overall, GenAI has a transformative effect on financial crimes and fraud; according to some estimates, GenAI will quadruple fraud losses by 2027, with a staggering annual growth rate of over 30% [27]. As crime patterns become more intricate, personalized, and elusive, deploying effective defensive AI strategies becomes indispensable. However, several challenges hinder the necessary progress of AI-based fincrime detection systems. This paper examines the latest trends in AI/ML-driven financial crimes and detection systems. It underscores the urgent need to develop agile AI defenses that can effectively counteract rapidly emerging threats, and it highlights the need for cooperation across the financial services industry to tackle GenAI-induced crime waves.
Towards Enhanced Local Explainability of Random Forests: a Proximity-Based Approach
Rosaler, Joshua, Desai, Dhruv, Sarmah, Bhaskarjit, Vamvourellis, Dimitrios, Onay, Deran, Mehta, Dhagash, Pasquali, Stefano
We introduce a novel approach to explaining the out-of-sample performance of random forest (RF) models by exploiting the fact that any RF can be formulated as an adaptive weighted k-nearest-neighbors model. Specifically, we use the proximity between points in the feature space learned by the RF to rewrite random forest predictions exactly as a weighted average of the target labels of training data points. This linearity facilitates a local notion of explainability of RF predictions that generates attributions for any model prediction across observations in the training set, and thereby complements established methods like SHAP, which instead generate attributions for a model prediction across dimensions of the feature space. We demonstrate this approach in the context of a bond pricing model trained on US corporate bond trades, and compare our approach to various existing approaches to model explainability.
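A minimal sketch of the identity described above, under the simplifying assumption of trees grown without bootstrapping so that the proximity-weighted average of training labels reproduces the forest prediction exactly; the data are synthetic and the variable names illustrative.

```python
# Sketch: rewrite a random-forest prediction as a proximity-weighted average of
# training labels. Assumes bootstrap=False so the identity holds exactly.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=300)

rf = RandomForestRegressor(n_estimators=100, bootstrap=False, random_state=0).fit(X, y)

train_leaves = rf.apply(X)            # (n_train, n_trees) leaf indices
x_test = rng.normal(size=(1, 5))
test_leaves = rf.apply(x_test)        # (1, n_trees)

# Per-tree weights: 1 / (leaf size) for training points sharing the test point's leaf.
same_leaf = train_leaves == test_leaves            # (n_train, n_trees) boolean
leaf_sizes = same_leaf.sum(axis=0)                 # training points per shared leaf, per tree
weights = (same_leaf / leaf_sizes).mean(axis=1)    # average over trees; sums to 1

proximity_pred = weights @ y
print(proximity_pred, rf.predict(x_test)[0])       # identical up to floating-point error
```

The weights above give the per-training-point attribution for this single prediction: they say which training observations the forest effectively averaged over and how heavily.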
Towards reducing hallucination in extracting information from financial reports using Large Language Models
Sarmah, Bhaskarjit, Zhu, Tianjie, Mehta, Dhagash, Pasquali, Stefano
For a financial analyst, the question-and-answer (Q&A) segment of a company's financial report is a crucial piece of information for various analysis and investment decisions. However, extracting valuable insights from the Q&A section has posed considerable challenges: conventional methods such as detailed reading and note-taking lack scalability and are susceptible to human error, while Optical Character Recognition (OCR) and similar techniques encounter difficulties in accurately processing unstructured transcript text, often missing subtle linguistic nuances that drive investor decisions. Here, we demonstrate the use of Large Language Models (LLMs) to efficiently and rapidly extract information from earnings-report transcripts with high accuracy, transforming the extraction process while reducing hallucination by combining a retrieval-augmented generation technique with metadata. We evaluate the outcomes of various LLMs with and without our proposed approach using various objective metrics for evaluating Q&A systems, and empirically demonstrate the superiority of our method.
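A hedged sketch of retrieval-augmented generation with a metadata filter over earnings-call chunks, in the spirit of the approach described above; the embedding model, chunking, and metadata fields are illustrative assumptions rather than the paper's implementation.

```python
# Sketch: RAG over earnings-call Q&A chunks with a metadata filter as a
# hallucination-reduction step. Model name and data are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = [
    {"text": "CFO: Gross margin expanded 120 bps driven by mix.", "quarter": "Q2-2023"},
    {"text": "CEO: We expect capex to moderate next year.", "quarter": "Q2-2023"},
    {"text": "CFO: FY2022 revenue grew 8% on a constant-currency basis.", "quarter": "Q4-2022"},
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode([c["text"] for c in chunks], normalize_embeddings=True)

def retrieve(question, quarter, k=2):
    """Return the top-k chunks for the question, restricted by quarter metadata."""
    idx = [i for i, c in enumerate(chunks) if c["quarter"] == quarter]
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = embeddings[idx] @ q
    ranked = [idx[i] for i in np.argsort(-scores)[:k]]
    return [chunks[i]["text"] for i in ranked]

context = retrieve("What did management say about margins?", quarter="Q2-2023")
prompt = "Answer only from the context below.\n" + "\n".join(context)
# `prompt` would then be passed to the LLM of choice.
```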
Quantifying Outlierness of Funds from their Categories using Supervised Similarity
Desai, Dhruv, Dhiman, Ashmita, Sharma, Tushar, Sharma, Deepika, Mehta, Dhagash, Pasquali, Stefano
Mutual fund categorization has become a standard tool for the investment management industry and is extensively used by allocators for portfolio construction and manager selection, as well as by fund managers for peer analysis and competitive positioning. As a result, an (unintended) miscategorization or lack of precision can significantly impact allocation decisions and investment fund managers. Here, we aim to quantify the effect of miscategorization of funds using a machine learning based approach. We formulate miscategorization of funds as a distance-based outlier detection problem, where outliers are data points that lie far from the rest of the data in the given feature space. We implement and employ a Random Forest (RF) based method of distance metric learning, and compute the so-called class-wise outlier measures for each data point to identify outliers in the data. We test our implementation on various publicly available datasets and then apply it to mutual fund data. We show that there is a strong relationship between the outlier measures of the funds and their future returns, and discuss the implications of our findings.
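A hedged sketch of a class-wise, distance-based outlier measure built from random-forest proximities, in the spirit of the approach described above; the data are synthetic and the normalization follows the classic randomForest-style outlier score rather than the paper's exact definition.

```python
# Sketch: class-wise outlier measure from random-forest proximities on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

leaves = rf.apply(X)                                            # (n_samples, n_trees)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)  # proximity matrix

# Raw outlier score: class size / sum of squared proximities to same-class points.
raw = np.empty(len(X))
for i in range(len(X)):
    same = (y == y[i]) & (np.arange(len(X)) != i)
    raw[i] = same.sum() / (np.sum(prox[i, same] ** 2) + 1e-12)

# Normalize within each class (category) by median and MAD.
outlier = np.empty_like(raw)
for c in np.unique(y):
    m = y == c
    med = np.median(raw[m])
    mad = np.median(np.abs(raw[m] - med)) + 1e-12
    outlier[m] = (raw[m] - med) / mad

print(np.argsort(-outlier)[:5])   # indices of the most outlying points per this measure
```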
Learning Mutual Fund Categorization using Natural Language Processing
Vamvourellis, Dimitrios, Toth, Mate Attila, Desai, Dhruv, Mehta, Dhagash, Pasquali, Stefano
Categorization of mutual funds or Exchange-Traded Funds (ETFs) has long served financial analysts in performing peer analysis for various purposes, from competitor analysis to quantifying portfolio diversification. These categorization systems go deeper than the broader asset-class-based classification (equity, fixed income, etc.) and provide further granular categories based on the portfolio breakdown. They have been used to identify the top-performing as well as the worst-performing funds within their peer groups (peer analysis of funds); to identify a home-grown fund to recommend against a competitor's fund; to explain similarities and advantages of home-grown products compared to competitors' products for marketing purposes; and to quantify the portfolio diversification of a given fund. The categorization methodology usually relies on fund composition data in the structured format extracted from Form N-1A. Here, we initiate a study to learn the categorization system directly from the unstructured data as depicted in the forms using natural language processing (NLP).
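As a hedged illustration of learning categories from unstructured strategy text, the following toy example fits a simple TF-IDF classifier; the strategy snippets, category labels, and model choice are placeholders, not the paper's data or architecture.

```python
# Sketch: learning fund categories from unstructured strategy text.
# Data, labels, and model choice are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

strategies = [
    "The fund invests primarily in large-cap US equities with a growth tilt.",
    "The fund seeks income by investing in investment-grade corporate bonds.",
    "The fund tracks a broad index of emerging-market equities.",
    "The fund invests in short-duration government and agency securities.",
]
categories = ["US Equity Large-Cap Growth", "Corporate Bond",
              "Emerging Markets Equity", "Short Government Bond"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1),
                    LogisticRegression(max_iter=1000))
clf.fit(strategies, categories)

print(clf.predict(["The fund holds high-quality corporate debt to generate income."]))
```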
Machine learning the real discriminant locus
Bernal, Edgar A., Hauenstein, Jonathan D., Mehta, Dhagash, Regan, Margaret H., Tang, Tingting
Parameterized systems of polynomial equations arise in many applications in science and engineering with the real solutions describing, for example, equilibria of a dynamical system, linkages satisfying design constraints, and scene reconstruction in computer vision. Since different parameter values can have a different number of real solutions, the parameter space is decomposed into regions whose boundary forms the real discriminant locus. This article views locating the real discriminant locus as a supervised classification problem in machine learning where the goal is to determine classification boundaries over the parameter space, with the classes being the number of real solutions. For multidimensional parameter spaces, this article presents a novel sampling method which carefully samples the parameter space. At each sample point, homotopy continuation is used to obtain the number of real solutions to the corresponding polynomial system. Machine learning techniques including nearest neighbor and deep learning are used to efficiently approximate the real discriminant locus. One application of having learned the real discriminant locus is to develop a real homotopy method that only tracks the real solution paths, unlike traditional methods which track all complex solution paths. Examples show that the proposed approach can efficiently approximate complicated solution boundaries such as those arising from the equilibria of the Kuramoto model.
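A hedged toy illustration of the classification setup described above: for the univariate cubic x^3 + p x + q, whose real discriminant locus is 4p^3 + 27q^2 = 0, we sample parameters, count real roots (standing in for homotopy-continuation root counts on larger systems), and fit a nearest-neighbor classifier; the sampling and model choices are illustrative.

```python
# Toy sketch: learn the real discriminant locus of x^3 + p*x + q = 0 as a
# classification problem (class = number of real roots).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
params = rng.uniform(-2, 2, size=(5000, 2))        # samples of (p, q)

def n_real_roots(p, q, tol=1e-8):
    """Count real roots of x^3 + p*x + q (numerical stand-in for homotopy continuation)."""
    roots = np.roots([1.0, 0.0, p, q])
    return int(np.sum(np.abs(roots.imag) < tol))

labels = np.array([n_real_roots(p, q) for p, q in params])

knn = KNeighborsClassifier(n_neighbors=7).fit(params, labels)

# Predicted class boundaries approximate the discriminant locus 4p^3 + 27q^2 = 0.
grid = np.array([[-1.5, 0.2], [1.0, 1.0], [-1.0, 0.0]])
print(knn.predict(grid))
print([n_real_roots(p, q) for p, q in grid])
```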
Machine Learning Fund Categorizations
Mehta, Dhagash, Desai, Dhruv, Pradeep, Jithin
Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds from various investment management firms and with various diversification strategies has become available in the market. Identifying similar mutual funds across such a wide landscape has become more important than ever because of many applications ranging from sales and marketing to portfolio replication, portfolio diversification, and tax-loss harvesting. The current best method is data-vendor-provided categorization, which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry-wide, well-regarded categorization system is learnable using machine learning and is largely reproducible, in turn constructing a truly data-driven categorization. We discuss the intellectual challenges in learning this man-made system, our results, and their implications.