AITopics | Rosenberg, David

BloombergGPT: A Large Language Model for Finance

Wu, Shijie, Irsoy, Ozan, Lu, Steven, Dabravolski, Vadim, Dredze, Mark, Gehrmann, Sebastian, Kambadur, Prabhanjan, Rosenberg, David, Mann, Gideon

arXiv.org Artificial Intelligence, Dec-21-2023

The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in the literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. We release Training Chronicles (Appendix C) detailing our experience in training BloombergGPT.

large language model, machine learning, natural language, (19 more...)

2303.17564

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Maryland (0.27)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DP-TBART: A Transformer-based Autoregressive Model for Differentially Private Tabular Data Generation

Castellon, Rodrigo, Gopal, Achintya, Bloniarz, Brian, Rosenberg, David

arXiv.org Artificial Intelligence, Jul-19-2023

The generation of synthetic tabular data that preserves differential privacy is a problem of growing importance. While traditional marginal-based methods have achieved impressive results, recent work has shown that deep learning-based approaches tend to lag behind. In this work, we present Differentially-Private TaBular AutoRegressive Transformer (DP-TBART), a transformer-based autoregressive model that maintains differential privacy and achieves performance competitive with marginal-based methods on a wide variety of datasets, capable of even outperforming state-of-the-art methods in certain settings. We also provide a theoretical framework for understanding the limitations of marginal-based approaches and where deep learning-based approaches stand to contribute most. These results suggest that deep learning-based techniques should be considered as a viable alternative to marginal-based methods in the generation of differentially private synthetic tabular data.

artificial intelligence, deep learning, machine learning, (15 more...)

2307.1043

Country:

North America > United States > California > Santa Clara County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Zhang, Shiyue, Wu, Shijie, Irsoy, Ozan, Lu, Steven, Bansal, Mohit, Dredze, Mark, Rosenberg, David

arXiv.org Artificial Intelligence, May-26-2023

Autoregressive language models are trained by minimizing the cross-entropy of the model distribution Q relative to the data distribution P -- that is, minimizing the forward cross-entropy, which is equivalent to maximum likelihood estimation (MLE). We have observed that models trained in this way may "over-generalize", in the sense that they produce non-human-like text. Moreover, we believe that reverse cross-entropy, i.e., the cross-entropy of P relative to Q, is a better reflection of how a human would evaluate text generated by a model. Hence, we propose learning with MixCE, an objective that mixes the forward and reverse cross-entropies. We evaluate models trained with this objective on synthetic data settings (where P is known) and real data, and show that the resulting models yield better generated text without complex decoding strategies. Our code and models are publicly available at https://github.com/bloomberg/mixce-acl2023
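The mixed objective described in this abstract can be illustrated on a toy problem where P is known. The following is a minimal sketch, not the authors' implementation: the mixing weight `eta`, the clipping constant `eps`, and the example distributions are all assumptions made for illustration.

```python
import math

def cross_entropy(weights, model, eps=1e-12):
    """Return -sum_x weights(x) * log model(x), skipping zero-weight outcomes.

    `eps` clips zero probabilities so the logarithm stays finite (an assumption
    of this sketch, not something stated in the abstract).
    """
    return -sum(w * math.log(max(m, eps)) for w, m in zip(weights, model) if w > 0)

def mixce(p, q, eta=0.5):
    """Mix forward and reverse cross-entropies with a hypothetical weight eta."""
    forward = cross_entropy(p, q)  # expectation under P of -log Q; what MLE minimizes
    reverse = cross_entropy(q, p)  # expectation under Q of -log P
    return eta * forward + (1.0 - eta) * reverse

# Toy setting in which the data distribution P is known exactly.
P = [0.5, 0.5, 0.0]          # the third outcome never occurs in the data
Q_spread = [0.4, 0.4, 0.2]   # model that leaks probability onto the unseen outcome
Q_tight = [0.5, 0.5, 0.0]    # model concentrated on what the data supports

loss_spread = mixce(P, Q_spread)
loss_tight = mixce(P, Q_tight)
print(loss_spread > loss_tight)
```

In this toy case the reverse term is what heavily penalizes `Q_spread`, because it places mass where P is zero; the forward term alone penalizes that leakage only mildly.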

justification, machine learning, natural language, (20 more...)

2305.16958

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Visual Attention Model for Cross-Sectional Stock Return Prediction and End-to-End Multimodal Market Representation Learning

AAAI Conferences, May-15-2019

Technical and fundamental analysis are traditional tools used to analyze individual stocks; however, the finance literature has shown that the price movement of each individual stock correlates heavily with other stocks, especially those within the same sector. In this paper we propose a general-purpose market representation that incorporates fundamental and technical indicators and relationships between individual stocks. We treat the daily stock market as a ‘market image’ where rows (grouped by market sector) represent individual stocks and columns represent indicators. We apply a convolutional neural network over this market image to build market features in a hierarchical way. We use a recurrent neural network, with an attention mechanism over the market feature maps, to model temporal dynamics in the market. We show that our proposed model outperforms strong baselines in both short-term and long-term stock return prediction tasks. We also show another use for our market image: to construct concise and dense market embeddings suitable for downstream prediction tasks.
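The 'market image' layout described above (rows are stocks grouped by sector, columns are indicators) can be sketched as a plain matrix. This is only an illustration of the data layout under assumed names: the tickers, sectors, indicator names, and values below are invented, and the function `build_market_image` is hypothetical.

```python
def build_market_image(stocks, indicator_names):
    """Return (row_labels, matrix) with rows ordered by sector, then ticker.

    Sorting by sector first keeps stocks of the same sector in adjacent rows,
    which is the grouping the abstract describes.
    """
    ordered = sorted(stocks, key=lambda s: (s["sector"], s["ticker"]))
    labels = [s["ticker"] for s in ordered]
    matrix = [[s["indicators"][name] for name in indicator_names] for s in ordered]
    return labels, matrix

# Invented one-day snapshot; all names and numbers are placeholders.
indicator_names = ["return_1d", "pe_ratio"]
stocks = [
    {"ticker": "TECH1", "sector": "Technology", "indicators": {"return_1d": 0.012, "pe_ratio": 24.0}},
    {"ticker": "BANK1", "sector": "Financials", "indicators": {"return_1d": -0.004, "pe_ratio": 11.5}},
    {"ticker": "TECH2", "sector": "Technology", "indicators": {"return_1d": 0.020, "pe_ratio": 31.2}},
    {"ticker": "BANK2", "sector": "Financials", "indicators": {"return_1d": 0.001, "pe_ratio": 9.8}},
]

labels, image = build_market_image(stocks, indicator_names)
for label, row in zip(labels, image):
    print(label, row)
```

A stack of such matrices, one per trading day, is the kind of input a convolutional network followed by a recurrent network could consume.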

deep learning, market image, neural network, (21 more...)

The Thirty-Second International FLAIRS Conference

Country:

North America > United States (0.14)
Asia > Middle East > Iran (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Grey-Box Fuzzing by Modeling Program Behavior

Karamcheti, Siddharth, Mann, Gideon, Rosenberg, David

arXiv.org Artificial Intelligence, Nov-21-2018

Grey-box fuzzers such as American Fuzzy Lop (AFL) are popular tools for finding bugs and potential vulnerabilities in programs. While these fuzzers have been able to find vulnerabilities in many widely used programs, they are not efficient; of the millions of inputs executed by AFL in a typical fuzzing run, only a handful discover unseen behavior or trigger a crash. The remaining inputs are redundant, exhibiting behavior that has already been observed. Here, we present an approach to increase the efficiency of fuzzers like AFL by applying machine learning to directly model how programs behave. We learn a forward prediction model that maps program inputs to execution traces, training on the thousands of inputs collected during standard fuzzing. This learned model guides exploration by focusing on fuzzing inputs on which our model is the most uncertain (measured via the entropy of the predicted execution trace distribution). By focusing on executing inputs our learned model is unsure about, and ignoring any input whose behavior our model is certain about, we show that we can significantly limit wasteful execution. Through testing our approach on a set of binaries released as part of the DARPA Cyber Grand Challenge, we show that our approach is able to find a set of inputs that result in more code coverage and discovered crashes than baseline fuzzers with significantly fewer executions.
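The selection rule in this abstract, executing only the inputs whose predicted execution trace distribution has high entropy, can be sketched as follows. This is a simplified illustration, not the paper's system: the prediction dictionary stands in for the output of a learned model, and the function names and the budget `k` are assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_most_uncertain(predictions, k=1):
    """Return the k input names whose predicted trace distribution has the highest entropy."""
    ranked = sorted(predictions, key=lambda name: entropy(predictions[name]), reverse=True)
    return ranked[:k]

# Hypothetical model output: for each candidate input, a distribution over
# three possible execution traces.
predictions = {
    "input_a": [0.98, 0.01, 0.01],  # model is nearly certain; likely redundant to run
    "input_b": [0.34, 0.33, 0.33],  # model is very unsure
    "input_c": [0.70, 0.20, 0.10],
}

to_execute = pick_most_uncertain(predictions, k=2)
print(to_execute)
```

Inputs left out of `to_execute` are the ones the model is confident about, which is where the abstract locates the wasted executions.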

artificial intelligence, execution, machine learning, (15 more...)

1811.08973

Country: North America > United States (0.50)

Genre: Research Report > New Finding (0.71)

Industry:

Information Technology > Security & Privacy (0.46)
Government > Military (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Adaptive Grey-Box Fuzz-Testing with Thompson Sampling

Karamcheti, Siddharth, Mann, Gideon, Rosenberg, David

arXiv.org Artificial Intelligence, Aug-24-2018

Fuzz testing, or "fuzzing," refers to a widely deployed class of techniques for testing programs by generating a set of inputs for the express purpose of finding bugs and identifying security flaws. Grey-box fuzzing, the most popular fuzzing strategy, combines light program instrumentation with a data-driven process to generate new program inputs. In this work, we present a machine learning approach that builds on AFL, the preeminent grey-box fuzzer, by adaptively learning a probability distribution over its mutation operators on a program-specific basis. These operators, which are selected uniformly at random in AFL and mutational fuzzers in general, dictate how new inputs are generated, a core part of the fuzzer's efficacy. Our main contributions are two-fold: First, we show that a sampling distribution over mutation operators estimated from training programs can significantly improve performance of AFL. Second, we introduce a Thompson Sampling, bandit-based optimization approach that fine-tunes the mutator distribution adaptively, during the course of fuzzing an individual program. A set of experiments across complex programs demonstrates that tuning the mutational operator distribution generates sets of inputs that yield significantly higher code coverage and finds more crashes faster and more reliably than both baseline versions of AFL as well as other AFL-based learning approaches.
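The bandit idea in this abstract can be illustrated with a generic Beta-Bernoulli Thompson Sampling loop over mutation operators. This is a sketch under simplifying assumptions, not the paper's method: the reward is modeled as a coin flip with a fixed hidden success rate per operator, standing in for "the mutated input reached new coverage", and all names and rates are invented.

```python
import random

def thompson_select(successes, failures, rng):
    """Sample a plausible success rate per operator from its Beta posterior; pick the best."""
    samples = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def run_bandit(true_rates, rounds, seed=0):
    """Simulate fuzzing rounds in which each operator succeeds with a hidden fixed rate."""
    rng = random.Random(seed)
    n = len(true_rates)
    successes, failures = [0] * n, [0] * n
    for _ in range(rounds):
        op = thompson_select(successes, failures, rng)
        # Stand-in for: mutate an input with operator `op`, run it, check for new coverage.
        if rng.random() < true_rates[op]:
            successes[op] += 1
        else:
            failures[op] += 1
    return successes, failures

# Hypothetical hidden usefulness of three mutation operators.
true_rates = [0.05, 0.10, 0.60]
successes, failures = run_bandit(true_rates, rounds=2000)
pulls = [s + f for s, f in zip(successes, failures)]
print(pulls)
```

Compared with choosing operators uniformly at random, the loop shifts most trials toward the operator that has paid off most often while still occasionally sampling the others.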

artificial intelligence, optimization problem, thompson, (18 more...)

doi: 10.1145/3270101.3270108

1808.08256

Country:

North America > United States (0.30)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Scatteract: Automated extraction of data from scatter plots

Cliche, Mathieu, Rosenberg, David, Madeka, Dhruv, Yee, Connie

arXiv.org Machine Learning, Apr-21-2017

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
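The pixel-to-coordinate step mentioned in this abstract can be sketched for one linear axis. The abstract says only "robust regression"; the Theil-Sen estimator used below is a stand-in chosen for this illustration, not necessarily the estimator the authors used, and the tick positions and OCR readings are invented.

```python
from itertools import combinations
from statistics import median

def theil_sen(xs, ys):
    """Robust line fit: median of pairwise slopes, then median of residual intercepts."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

# Hypothetical x-axis ticks: pixel positions and the values OCR read for them.
# The third label was misread as 200 instead of 20.
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [0, 10, 200, 30, 40]

slope, intercept = theil_sen(tick_pixels, tick_values)

def pixel_to_value(px):
    """Map a pixel x-position to chart coordinates using the fitted axis."""
    return slope * px + intercept

print(pixel_to_value(350))
```

An ordinary least-squares fit would be pulled far off by the single misread tick, which is why a robust estimator suits OCR output.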

deep learning, neural network, scatteract, (22 more...)

1704.06687

Country: North America > United States > New York (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
