Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences
Piotr Skalski, David Sutton, Stuart Burrell, Iker Perez, Jason Wong
arXiv.org Artificial Intelligence
Machine learning models underpin many modern financial systems for use cases such as fraud detection and churn prediction. Most are based on supervised learning with hand-engineered features, which relies heavily on the availability of labelled data. Large self-supervised generative models have shown tremendous success in natural language processing and computer vision, yet so far they haven't been adapted to multivariate time series of financial transactions. In this paper, we present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions. Benchmarks on public datasets demonstrate that it outperforms state-of-the-art self-supervised methods on a range of downstream tasks. We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions and apply it to the card fraud detection problem on hold-out datasets.

[...] Their rapid success has been in no small part due to the development of self-supervised learning (SSL) methods such as autoregressive [27] and masked [13] language modelling, which have allowed models to learn contextual representations of input tokens without relying on labels. While these methods have already been successfully used with different modalities such as natural language [4, 11, 22, 27, 28], computer vision [26, 30], audio [3, 12], and tabular data [1, 20, 31], there has been little work to adapt them to the case of multivariate time series data. One example of such data modality of particular interest in this work is streams of financial transactions: sequences of events representing transfers of funds between two entities. Each event can be described by a set of numerical or categorical features, such as the timestamp, card number, transaction amount, merchant [...]
Jan-4-2024
- Genre:
  - Research Report (0.51)
- Industry:
  - Law Enforcement & Public Safety > Fraud (1.00)
  - Banking & Finance (1.00)
  - Transportation > Air (0.93)
  - Consumer Products & Services > Travel (0.93)
  - Information Technology > Security & Privacy (0.88)