AITopics | Altman, Erik

Collaborating Authors

Altman, Erik

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Graph Feature Preprocessor: Real-time Extraction of Subgraph-based Features from Transaction Graphs

Blanuša, Jovan, Baraja, Maximo Cravero, Anghel, Andreea, von Niederhäusern, Luc, Altman, Erik, Pozidis, Haris, Atasu, Kubilay

arXiv.org Artificial IntelligenceFeb-13-2024

In this paper, we present "Graph Feature Preprocessor", a software library for detecting typical money laundering and fraud patterns in financial transaction graphs in real time. These patterns are used to produce a rich set of transaction features for downstream machine learning training and inference tasks such as money laundering detection. We show that our enriched transaction features dramatically improve the prediction accuracy of gradient-boosting-based machine learning models. Our library exploits multicore parallelism, maintains a dynamic in-memory graph, and efficiently mines subgraph patterns in the incoming transaction stream, which enables it to be operated in a streaming manner. We evaluate our library using highly-imbalanced synthetic anti-money laundering (AML) and real-life Ethereum phishing datasets. In these datasets, the proportion of illicit transactions is very small, which makes the learning process challenging. Our solution, which combines our Graph Feature Preprocessor and gradient-boosting-based machine learning models, is able to detect these illicit transactions with higher minority-class F1 scores than standard graph neural networks. In addition, the end-to-end throughput rate of our solution executed on a multicore CPU outperforms the graph neural network baselines executed on a powerful V100 GPU. Overall, the combination of high accuracy, a high throughput rate, and low latency of our solution demonstrates the practical value of our library in real-world applications. Graph Feature Preprocessor has been integrated into IBM mainframe software products, namely "IBM Cloud Pak for Data on Z" and "AI Toolkit for IBM Z and LinuxONE".

artificial intelligence, machine learning, transaction, (15 more...)

arXiv.org Artificial Intelligence

2402.08593

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Realistic Synthetic Financial Transactions for Anti-Money Laundering Models

Altman, Erik, Blanuša, Jovan, von Niederhäusern, Luc, Egressy, Béni, Anghel, Andreea, Atasu, Kubilay

arXiv.org Artificial IntelligenceJan-25-2024

With the widespread digitization of finance and the increasing popularity of cryptocurrencies, the sophistication of fraud schemes devised by cybercriminals is growing. Money laundering -- the movement of illicit funds to conceal their origins -- can cross bank and national boundaries, producing complex transaction patterns. The UN estimates 2-5\% of global GDP or \$0.8 - \$2.0 trillion dollars are laundered globally each year. Unfortunately, real data to train machine learning models to detect laundering is generally not available, and previous synthetic data generators have had significant shortcomings. A realistic, standardized, publicly-available benchmark is needed for comparing models and for the advancement of the area. To this end, this paper contributes a synthetic financial transaction dataset generator and a set of synthetically generated AML (Anti-Money Laundering) datasets. We have calibrated this agent-based generator to match real transactions as closely as possible and made the datasets public. We describe the generator in detail and demonstrate how the datasets generated can help compare different machine learning models in terms of their AML abilities. In a key way, using synthetic data in these comparisons can be even better than using real data: the ground truth labels are complete, whilst many laundering transactions in real data are never detected.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2306.16424

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)
(3 more...)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Provably Powerful Graph Neural Networks for Directed Multigraphs

Egressy, Béni, von Niederhäusern, Luc, Blanusa, Jovan, Altman, Erik, Wattenhofer, Roger, Atasu, Kubilay

arXiv.org Artificial IntelligenceJan-4-2024

This paper analyses a set of simple adaptations that transform standard message-passing Graph Neural Networks (GNN) into provably powerful directed multigraph neural networks. The adaptations include multigraph port numbering, ego IDs, and reverse message passing. We prove that the combination of these theoretically enables the detection of any directed subgraph pattern. To validate the effectiveness of our proposed adaptations in practice, we conduct experiments on synthetic subgraph detection tasks, which demonstrate outstanding performance with almost perfect results. Moreover, we apply our proposed adaptations to two financial crime analysis tasks. We observe dramatic improvements in detecting money laundering transactions, improving the minority-class F1 score of a standard message-passing GNN by up to 30%, and closely matching or outperforming tree-based and GNN baselines. Similarly impressive results are observed on a real-world phishing detection dataset, boosting three standard GNNs' F1 scores by around 15% and outperforming all baselines.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

2306.11586

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)
Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Tabular Transformers for Modeling Multivariate Time Series

Padhi, Inkit, Schiff, Yair, Melnyk, Igor, Rigotti, Mattia, Mroueh, Youssef, Dognin, Pierre, Ross, Jerret, Nair, Ravi, Altman, Erik

arXiv.org Artificial IntelligenceNov-3-2020

Tabular datasets are ubiquitous across many industries, especially in vital sectors such as healthcare and finance. Such industrial datasets often contain sensitive information, raising privacy and confidentiality issues that preclude their public release and limit their analysis to methods that are compatible with an appropriate anonymization process. We can distinguish between two types of tabular data: static tabular data that corresponds to independent rows in a table, and dynamic tabular data that corresponds to tabular time series, also referred to also as multivariate time series. The machine learning and deep learning communities have devoted considerable effort to learning from static tabular data, as well as generating synthetic static tabular data that can be released as a privacy compliant surrogate of the original data. On the other hand, less effort has been devoted to the more challenging dynamic case, where it is important to also account for the temporal component of the data.

deep learning, neural network, transaction, (20 more...)

arXiv.org Artificial Intelligence

2011.01843

Country:

Asia (0.14)
Europe > Italy (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.88)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Understanding AI Data Repositories with Automatic Query Generation

Altman, Erik

arXiv.org Artificial IntelligenceApr-20-2018

We describe a set of techniques to generate queries automatically based on one or more ingested, input corpuses. These queries require no a priori domain knowledge, and hence no human domain experts. Thus, these auto-generated queries help address the epistemological question of how we know what we know, or more precisely in this case, how an AI system with ingested data knows what it knows. These auto-generated queries can also be used to identify and remedy problem areas in ingested material -- areas for which the knowledge of the AI system is incomplete or even erroneous. Similarly, the proposed techniques facilitate tests of AI capability -- both in terms of coverage and accuracy. By removing humans from the main learning loop, our approach also allows more effective scaling of AI and cognitive capabilities to provide (1) broader coverage in a single domain such as health or geology; and (2) more rapid deployment to new domains. The proposed techniques also allow ingested knowledge to be extended naturally. Our investigations are early, and this paper provides a description of the techniques. Assessment of their efficacy is our next step for future work.

information management, query, social media, (19 more...)

arXiv.org Artificial Intelligence

1804.07819

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback