AITopics | Gomes, Heitor Murilo

CapyMOA: Efficient Machine Learning for Data Streams in Python

Gomes, Heitor Murilo, Lee, Anton, Gunasekara, Nuwan, Sun, Yibin, Cassales, Guilherme Weigert, Liu, Justin, Heyden, Marco, Cerqueira, Vitor, Bahri, Maroua, Koh, Yun Sing, Pfahringer, Bernhard, Bifet, Albert

arXiv.org Artificial Intelligence, Feb-11-2025

CapyMOA is an open-source library designed for efficient machine learning on streaming data. It provides a structured framework for real-time learning and evaluation, featuring a flexible data representation. CapyMOA includes an extensible architecture that allows integration with external frameworks such as MOA and PyTorch, facilitating hybrid learning approaches that combine traditional online algorithms with deep learning techniques. By emphasizing adaptability, scalability, and usability, CapyMOA allows researchers and practitioners to tackle dynamic learning challenges across various domains.

artificial intelligence, evaluation, machine learning, (14 more...)

2502.07432

Country:

Europe (0.94)
Oceania > New Zealand > North Island (0.32)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Evaluation for Regression Analyses on Evolving Data Streams

Sun, Yibin, Gomes, Heitor Murilo, Pfahringer, Bernhard, Bifet, Albert

arXiv.org Artificial Intelligence, Feb-10-2025

The paper explores the challenges of regression analysis in evolving data streams, an area that remains relatively underexplored compared to classification. We propose a standardized evaluation process for regression and prediction interval tasks in streaming contexts. Additionally, we introduce an innovative drift simulation strategy capable of synthesizing various drift types, including the less-studied incremental drift. Comprehensive experiments with state-of-the-art methods, conducted under the proposed process, validate the effectiveness and robustness of our approach.
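
Incremental drift, as mentioned in the abstract above, can be illustrated with a small synthetic regression stream. The sketch below is not the paper's simulation strategy; it assumes a linear target whose weight vector is interpolated from an old concept to a new one over a fixed number of instances. The names `drift_weight` and `incremental_drift_stream` and all parameters are hypothetical.

```python
import random


def drift_weight(t, start, width):
    """Fraction of the new concept in effect at time t.

    Returns 0.0 before the drift starts, 1.0 once it has finished,
    and a linear ramp in between. Assumes width > 0.
    """
    return min(1.0, max(0.0, (t - start) / width))


def incremental_drift_stream(n, start, width, w_old, w_new, noise=0.1, seed=0):
    """Yield n (x, y) pairs whose linear concept moves from w_old to w_new.

    Illustrative assumption: y = w(t) . x + Gaussian noise, where w(t) is a
    convex combination of the old and new weight vectors.
    """
    rng = random.Random(seed)
    for t in range(n):
        alpha = drift_weight(t, start, width)
        # Blend the two concepts according to the current drift progress.
        w = [(1 - alpha) * a + alpha * b for a, b in zip(w_old, w_new)]
        x = [rng.uniform(-1.0, 1.0) for _ in w]
        y = sum(wi * xi for wi, xi in zip(w, x)) + rng.gauss(0.0, noise)
        yield x, y


# Example: the target depends on feature 0 at first and on feature 1 at the end.
stream = list(incremental_drift_stream(300, start=100, width=50,
                                       w_old=[1.0, 0.0], w_new=[0.0, 1.0]))
```

A small `width` approximates an abrupt drift, while a large one gives a slow incremental change, so the same generator can cover both cases in an evaluation.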

artificial intelligence, data mining, machine learning, (16 more...)

2502.07213

Country: Oceania > New Zealand > North Island > Waikato (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

CLOFAI: A Dataset of Real And Fake Image Classification Tasks for Continual Learning

Doherty, William, Lee, Anton, Gomes, Heitor Murilo

arXiv.org Artificial Intelligence, Jan-19-2025

The rapid advancement of generative AI models capable of creating realistic media has led to a need for classifiers that can accurately distinguish between genuine and artificially generated images. A significant challenge for these classifiers emerges when they encounter images from generative models that are not represented in their training data, usually resulting in diminished performance. A typical approach is to periodically update the classifier's training data with images from the new generative models and then retrain the classifier on the updated dataset. However, in some real-life scenarios, storage, computational, or privacy constraints render this approach impractical. Additionally, models used in security applications may be required to adapt rapidly. In these circumstances, continual learning provides a promising alternative, as the classifier can be updated without retraining on the entire dataset. In this paper, we introduce a new dataset called CLOFAI (Continual Learning On Fake and Authentic Images), which takes the form of a domain-incremental image classification problem. Moreover, we showcase the applicability of this dataset as a benchmark for evaluating continual learning methodologies. In doing this, we set a baseline on our novel dataset using three foundational continual learning methods -- EWC, GEM, and Experience Replay -- and find that EWC performs poorly, while GEM and Experience Replay show promise, performing significantly better than a Naive baseline. The dataset and code to run the experiments can be accessed from the following GitHub repository: https://github.com/Will-Doherty/CLOFAI.
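
Experience Replay, one of the baselines named in the abstract above, keeps a bounded memory of past examples and mixes them into later training batches. The sketch below shows one common way to fill such a memory, reservoir sampling. The abstract does not say how the CLOFAI experiments populate their buffer, so this choice is an assumption, and the `ReplayBuffer` class is hypothetical.

```python
import random


class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling (an assumed strategy).

    Every example seen so far has the same probability of being in the
    buffer, regardless of when it arrived.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always store the example.
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen,
            # overwriting a uniformly chosen slot.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        return self._rng.sample(self.items, min(k, len(self.items)))


# Example: stream 1000 (hypothetical) example ids through a 50-slot buffer,
# then build a training batch from new data plus replayed data.
buffer = ReplayBuffer(capacity=50)
for example_id in range(1000):
    buffer.add(example_id)
new_batch = [1000, 1001, 1002, 1003]
mixed_batch = new_batch + buffer.sample(4)
```

Training on `mixed_batch` rather than `new_batch` alone is what lets the classifier revisit earlier generative models without storing or retraining on the full dataset.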

accuracy, machine learning, natural language, (14 more...)

2501.11140

Country:

Europe (0.93)
North America > United States > California > Los Angeles County > Long Beach (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.75)

Look At Me, No Replay! SurpriseNet: Anomaly Detection Inspired Class Incremental Learning

Lee, Anton, Zhang, Yaqian, Gomes, Heitor Murilo, Bifet, Albert, Pfahringer, Bernhard

arXiv.org Artificial Intelligence, Oct-30-2023

Continual learning aims to create artificial neural networks capable of accumulating knowledge and skills through incremental training on a sequence of tasks. The main challenge of continual learning is catastrophic interference, wherein new knowledge overrides or interferes with past knowledge, leading to forgetting. An associated issue is the problem of learning "cross-task knowledge," where models fail to acquire and retain knowledge that helps differentiate classes across task boundaries. A common solution to both problems is "replay," where a limited buffer of past instances is utilized to learn cross-task knowledge and mitigate catastrophic interference. However, a notable drawback of these methods is their tendency to overfit the limited replay buffer. In contrast, our proposed solution, SurpriseNet, addresses catastrophic interference by employing a parameter isolation method and learning cross-task knowledge using an auto-encoder inspired by anomaly detection. SurpriseNet is applicable to both structured and unstructured data, as it does not rely on image-specific inductive biases. We have conducted empirical experiments demonstrating the strengths of SurpriseNet on various traditional vision continual-learning benchmarks, as well as on structured datasets. Source code is available at https://doi.org/10.5281/zenodo.8247906 and https://github.com/tachyonicClock/SurpriseNet-CIKM-23.

artificial intelligence, machine learning, surprisenet, (13 more...)

doi: 10.1145/3583780.3615236

2310.20052

Country:

North America > United States (0.47)
Europe > United Kingdom (0.29)
Oceania > New Zealand > North Island > Waikato (0.16)

Genre: Research Report (0.65)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Machine Learning (In)Security: A Stream of Problems

Ceschin, Fabrício, Botacin, Marcus, Bifet, Albert, Pfahringer, Bernhard, Oliveira, Luiz S., Gomes, Heitor Murilo, Grégio, André

arXiv.org Artificial Intelligence, Sep-4-2023

Machine Learning (ML) has been widely applied to cybersecurity and is considered state-of-the-art for solving many of the open issues in that field. However, it is very difficult to evaluate how good the produced solutions are, since the challenges faced in security may not appear in other areas. One of these challenges is concept drift, which intensifies the existing arms race between attackers and defenders: malicious actors can always create novel threats to overcome the defense solutions, which in some approaches may not account for them. For this reason, it is essential to know how to properly build and evaluate an ML-based security solution. In this paper, we identify, detail, and discuss the main challenges in the correct application of ML techniques to cybersecurity data. We evaluate how concept drift, evolution, delayed labels, and adversarial ML impact the existing solutions. Moreover, we address how issues related to data collection affect the quality of the results presented in the security literature, showing that new strategies are needed to improve current solutions. Finally, we present how existing solutions may fail under certain circumstances and propose mitigations for them, presenting a novel checklist to help the development of future ML solutions for cybersecurity.

artificial intelligence, deep learning, machine learning, (17 more...)

doi: 10.1145/3617897

2010.16045

Country:

Europe (1.00)
North America > United States > New York > New York County > New York City (0.14)
Oceania > New Zealand > North Island > Waikato (0.14)
(3 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.91)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Advances on Concept Drift Detection in Regression Tasks using Social Networks Theory

Barddal, Jean Paul, Gomes, Heitor Murilo, Enembreck, Fabrício

arXiv.org Artificial Intelligence, Apr-19-2023

Mining data streams is one of the main areas of study in machine learning because of its applications in many fields. One of the major challenges in mining data streams is concept drift, which requires the learner to discard the current concept and adapt to a new one. Ensemble-based drift detection algorithms have been applied successfully to classification, but they usually maintain a fixed-size ensemble of learners, running the risk of needlessly spending processing time and memory. In this paper, we present improvements to the Scale-free Network Regressor (SFNR), a dynamic ensemble-based method for regression that employs social networks theory. To detect concept drifts, SFNR uses the Adaptive Window (ADWIN) algorithm. Results show improvements in accuracy, especially in concept drift situations, and better performance compared to other state-of-the-art algorithms on both real and synthetic data.
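
The ADWIN detector mentioned in the abstract above keeps a window of recent values and drops its oldest part whenever two sub-windows have significantly different means. The sketch below is a simplified illustration of that idea, not the SFNR implementation: it stores every value and checks every split point, whereas the published algorithm compresses the window into exponential buckets. It assumes inputs in [0, 1] (for example, a normalised error), and the class name `SimpleAdwin` and its parameters are hypothetical.

```python
import math
from collections import deque


class SimpleAdwin:
    """Simplified adaptive-window change detector (illustrative only)."""

    def __init__(self, delta=0.002, max_window=500, min_window=10):
        self.delta = delta            # confidence parameter of the test
        self.min_window = min_window  # do not test very short windows
        self.window = deque(maxlen=max_window)

    def update(self, value):
        """Add a value in [0, 1]; return True if a change was detected."""
        self.window.append(value)
        changed = False
        while self._shrink():
            changed = True
        return changed

    def _shrink(self):
        """Drop the older sub-window at the first split that differs enough."""
        n = len(self.window)
        if n < self.min_window:
            return False
        total = sum(self.window)
        delta_prime = self.delta / n
        head_sum = 0.0
        cut = None
        for n0, value in enumerate(self.window, start=1):
            if n0 == n:
                break
            head_sum += value
            n1 = n - n0
            # Harmonic-mean sample size and Hoeffding-style threshold.
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(head_sum / n0 - (total - head_sum) / n1) > eps_cut:
                cut = n0
                break
        if cut is None:
            return False
        for _ in range(cut):
            self.window.popleft()
        return True


# Example: the monitored value jumps from 0.0 to 1.0 at index 200.
detector = SimpleAdwin()
signal = [0.0] * 200 + [1.0] * 200
detections = [i for i, v in enumerate(signal) if detector.update(v)]
```

After a detection, the window holds only values from the new regime, which is the property an ensemble method can use to decide when to reset or replace a learner.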

algorithm, artificial intelligence, machine learning, (19 more...)

doi: 10.4018/ijncr.2015010102

2304.09788

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Services (0.62)
Banking & Finance > Trading (0.47)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

STUDD: A Student-Teacher Method for Unsupervised Concept Drift Detection

Cerqueira, Vitor, Gomes, Heitor Murilo, Bifet, Albert, Torgo, Luis

arXiv.org Machine Learning, Mar-1-2021

Concept drift detection is a crucial task in evolving data stream environments. Most state-of-the-art approaches designed to tackle this problem monitor the loss of predictive models. However, this approach falls short in many real-world scenarios, where the true labels are not readily available to compute the loss. In this context, there is increasing attention to approaches that perform concept drift detection in an unsupervised manner, i.e., without access to the true labels. We propose a novel approach to unsupervised concept drift detection based on a student-teacher learning paradigm. Essentially, we create an auxiliary model (student) to mimic the behaviour of the primary model (teacher). At run-time, our approach is to use the teacher for predicting new instances and to monitor the mimicking loss of the student for concept drift detection. In a set of experiments using 19 data streams, we show that the proposed approach can detect concept drift and performs competitively with state-of-the-art approaches.
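
The student-teacher idea described in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the teacher is a fixed hand-written rule standing in for a trained model, the student is a one-feature threshold fitted to the teacher's outputs, and drift is flagged by a fixed-margin sliding-window rule rather than whichever detector the paper uses. All function names and parameters are hypothetical.

```python
import random
from collections import deque


def teacher(x):
    """Hypothetical primary model; stands in for any trained classifier."""
    return int(x[0] + 0.5 * x[1] ** 2 > 0.5)


def fit_student(inputs, teacher_fn):
    """Fit a threshold on feature 0 that best imitates the teacher.

    Returns the student and its mimicking loss (disagreement rate) on the
    inputs it was fitted on. No true labels are used.
    """
    targets = [teacher_fn(x) for x in inputs]

    def disagreements(threshold):
        return sum(int(x[0] > threshold) != t for x, t in zip(inputs, targets))

    best = min((x[0] for x in inputs), key=disagreements)
    return (lambda x: int(x[0] > best)), disagreements(best) / len(inputs)


def monitor(stream, teacher_fn, student_fn, baseline, window=200, margin=0.15):
    """Yield the index at which the windowed mimicking loss signals drift."""
    recent = deque(maxlen=window)
    for i, x in enumerate(stream):
        # Mimicking loss: 1 when student and teacher disagree on x.
        recent.append(int(teacher_fn(x) != student_fn(x)))
        if len(recent) == window and sum(recent) / window > baseline + margin:
            yield i
            recent.clear()


rng = random.Random(42)
train = [[rng.random(), rng.random()] for _ in range(500)]
student, baseline = fit_student(train, teacher)

# Unlabelled stream: feature 1 shifts from [0, 1] to [1, 2] at index 1000.
stream = [[rng.random(), rng.random()] for _ in range(1000)]
stream += [[rng.random(), 1.0 + rng.random()] for _ in range(1000)]
alarms = list(monitor(stream, teacher, student, baseline))
```

The design point is that the monitored quantity depends only on the two models' predictions, so it remains available when true labels arrive late or never.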

artificial intelligence, detection, neural network, (18 more...)

2103.00903

Country:

Europe (0.46)
North America (0.28)
Oceania > New Zealand > North Island > Waikato (0.14)

Genre: Research Report > Promising Solution (1.00)

Industry:

Education (0.69)
Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

River: machine learning for streaming data in Python

Montiel, Jacob, Halford, Max, Mastelini, Saulo Martiello, Bolmier, Geoffrey, Sourty, Raphael, Vaysse, Robin, Zouitine, Adil, Gomes, Heitor Murilo, Read, Jesse, Abdessalem, Talel, Bifet, Albert

arXiv.org Artificial Intelligence, Dec-8-2020

River is a machine learning library for dynamic data streams and continual learning. It provides multiple state-of-the-art learning methods, data generators/transformers, performance metrics and evaluators for different stream learning problems. It is the result of the merger of the two most popular packages for stream learning in Python: Creme and scikit-multiflow. River introduces a revamped architecture based on the lessons learnt from the seminal packages. River's ambition is to be the go-to library for doing machine learning on streaming data. Additionally, this open source package brings under the same umbrella a large community of practitioners and researchers. The source code is available at https://github.com/online-ml/river.

artificial intelligence, machine learning, river, (13 more...)

2012.04740

Country:

Europe > France (0.36)
Oceania > New Zealand > North Island > Waikato (0.19)

Genre: Research Report (0.53)

Industry: Education (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
