AITopics | Montiel, Jacob

Collaborating Authors

Montiel, Jacob

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Predictions of Tail-end Labels using Concatenated BioMed-Transformers for Long Medical Documents

Yogarajan, Vithya, Pfahringer, Bernhard, Smith, Tony, Montiel, Jacob

arXiv.org Artificial IntelligenceDec-3-2021

Multi-label learning predicts a subset of labels from a given label set for an unseen instance while considering label correlations. A known challenge with multi-label classification is the long-tailed distribution of labels. Many studies focus on improving the overall predictions of the model and thus do not prioritise tail-end labels. Improving the tail-end label predictions in multi-label classifications of medical text enables the potential to understand patients better and improve care. The knowledge gained by one or more infrequent labels can impact the cause of medical decisions and treatment plans. This research presents variations of concatenated domain-specific language models, including multi-BioMed-Transformers, to achieve two primary goals. First, to improve F1 scores of infrequent labels across multi-label problems, especially with long-tail labels; second, to handle long medical text and multi-sourced electronic health records (EHRs), a challenging task for standard transformers designed to work on short input sequences. A vital contribution of this research is new state-of-the-art (SOTA) results obtained using TransformerXL for predicting medical codes. A variety of experiments are performed on the Medical Information Mart for Intensive Care (MIMIC-III) database. Results show that concatenated BioMed-Transformers outperform standard transformers in terms of overall micro and macro F1 scores and individual F1 scores of tail-end labels, while incurring lower training times than existing transformer-based solutions for long input sequences.

information retrieval, machine learning, natural language, (25 more...)

arXiv.org Artificial Intelligence

2112.01718

Country:

Oceania > New Zealand > North Island (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.79)

Add feedback

Predicting COVID-19 Patient Shielding: A Comprehensive Study

Yogarajan, Vithya, Montiel, Jacob, Smith, Tony, Pfahringer, Bernhard

arXiv.org Artificial IntelligenceSep-30-2021

There are many ways machine learning and big data analytics are used in the fight against the COVID-19 pandemic, including predictions, risk management, diagnostics, and prevention. This study focuses on predicting COVID-19 patient shielding -- identifying and protecting patients who are clinically extremely vulnerable from coronavirus. This study focuses on techniques used for the multi-label classification of medical text. Using the information published by the United Kingdom NHS and the World Health Organisation, we present a novel approach to predicting COVID-19 patient shielding as a multi-label classification problem. We use publicly available, de-identified ICU medical text data for our experiments. The labels are derived from the published COVID-19 patient shielding data. We present an extensive comparison across 12 multi-label classifiers from the simple binary relevance to neural networks and the most recent transformers. To the best of our knowledge this is the first comprehensive study, where such a range of multi-label classifiers for medical text are considered. We highlight the benefits of various approaches, and argue that, for the task at hand, both predictive accuracy and processing time are essential.

artificial intelligence, health & medicine, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2110.00183

Country: Europe > United Kingdom (0.54)

Genre: Research Report > Experimental Study (0.49)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

River: machine learning for streaming data in Python

Montiel, Jacob, Halford, Max, Mastelini, Saulo Martiello, Bolmier, Geoffrey, Sourty, Raphael, Vaysse, Robin, Zouitine, Adil, Gomes, Heitor Murilo, Read, Jesse, Abdessalem, Talel, Bifet, Albert

arXiv.org Artificial IntelligenceDec-8-2020

River is a machine learning library for dynamic data streams and continual learning. It provides multiple state-of-the-art learning methods, data generators/transformers, performance metrics and evaluators for different stream learning problems. It is the result from the merger of the two most popular packages for stream learning in Python: Creme and scikit-multiflow. River introduces a revamped architecture based on the lessons learnt from the seminal packages. River's ambition is to be the go-to library for doing machine learning on streaming data. Additionally, this open source package brings under the same umbrella a large community of practitioners and researchers. The source code is available at https://github.com/online-ml/river.

artificial intelligence, machine learning, river, (13 more...)

arXiv.org Artificial Intelligence

2012.0474

Country:

Europe > France (0.36)
Oceania > New Zealand > North Island > Waikato (0.19)

Genre: Research Report (0.53)

Industry: Education (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Adaptive XGBoost for Evolving Data Streams

Montiel, Jacob, Mitchell, Rory, Frank, Eibe, Pfahringer, Bernhard, Abdessalem, Talel, Bifet, Albert

arXiv.org Machine LearningMay-15-2020

Boosting is an ensemble method that combines base models in a sequential manner to achieve high predictive accuracy. A popular learning algorithm based on this ensemble method is eXtreme Gradient Boosting (XGB). We present an adaptation of XGB for classification of evolving data streams. In this setting, new data arrives over time and the relationship between the class and the features may change in the process, thus exhibiting concept drift. The proposed method creates new members of the ensemble from mini-batches of data as new data becomes available. The maximum ensemble size is fixed, but learning does not stop when this size is reached because the ensemble is updated on new data to ensure consistency with the current concept. We also explore the use of concept drift detection to trigger a mechanism to update the ensemble. We test our method on real and synthetic data with concept drift and compare it against batch-incremental and instance-incremental classification methods for data streams.

artificial intelligence, ensemble, machine learning, (18 more...)

arXiv.org Machine Learning

2005.07353

Country:

Oceania > New Zealand > North Island > Waikato (0.14)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Scikit-Multiflow: A Multi-output Streaming Framework

Montiel, Jacob, Read, Jesse, Bifet, Albert, Abdessalem, Talel

arXiv.org Machine LearningJul-12-2018

Scikit-multiflow is a multi-output/multi-label and stream data mining framework for the Python programming language. Conceived to serve as a platform to encourage democratization of stream learning research, it provides multiple state of the art methods for stream learning, stream generators and evaluators. scikit-multiflow builds upon popular open source frameworks including scikit-learn, MOA and MEKA. Development follows the FOSS principles and quality is enforced by complying with PEP8 guidelines and using continuous integration and automatic testing. The source code is publicly available at https://github.com/scikit-multiflow/scikit-multiflow.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

1807.04662

Country: Europe > France (0.17)

Genre: Research Report (0.70)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Add feedback