AITopics | scikit-learn

What's a good imputation to predict with missing values?

Neural Information Processing Systems | Feb-8-2026, 23:13:49 GMT

artificial intelligence, machine learning, theorem 3

Country: Europe > France > Occitanie > Hérault > Montpellier (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Even Faster Hyperbolic Random Forests: A Beltrami-Klein Wrapper Approach

Chlenski, Philippe, Pe'er, Itsik

arXiv.org Artificial Intelligence | Jun-6-2025

Decision trees and models that use them as primitives are workhorses of machine learning in Euclidean spaces. Recent work has further extended these models to the Lorentz model of hyperbolic space by replacing axis-parallel hyperplanes with homogeneous hyperplanes when partitioning the input space. In this paper, we show how the hyperDT algorithm can be elegantly reexpressed in the Beltrami-Klein model of hyperbolic spaces. This preserves the thresholding operation used in Euclidean decision trees, enabling us to further rewrite hyperDT as simple pre- and post-processing steps that form a wrapper around existing tree-based models designed for Euclidean spaces. The wrapper approach unlocks many optimizations already available in Euclidean space models, improving flexibility, speed, and accuracy while offering a simpler, more maintainable, and extensible codebase. Our implementation is available at https://github.com/pchlenski/hyperdt.
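The wrapper idea in the abstract can be sketched roughly as follows. This is an illustrative assumption, not the authors' hyperdt code: points on the hyperboloid (Lorentz model, curvature -1 assumed) are mapped to Beltrami-Klein coordinates by dividing the spatial coordinates by the timelike one, and an unmodified scikit-learn tree is then fit on the result. The helper names `lorentz_to_klein` and `sample_hyperboloid` are hypothetical, and any post-processing of the learned thresholds is omitted.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def lorentz_to_klein(X):
    """Map hyperboloid points (x0, x1, ..., xd) to Klein coordinates xi / x0.

    Hypothetical helper: assumes column 0 is the timelike coordinate.
    """
    X = np.asarray(X, dtype=float)
    return X[:, 1:] / X[:, :1]


def sample_hyperboloid(n, dim, rng):
    """Draw points satisfying -x0^2 + x1^2 + ... + xd^2 = -1 (toy data only)."""
    spatial = rng.normal(size=(n, dim))
    timelike = np.sqrt(1.0 + np.sum(spatial**2, axis=1, keepdims=True))
    return np.hstack([timelike, spatial])


rng = np.random.default_rng(0)
X = sample_hyperboloid(200, 2, rng)
y = (X[:, 1] > 0).astype(int)  # toy labels: sign of the first spatial coordinate

# Pre-processing step: Lorentz -> Klein, then an ordinary Euclidean tree.
K = lorentz_to_klein(X)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(K, y)
```

Because a homogeneous hyperplane in Lorentz coordinates fixes the ratio of a spatial coordinate to the timelike one, it becomes an axis-parallel threshold after this division, which is why a stock Euclidean tree can be reused here.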

artificial intelligence, fast-hyperdt, machine learning

arXiv: 2506.0436

Genre: Research Report (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

AllMetrics: A Unified Python Library for Standardized Metric Evaluation and Robust Data Validation in Machine Learning

Alizadeh, Morteza, Oveisi, Mehrdad, Falahati, Sonya, Mousavi, Ghazal, Meybodi, Mohsen Alambardar, Mehrnia, Somayeh Sadat, Hacihaliloglu, Ilker, Rahmim, Arman, Salmanpour, Mohammad R.

arXiv.org Artificial Intelligence | May-23-2025

Machine learning (ML) models rely heavily on consistent and accurate performance metrics to evaluate and compare their effectiveness. However, existing libraries often suffer from fragmentation, inconsistent implementations, and insufficient data validation protocols, leading to unreliable results. Existing libraries have often been developed independently and without adherence to a unified standard, particularly concerning the specific tasks they aim to support. As a result, each library tends to adopt its own conventions for metric computation, input/output formatting, error handling, and data validation protocols. This lack of standardization leads to both implementation differences (ID) and reporting differences (RD), making it difficult to compare results across frameworks or ensure reliable evaluations. To address these issues, we introduce AllMetrics, an open-source unified Python library designed to standardize metric evaluation across diverse ML tasks, including regression, classification, clustering, segmentation, and image-to-image translation. The library implements class-specific reporting for multi-class tasks through configurable parameters to cover all use cases, while incorporating task-specific parameters to resolve metric computation discrepancies across implementations. Various datasets from domains such as healthcare, finance, and real estate were run through our library and compared with Python, MATLAB, and R components to identify which yield similar results. AllMetrics combines a modular Application Programming Interface (API) with robust input validation mechanisms to ensure reproducibility and reliability in model evaluation. This paper presents the design principles, architectural components, and empirical analyses demonstrating the library's ability to mitigate evaluation errors and to enhance the trustworthiness of ML workflows.
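To make the notion of a reporting difference concrete, here is a small sketch using scikit-learn's `f1_score` rather than AllMetrics itself (whose API is not shown in the abstract). The labels are made-up toy data. The same predictions yield different "F1" values depending on the averaging convention, so two reports that both say "F1" may not be comparable.

```python
from sklearn.metrics import f1_score

# Toy, imbalanced three-class labels (invented for illustration).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2]

# One set of predictions, three averaging conventions.
scores = {
    avg: f1_score(y_true, y_pred, average=avg)
    for avg in ("micro", "macro", "weighted")
}

# Per-class scores, the kind of class-specific reporting the abstract mentions.
per_class = f1_score(y_true, y_pred, average=None)

for name, value in scores.items():
    print(f"{name:>8}: {value:.3f}")
print("per class:", per_class.round(3))
```

Stating the averaging mode alongside the number is the simplest guard against this kind of mismatch.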

data quality, library, machine learning

arXiv: 2505.15931

Country:

North America > United States (0.46)
Asia > Middle East > Iran (0.29)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.15)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Enriching the Machine Learning Workloads in BigBench

Polag, Matthias, Ivanov, Todor, Eichhorn, Timo

arXiv.org Artificial Intelligence | Jun-16-2024

In the era of Big Data and the growing support for Machine Learning, Deep Learning and Artificial Intelligence algorithms in current software systems, there is an urgent need for standardized application benchmarks that stress-test and evaluate these new technologies. Relying on the standardized BigBench (TPCx-BB) benchmark, this work enriches the improved BigBench V2 with three new workloads and expands the coverage of machine learning algorithms. Our workloads utilize multiple algorithms and compare different implementations of the same algorithm across several popular libraries such as MLlib, SystemML, Scikit-learn and Pandas, demonstrating the relevance and usability of our benchmark extension.

algorithm, implementation, library

arXiv: 2406.10843

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Hesse (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.86)

Industry: Information Technology (0.68)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

BARMPy: Bayesian Additive Regression Models Python Package

Van Boxel, Danielle

arXiv.org Machine Learning | Apr-6-2024

We make Bayesian Additive Regression Networks (BARN) available as a Python package, barmpy, with documentation at https://dvbuntu.github.io/barmpy/ for general machine learning practitioners. Our object-oriented design is compatible with SciKit-Learn, allowing usage of its tools like cross-validation. To ease learning to use barmpy, we produce a companion tutorial that expands on reference information in the documentation. Any interested user can pip install barmpy from the official PyPI repository. barmpy also serves as a baseline Python library for generic Bayesian Additive Regression Models.
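What scikit-learn compatibility buys in practice can be sketched with a stand-in estimator. The class below is hypothetical and is not part of barmpy (whose own class names are not given in the abstract); it only shows that an object exposing `fit` and `predict` in the scikit-learn style can be passed straight to `cross_val_score`.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_val_score


class LeastSquaresRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in for any scikit-learn-compatible regressor."""

    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept

    def _design(self, X):
        # Optionally prepend a column of ones for the intercept term.
        X = np.asarray(X, dtype=float)
        if self.fit_intercept:
            return np.hstack([np.ones((X.shape[0], 1)), X])
        return X

    def fit(self, X, y):
        # Ordinary least squares via NumPy; learned state ends in an underscore.
        self.coef_, *_ = np.linalg.lstsq(
            self._design(X), np.asarray(y, dtype=float), rcond=None
        )
        return self

    def predict(self, X):
        return self._design(X) @ self.coef_


# Synthetic, nearly linear data (invented for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=120)

# Five-fold cross-validation; RegressorMixin supplies the default R^2 score.
scores = cross_val_score(LeastSquaresRegressor(), X, y, cv=5)
```

Any estimator following the same conventions, presumably including the one barmpy provides, could be substituted for `LeastSquaresRegressor()` in the last line.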

barmpy, library, neural network

arXiv: 2404.04738

Country:

North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > Wisconsin (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Active Learning: Learning with Limited Labeled Data in Python (Scikit-learn, Active Learning Lib) - Code Armada, LLC

#artificialintelligence | Apr-11-2023, 11:35:28 GMT

Active Learning is a machine learning approach that enables the selection of the most informative data points to be labeled by an oracle, thereby reducing the number of labeled data points required to train a model. Active Learning is useful in scenarios where labeled data is limited or expensive to acquire. Active Learning can help improve the accuracy of machine learning models with fewer labeled data points.

Learning with Limited Labeled Data in Python

Python is a popular language for machine learning, and several libraries support Active Learning. In this tutorial, we will use the Scikit-learn library to train a model and the Active Learning library to select informative data points to be labeled.

Import Libraries

We will start by importing the necessary libraries, including Scikit-learn for training the model, NumPy for numerical computations, and the Active Learning library for selecting informative data points to be labeled.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from modAL.uncertainty import uncertainty_sampling

Generate Data

Next, we will generate some random data for training and testing the model.

    # Generate random data for […]
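The excerpt stops at data generation. As a rough sketch of the query loop it describes, the example below hand-rolls least-confidence uncertainty sampling with NumPy and scikit-learn only. It is an assumption about the general technique, not the tutorial's own code: modAL's `uncertainty_sampling` is deliberately not called, the dataset sizes, seed and number of rounds are arbitrary, and the true pool labels stand in for the oracle.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: a pool to query from and a held-out test set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Seed set: five examples per class so the first fit sees both classes.
labeled = [int(i) for c in (0, 1) for i in np.flatnonzero(y_pool == c)[:5]]
seed = set(labeled)
unlabeled = [i for i in range(len(X_pool)) if i not in seed]

model = LogisticRegression(max_iter=1000)
for _ in range(20):
    model.fit(X_pool[labeled], y_pool[labeled])
    # Least-confidence score: 1 minus the highest predicted class probability.
    proba = model.predict_proba(X_pool[unlabeled])
    uncertainty = 1.0 - proba.max(axis=1)
    # "Ask the oracle" for the most uncertain point by moving it to the labeled set.
    pick = unlabeled.pop(int(np.argmax(uncertainty)))
    labeled.append(pick)

# Refit on everything labeled so far and evaluate on the held-out set.
model.fit(X_pool[labeled], y_pool[labeled])
accuracy = model.score(X_test, y_test)
print(f"labeled points: {len(labeled)}, test accuracy: {accuracy:.3f}")
```

In a real setting the oracle is a human annotator, so each pass through the loop carries a labeling cost; the point of the uncertainty score is to spend that cost where the model is least sure.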

active learning, artificial intelligence, machine learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.52)

Discover Top 4 AI Python Libraries with ChatGPT

#artificialintelligence | Mar-29-2023, 05:10:24 GMT

In this section, we'll take a closer look at the top four AI and machine learning libraries in Python that you can discover with ChatGPT's help. These libraries are Scikit-learn, TensorFlow, Keras, and PyTorch. We'll explore their features, use cases, and advantages, as well as provide links to their official documentation and prerequisites. By the end of this section, you'll have a good understanding of the strengths and weaknesses of each library and be able to choose the one that best suits your needs. Scikit-learn is an essential library for machine learning in Python, and it can be used in a wide range of applications, including artificial intelligence and deep learning.

ai python library, chatgpt, library

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Getting Started with AI: How to Use Python for Machine Learning

#artificialintelligence | Mar-28-2023, 18:45:13 GMT

Artificial Intelligence (AI) and Machine Learning (ML) are two rapidly growing fields in technology, and Python has become the go-to programming language for both. Python has a vast array of libraries and tools available for AI and ML development, making it an ideal language for beginners to get started with these fields. In this article, we will discuss the basics of using Python for machine learning and provide some code samples to help you get started. Machine learning is a subset of AI that involves training machines to learn from data and make predictions or decisions. It is a form of statistical analysis that involves the use of algorithms to find patterns in data and use those patterns to make predictions.

machine learning, python, scikit-learn

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.37)

Unlocking the Potential of Artificial Intelligence with Python

#artificialintelligence | Mar-24-2023, 08:35:35 GMT

Artificial Intelligence (AI) is a rapidly growing field that has the potential to revolutionize the world in the coming years. AI can be defined as the development of intelligent systems that can perform tasks that usually require human intelligence. These tasks include learning, reasoning, problem-solving, decision-making, and perception. Python is one of the most popular programming languages used for developing AI systems. Python is an interpreted, high-level programming language that has a simple syntax and easy-to-use libraries.

learning, neural network, python

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

The Top 10 Machine Learning Packages: Which One is Right for Your Project?

#artificialintelligence | Mar-9-2023, 22:38:31 GMT

Machine learning is an exciting and rapidly growing field that has become essential for data analysis and predictive modeling across various industries. There are numerous machine learning packages available that provide different algorithms and frameworks for building and deploying machine learning models. In this article, we'll rank the ten best machine learning packages based on their popularity, performance, ease of use, and community support, with TensorFlow taking the top spot and getting extra attention. TensorFlow is an open-source machine learning framework developed by Google that has quickly become one of the most popular and widely used machine learning packages. It provides a flexible and scalable platform for building and training machine learning models, supporting deep learning, reinforcement learning, and other advanced techniques.

application, speed and efficiency, tensorflow

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
