AITopics

2509.15419

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)

arXiv.org Artificial Intelligence, Apr-5-2024

Model Selection with Model Zoo via Graph Learning

Li, Ziyu, van der Wilk, Hilco, Zhan, Danning, Khosla, Megha, Bozzon, Alessandro, Hai, Rihan

Pre-trained deep learning (DL) models are increasingly accessible in public repositories, i.e., model zoos. Given a new prediction task, finding the best model to fine-tune can be computationally intensive and costly, especially when the number of pre-trained models is large. Selecting the right pre-trained models is crucial, yet complicated by the diversity of models from various model families (like ResNet, ViT, Swin) and the hidden relationships between models and datasets. Existing methods, which utilize basic information from models and datasets to compute scores indicating model performance on target datasets, overlook the intrinsic relationships, limiting their effectiveness in model selection. In this study, we introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem. TransferGraph constructs a graph using extensive metadata extracted from models and datasets, while capturing their inherent relationships. Through comprehensive experiments across 16 real datasets, covering both images and text, we demonstrate TransferGraph's effectiveness in capturing essential model-dataset relationships, yielding up to a 32% improvement in correlation between predicted performance and the actual fine-tuning results compared to the state-of-the-art methods.
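The evaluation metric named in the abstract, correlation between predicted performance and actual fine-tuning results, can be illustrated with a minimal sketch. It assumes Pearson correlation (the abstract does not say which coefficient is used), and the function name and all numbers below are hypothetical, not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Covariance and standard deviations, both unnormalized (the factors cancel).
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical data: one entry per candidate model in the zoo.
predicted_scores = [0.62, 0.71, 0.55, 0.80]    # scores from a selection method
fine_tuned_accuracy = [0.60, 0.75, 0.50, 0.78]  # accuracy after actual fine-tuning

score = pearson(predicted_scores, fine_tuned_accuracy)
print(f"correlation: {score:.3f}")
```

A value close to 1 means the selection method ranks models much as real fine-tuning would, so the top-scored model is likely a good one to fine-tune.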

artificial intelligence, deep learning, machine learning, (19 more...)

2404.03988

Country:

North America > United States > Arizona > Maricopa County > Scottsdale (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jan-18-2024

Keeping Deep Learning Models in Check: A History-Based Approach to Mitigate Overfitting

Li, Hao, Rajbahadur, Gopi Krishnan, Lin, Dayi, Bezemer, Cor-Paul, Jiang, Zhen Ming

In software engineering, deep learning models are increasingly deployed for critical tasks such as bug detection and code review. However, overfitting remains a challenge that affects the quality, reliability, and trustworthiness of software systems that utilize deep learning models. Overfitting can be (1) prevented (e.g., using dropout or early stopping) or (2) detected in a trained model (e.g., using correlation-based approaches). Both overfitting detection and prevention approaches that are currently used have constraints (e.g., requiring modification of the model structure or high computing resources). In this paper, we propose a simple, yet powerful approach that can both detect and prevent overfitting based on the training history (i.e., validation losses). Our approach first trains a time series classifier on training histories of overfit models. This classifier is then used to detect whether a trained model is overfit. In addition, our trained classifier can be used to prevent overfitting by identifying the optimal point to stop a model's training. We evaluate our approach on its ability to identify and prevent overfitting in real-world samples. We compare our approach against correlation-based detection approaches and the most commonly used prevention approach (i.e., early stopping). Our approach achieves an F1 score of 0.91, which is at least 5% higher than the current best-performing non-intrusive overfitting detection approach. Furthermore, our approach stops training earlier than early stopping in at least 32% of cases, and has the same or a better rate of returning the best model.
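For context, the early-stopping baseline that the abstract compares against can be sketched as a patience rule over a validation-loss history. This is an illustrative sketch of the common baseline only, not the paper's time series classifier; the function name, the `patience` and `min_delta` parameters, and the loss values are assumptions.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Patience-based early stopping over a validation-loss history.

    Returns (stop_epoch, best_epoch), both 0-indexed. Training stops at the
    first epoch where the loss has not improved by more than `min_delta`
    for `patience` consecutive epochs. If that never happens, stop_epoch is
    the last epoch. An empty history yields (-1, -1).
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember where it occurred.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improving at first, then rising.
history = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=3)
print(stop_epoch, best_epoch)
```

The gap between `best_epoch` and `stop_epoch` is the extra training that the patience rule spends before it gives up, which is the cost a history-based approach aims to reduce.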

dataset, epoch, training history, (14 more...)

2401.10359

Country:

North America > Canada > Alberta (0.14)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Fraboni, Yann, Van Waerebeke, Martin, Scaman, Kevin, Vidal, Richard, Kameni, Laetitia, Lorenzi, Marco

Sequential Informed Federated Unlearning: Efficient and Provable Client Unlearning in Federated Optimization

arXiv.org Artificial Intelligence, Aug-31-2023

The aim of Machine Unlearning (MU) is to provide theoretical guarantees on the removal of the contribution of a given data point from a training procedure. Federated Unlearning (FU) extends MU to unlearn a given client's contribution from a federated training routine. Current FU approaches are generally not scalable, and do not come with sound theoretical quantification of the effectiveness of unlearning. In this work we present Informed Federated Unlearning (IFU), a novel efficient and quantifiable FU approach. Upon an unlearning request from a given client, IFU identifies the optimal FL iteration from which FL has to be reinitialized, with unlearning guarantees obtained through a randomized perturbation mechanism. The theory of IFU is also extended to account for sequential unlearning requests. Experimental results on different tasks and datasets show that IFU leads to more efficient unlearning procedures as compared to basic re-training and state-of-the-art FU approaches.

efficient and provable client unlearning, sequential informed federated unlearning, unlearning, (12 more...)

2211.11656

Country:

North America > United States > California (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

#artificialintelligence, Sep-18-2021, 12:00:09 GMT

Policy Optimizations: TRPO/PPO

In this post, I will be talking about policy optimization methods from the papers Trust Region Policy Optimization (Schulman et al. 2015) and Proximal Policy Optimization Algorithms (Schulman et al. 2017). I will then briefly go over the Trust Region Policy Optimization method and two types of Proximal Policy Optimization methods: adaptive KL (Kullback-Leibler) penalties to the surrogate objective and clipped surrogate objective. In a traditional policy gradient method, we sample a trajectory of states, actions, and rewards, then update the policy using the sampled trajectories. While this method is great and solves basic control problems, the algorithm tends to be unstable and is inconsistent in solving an environment. A problem is that as we are updating the policy, the distribution of the inputs and outputs of the approximated policy distribution will change, resulting in instability.
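As a rough illustration of the clipped surrogate objective mentioned above, the sketch below follows the formulation in Schulman et al. 2017: the probability ratio between new and old policy is clipped to [1 - eps, 1 + eps], and the minimum of the clipped and unclipped terms is averaged. The function name, the plain-list inputs, and the default `eps=0.2` are assumptions made for the example, not code from the post.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    new and old policy. advantages: advantage estimates for those actions.
    """
    if not (len(new_logp) == len(old_logp) == len(advantages)) or not advantages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Taking the minimum removes the incentive to move the ratio
        # outside the clip range in the direction that inflates the objective.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical sample: the ratio is 1.5 with a positive advantage,
# so the contribution is capped at (1 + eps) * advantage.
value = clipped_surrogate([math.log(1.5)], [0.0], [1.0], eps=0.2)
print(value)
```

Because large policy changes stop being rewarded once the ratio leaves the clip range, each update stays close to the old policy, which addresses the instability described above.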

new policy, objective, surrogate objective, (15 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

#artificialintelligence, Dec-14-2020, 02:41:18 GMT

Siamese networks with Keras, TensorFlow, and Deep Learning - PyImageSearch

In this tutorial you will learn how to implement and train siamese networks using Keras, TensorFlow, and Deep Learning. Practical, real-world use cases of siamese networks include face recognition, signature verification, prescription pill identification, and more! Furthermore, siamese networks can be trained with astoundingly little data, making more advanced applications such as one-shot learning and few-shot learning possible. To learn how to implement and train siamese networks with Keras and TensorFlow, just keep reading. In the first part of this tutorial, we will discuss siamese networks, how they work, and why you may want to use them in your own deep learning applications. From there, you'll learn how to configure your development environment such that you can follow along with this tutorial and learn how to train your own siamese networks.

siamese network, siamese network architecture, sister network, (11 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence, Oct-14-2019, 14:19:46 GMT

Why is my validation loss lower than my training loss? - PyImageSearch

In this tutorial, you will learn the three primary reasons your validation loss may be lower than your training loss when training your own custom deep neural networks. I first became interested in studying machine learning and neural networks in late high school. Back then there weren't many accessible machine learning libraries -- and there certainly was no scikit-learn. Every school day at 2:35 PM I would leave high school, hop on the bus home, and within 15 minutes I would be in front of my laptop, studying machine learning, and attempting to implement various algorithms by hand. I rarely stopped for a break, more than occasionally skipping dinner just so I could keep working and studying late into the night.

neural network, training loss, validation loss, (14 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Industry: Education > Educational Setting (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

Ye, Xiaomeng (Indiana University Bloomington)

C2C Trace Retrieval: Fast Classification Using Class-to-Class Weighting

AAAI Conferences, May-15-2019

Traditional case-based classification methods are based on feature similarity. In contrast, class-to-class (C2C) weighting also considers whether the difference between two cases has been seen before. Combined with instance-specific weighting, C2C weighting learns the local patterns of both similarities and differences (shortened as patterns). Once C2C weighting has learned the pattern between case A of class C_1 and some set of cases R of class C_2, given a query Q whose difference from A matches the pattern between A and R, then we can skip cases around A and continue the search for near neighbors around R. Based on this, we developed an algorithm, C2C trace retrieval, which quickly traverses promising cases, retrieves relevant cases from different classes, and provides an informed hypothesis of the query's class. C2C trace retrieval achieves great efficiency at a reasonable cost of accuracy. Therefore, C2C trace retrieval can be used as a fast classification method or as the first pass for a more sophisticated method.

artificial intelligence, machine learning, weighting, (18 more...)

AAAI Conferences

The Thirty-Second International FLAIRS Conference

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Indiana (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

Wulff, N. H., Hertz, J. A.

Learning Cellular Automaton Dynamics with Neural Networks

Neural Information Processing Systems, Dec-31-1993

We have trained networks of Σ-Π units with short-range connections to simulate simple cellular automata that exhibit complex or chaotic behaviour. Three levels of learning are possible (in decreasing order of difficulty): learning the underlying automaton rule, learning asymptotic dynamical behaviour, and learning to extrapolate the training history. The levels of learning achieved with and without weight sharing for different automata provide new insight into their dynamics.

history, training history, weight sharing, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)