AITopics | Banff

Collaborating Authors

Banff

ALPBench: A Benchmark for Active Learning Pipelines on Tabular Data

Margraf, Valentin, Wever, Marcel, Gilhuber, Sandra, Tavares, Gabriel Marques, Seidl, Thomas, Hüllermeier, Eyke

arXiv.org Artificial IntelligenceJun-25-2024

In settings where only a budgeted amount of labeled data can be afforded, active learning seeks to devise query strategies for selecting the most informative data points to be labeled, aiming to enhance learning algorithms' efficiency and performance. Numerous such query strategies have been proposed and compared in the active learning literature. However, the community still lacks standardized benchmarks for comparing the performance of different query strategies. This particularly holds for the combination of query strategies with different learning algorithms into active learning pipelines and examining the impact of the learning algorithm choice. To close this gap, we propose ALPBench, which facilitates the specification, execution, and performance monitoring of active learning pipelines. It has built-in measures to ensure evaluations are done reproducibly, saving exact dataset splits and hyperparameter settings of used algorithms. In total, ALPBench consists of 86 real-world tabular classification datasets and 5 active learning settings, yielding 430 active learning problems. To demonstrate its usefulness and broad compatibility with various learning algorithms and query strategies, we conduct an exemplary study evaluating 9 query strategies paired with 8 learning algorithms in 2 different settings.

algorithm, dataset, learning, (13 more...)

arXiv.org Artificial Intelligence

2406.17322

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(22 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Instance-level quantitative saliency in multiple sclerosis lesion segmentation

Spagnolo, Federico, Molchanova, Nataliia, Schaer, Roger, Cuadra, Meritxell Bach, Pineda, Mario Ocampo, Melie-Garcia, Lester, Granziera, Cristina, Andrearczyk, Vincent, Depeursinge, Adrien

arXiv.org Artificial IntelligenceJun-25-2024

In recent years, explainable methods for artificial intelligence (XAI) have tried to reveal and describe models' decision mechanisms in the case of classification tasks. However, XAI for semantic segmentation and in particular for single instances has been little studied to date. Understanding the process underlying automatic segmentation of single instances is crucial to reveal what information was used to detect and segment a given object of interest. In this study, we proposed two instance-level explanation maps for semantic segmentation based on SmoothGrad and Grad-CAM++ methods. Then, we investigated their relevance for the detection and segmentation of white matter lesions (WML), a magnetic resonance imaging (MRI) biomarker in multiple sclerosis (MS). 687 patients diagnosed with MS for a total of 4043 FLAIR and MPRAGE MRI scans were collected at the University Hospital of Basel, Switzerland. Data were randomly split into training, validation and test sets to train a 3D U-Net for MS lesion segmentation. We observed 3050 true positive (TP), 1818 false positive (FP), and 789 false negative (FN) cases. We generated instance-level explanation maps for semantic segmentation, by developing two XAI methods based on SmoothGrad and Grad-CAM++. We investigated: 1) the distribution of gradients in saliency maps with respect to both input MRI sequences; 2) the model's response in the case of synthetic lesions; 3) the amount of perilesional tissue needed by the model to segment a lesion. Saliency maps (based on SmoothGrad) in FLAIR showed positive values inside a lesion and negative in its neighborhood. Peak values of saliency maps generated for these four groups of volumes presented distributions that differ significantly from one another, suggesting a quantitative nature of the proposed saliency. Contextual information of 7mm around the lesion border was required for their segmentation.

lesion, saliency map, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2406.09335

Country:

Europe > Switzerland > Basel-City > Basel (0.25)
Europe > Switzerland > Vaud > Lausanne (0.05)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

Add feedback

A New Perspective on Shampoo's Preconditioner

Morwani, Depen, Shapira, Itai, Vyas, Nikhil, Malach, Eran, Kakade, Sham, Janson, Lucas

arXiv.org Machine LearningJun-25-2024

Shampoo, a second-order optimization algorithm which uses a Kronecker product preconditioner, has recently garnered increasing attention from the machine learning community. The preconditioner used by Shampoo can be viewed either as an approximation of the Gauss--Newton component of the Hessian or the covariance matrix of the gradients maintained by Adagrad. We provide an explicit and novel connection between the $\textit{optimal}$ Kronecker product approximation of these matrices and the approximation made by Shampoo. Our connection highlights a subtle but common misconception about Shampoo's approximation. In particular, the $\textit{square}$ of the approximation used by the Shampoo optimizer is equivalent to a single step of the power iteration algorithm for computing the aforementioned optimal Kronecker product approximation. Across a variety of datasets and architectures we empirically demonstrate that this is close to the optimal Kronecker product approximation. Additionally, for the Hessian approximation viewpoint, we empirically study the impact of various practical tricks to make Shampoo more computationally efficient (such as using the batch gradient and the empirical Fisher) on the quality of Hessian approximation.

approximation, kronecker product approximation, shampoo, (14 more...)

arXiv.org Machine Learning

2406.17748

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diffusion Spectral Representation for Reinforcement Learning

Shribak, Dmitry, Gao, Chen-Xiao, Li, Yitong, Xiao, Chenjun, Dai, Bo

arXiv.org Artificial IntelligenceJun-23-2024

Diffusion-based models have achieved notable empirical successes in reinforcement learning (RL) due to their expressiveness in modeling complex distributions. Despite existing methods being promising, the key challenge of extending existing methods for broader real-world applications lies in the computational cost at inference time, i.e., sampling from a diffusion model is considerably slow as it often requires tens to hundreds of iterations to generate even one sample. To circumvent this issue, we propose to leverage the flexibility of diffusion models for RL from a representation learning perspective. In particular, by exploiting the connection between diffusion model and energy-based model, we develop Diffusion Spectral Representation (Diff-SR), a coherent algorithm framework that enables extracting sufficient representations for value functions in Markov decision processes (MDP) and partially observable Markov decision processes (POMDP). We further demonstrate how Diff-SR facilitates efficient policy optimization and practical algorithms while explicitly bypassing the difficulty and inference cost of sampling from the diffusion model. Finally, we provide comprehensive empirical studies to verify the benefits of Diff-SR in delivering robust and advantageous performance across various benchmarks with both fully and partially observable settings.

arxiv preprint arxiv, diffusion model, representation, (11 more...)

arXiv.org Artificial Intelligence

2406.16121

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Austria (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Impact of Decentralized Learning on Player Utilities in Stackelberg Games

Donahue, Kate, Immorlica, Nicole, Jagadeesan, Meena, Lucier, Brendan, Slivkins, Aleksandrs

arXiv.org Artificial IntelligenceJun-21-2024

When deployed in the world, a learning agent such as a recommender system or a chatbot often repeatedly interacts with another learning agent (such as a user) over time. In many such two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned. To better understand such cases, we examine the learning dynamics of the two-agent system and the implications for each agent's objective. We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks (such as Stackelberg equilibrium payoffs) result in worst-case linear regret for at least one player. To better capture these systems, we construct a relaxed regret benchmark that is tolerant to small learning errors by agents. We show that standard learning algorithms fail to provide sublinear regret, and we develop algorithms to achieve near-optimal $O(T^{2/3})$ regret for both players with respect to these benchmarks. We further design relaxed environments under which faster learning ($O(\sqrt{T})$) is possible. Altogether, our results take a step towards assessing how two-agent interactions in sequential and decentralized learning environments affect the utility of both agents.

benchmark, follower, logt, (14 more...)

arXiv.org Artificial Intelligence

2403.00188

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)
(17 more...)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Towards evolution of Deep Neural Networks through contrastive Self-Supervised learning

Vinhas, Adriano, Correia, João, Machado, Penousal

arXiv.org Artificial IntelligenceJun-20-2024

Deep Neural Networks (DNNs) have been successfully applied to a wide range of problems. However, two main limitations are commonly pointed out. The first one is that they require long time to design. The other is that they heavily rely on labelled data, which can sometimes be costly and hard to obtain. In order to address the first problem, neuroevolution has been proved to be a plausible option to automate the design of DNNs. As for the second problem, self-supervised learning has been used to leverage unlabelled data to learn representations. Our goal is to study how neuroevolution can help self-supervised learning to bridge the gap to supervised learning in terms of performance. In this work, we propose a framework that is able to evolve deep neural networks using self-supervised learning. Our results on the CIFAR-10 dataset show that it is possible to evolve adequate neural networks while reducing the reliance on labelled data. Moreover, an analysis to the structure of the evolved networks suggests that the amount of labelled data fed to them has less effect on the structure of networks that learned via self-supervised learning, when compared to individuals that relied on supervised learning.

labelled data, learning, representation, (14 more...)

arXiv.org Artificial Intelligence

2406.14525

Country:

Europe > Portugal > Coimbra > Coimbra (0.05)
North America > United States > Montana (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Maintenance Required: Updating and Extending Bootstrapped Human Activity Recognition Systems for Smart Homes

Hiremath, Shruthi K., Ploetz, Thomas

arXiv.org Artificial IntelligenceJun-20-2024

Developing human activity recognition (HAR) systems for smart homes is not straightforward due to varied layouts of the homes and their personalized settings, as well as idiosyncratic behaviors of residents. As such, off-the-shelf HAR systems are effective in limited capacity for an individual home, and HAR systems often need to be derived "from scratch", which comes with substantial efforts and often is burdensome to the resident. Previous work has successfully targeted the initial phase. At the end of this initial phase, we identify seed points. We build on bootstrapped HAR systems and introduce an effective updating and extension procedure for continuous improvement of HAR systems with the aim of keeping up with ever changing life circumstances. Our method makes use of the seed points identified at the end of the initial bootstrapping phase. A contrastive learning framework is trained using these seed points and labels obtained for the same. This model is then used to improve the segmentation accuracy of the identified prominent activities. Improvements in the activity recognition system through this procedure help model the majority of the routine activities in the smart home. We demonstrate the effectiveness of our procedure through experiments on the CASAS datasets that show the practical value of our approach.

procedure, resident, smart home, (16 more...)

arXiv.org Artificial Intelligence

2406.14446

Country:

North America > Aruba (0.05)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)
North America > United States (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Smart Houses & Appliances (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Internet of Things (0.95)
Information Technology > Data Science > Data Mining (0.94)
(3 more...)

Add feedback

Certified Robust Accuracy of Neural Networks Are Bounded due to Bayes Errors

Zhang, Ruihan, Sun, Jun

arXiv.org Machine LearningJun-20-2024

Adversarial examples pose a security threat to many critical systems built on neural networks. While certified training improves robustness, it also decreases accuracy noticeably. Despite various proposals for addressing this issue, the significant accuracy drop remains. More importantly, it is not clear whether there is a certain fundamental limit on achieving robustness whilst maintaining accuracy. In this work, we offer a novel perspective based on Bayes errors. By adopting Bayes error to robustness analysis, we investigate the limit of certified robust accuracy, taking into account data distribution uncertainties. We first show that the accuracy inevitably decreases in the pursuit of robustness due to changed Bayes error in the altered data distribution. Subsequently, we establish an upper bound for certified robust accuracy, considering the distribution of individual classes and their boundaries. Our theoretical results are empirically evaluated on real-world datasets and are shown to be consistent with the limited success of existing certified training results, e.g., for CIFAR10, our analysis results in an upper bound (of certified robust accuracy) of 67.49\%, meanwhile existing approaches are only able to increase it from 53.89\% in 2017 to 62.84\% in 2023.

bayes error, classifier, robustness, (13 more...)

arXiv.org Machine Learning

2405.11547

Country:

North America > United States > Maryland > Baltimore (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Demystifying Forgetting in Language Model Fine-Tuning with Statistical Analysis of Example Associations

Jin, Xisen, Ren, Xiang

arXiv.org Machine LearningJun-20-2024

Language models (LMs) are known to suffer from forgetting of previously learned examples when fine-tuned, breaking stability of deployed LM systems. Despite efforts on mitigating forgetting, few have investigated whether, and how forgotten upstream examples are associated with newly learned tasks. Insights on such associations enable efficient and targeted mitigation of forgetting. In this paper, we empirically analyze forgetting that occurs in $N$ upstream examples while the model learns $M$ new tasks and visualize their associations with a $M \times N$ matrix. We empirically demonstrate that the degree of forgetting can often be approximated by simple multiplicative contributions of the upstream examples and newly learned tasks. We also reveal more complicated patterns where specific subsets of examples are forgotten with statistics and visualization. Following our analysis, we predict forgetting that happens on upstream examples when learning a new task with matrix completion over the empirical associations, outperforming prior approaches that rely on trainable LMs. Project website: https://inklab.usc.edu/lm-forgetting-prediction/

language model, learning representation, upstream example, (14 more...)

arXiv.org Machine Learning

2406.14026

Country:

North America > United States > California (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Maryland > Baltimore (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models

Henderson, Paul, de Almeida, Melonie, Ivanova, Daniela, Anciukevičius, Titas

arXiv.org Artificial IntelligenceJun-18-2024

We present a latent diffusion model over 3D scenes, that can be trained using only 2D image data. To achieve this, we first design an autoencoder that maps multi-view images to 3D Gaussian splats, and simultaneously builds a compressed latent representation of these splats. Then, we train a multi-view diffusion model over the latent space to learn an efficient generative model. This pipeline does not require object masks nor depths, and is suitable for complex scenes with arbitrary camera positions. We conduct careful experiments on two large-scale datasets of complex real-world scenes - MVImgNet and RealEstate10K. We show that our approach enables generating 3D scenes in as little as 0.2 seconds, either from scratch, from a single input view, or from sparse input views. It produces diverse and high-quality results while running an order of magnitude faster than non-latent diffusion models and earlier NeRF-based generative models.

computer vision and pattern recognition, diffusion model, reconstruction, (10 more...)

arXiv.org Artificial Intelligence

2406.13099

Country:

Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Media > Television (0.34)
Media > Photography (0.34)
Media > Film (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback