A Experimental Setup

Neural Information Processing Systems

ASIN (id): ASIN stands for Amazon Standard Identification Number. Title: The Title attribute represents the name or title given to a product, book, or creative work. Size: Size indicates the dimensions or physical size of the product. Model: The Model attribute refers to a specific model or version of a product. Material: The Material attribute provides information about the product's primary material, such as metal or plastic. Color Text: Color Text describes the color or color variation of the product.


The Curse of Unrolling: Rate of Differentiating Through Optimization

Neural Information Processing Systems

Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates through the computational path. This work provides a non-asymptotic convergence-rate analysis of this approach on quadratic objectives for gradient descent and the Chebyshev method. We show that to ensure convergence of the Jacobian, we can either 1) choose a large learning rate leading to fast asymptotic convergence but accept that the algorithm may have an arbitrarily long burn-in phase, or 2) choose a smaller learning rate leading to immediate but slower convergence. We refer to this phenomenon as the curse of unrolling. Finally, we discuss open problems relative to this approach, such as deriving a practical update rule for the optimal unrolling strategy and making novel connections with the field of Sobolev orthogonal polynomials.
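The Jacobian recursion that unrolling induces can be illustrated on a one-dimensional quadratic. The following is a minimal sketch in my own notation, not the paper's code: gradient descent on f(x) = 0.5·a·x² − b·x has minimizer x* = b/a, so the true Jacobian dx*/db is 1/a, and differentiating each GD update by hand gives a linear recursion for J_k = dx_k/db.

```python
# Illustrative sketch (not the paper's code): unrolled differentiation of
# gradient descent on the 1-D quadratic f(x) = 0.5 * a * x**2 - b * x.
# The minimizer is x* = b / a, so the true Jacobian dx*/db equals 1 / a.

def unrolled_gd(a, b, eta, steps):
    """Run GD on f and propagate J_k = d x_k / d b through every update."""
    x, J = 0.0, 0.0  # x_0 does not depend on b, so J_0 = 0
    xs, Js = [x], [J]
    for _ in range(steps):
        # GD update: x_{k+1} = x_k - eta * grad f(x_k) = x_k - eta * (a * x_k - b)
        x = x - eta * (a * x - b)
        # Differentiating the update w.r.t. b yields the Jacobian recursion:
        # J_{k+1} = (1 - eta * a) * J_k + eta
        J = (1.0 - eta * a) * J + eta
        xs.append(x)
        Js.append(J)
    return xs, Js

a, b, eta = 2.0, 3.0, 0.4          # eta < 2 / a, so the iterates converge
xs, Js = unrolled_gd(a, b, eta, 50)
print(xs[-1])  # approaches x*    = b / a = 1.5
print(Js[-1])  # approaches dx*/db = 1 / a = 0.5
```

With a larger step size the factor (1 − ηa) grows in magnitude and the Jacobian iterates can transiently move away from 1/a before settling, which mirrors the burn-in phase the abstract describes.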


FairDeFace: Evaluating the Fairness and Adversarial Robustness of Face Obfuscation Methods

Khorzooghi, Seyyed Mohammad Sadegh Moosavi, Thota, Poojitha, Singhal, Mohit, Asudeh, Abolfazl, Das, Gautam, Nilizadeh, Shirin

arXiv.org Artificial Intelligence

The lack of a common platform and benchmark datasets for evaluating face obfuscation methods has been a challenge, with every method being tested using arbitrary experiments, datasets, and metrics. While prior work has demonstrated that face recognition systems exhibit bias against some demographic groups, there is a substantial gap in our understanding of the fairness of face obfuscation methods. Fair face obfuscation methods can ensure equitable protection across diverse demographic groups, especially since they can be used to preserve the privacy of vulnerable populations. To address these gaps, this paper introduces a comprehensive framework, named FairDeFace, designed to assess the adversarial robustness and fairness of face obfuscation methods. The framework comprises a set of modules encompassing benchmark datasets, face detection and recognition algorithms, adversarial models, utility detection models, and fairness metrics. FairDeFace serves as a versatile platform into which any face obfuscation method can be integrated, allowing for rigorous testing and comparison with other state-of-the-art methods. In its current implementation, FairDeFace incorporates six attacks and several privacy, utility, and fairness metrics. Using FairDeFace, and by conducting more than 500 experiments, we evaluated and compared the adversarial robustness of seven face obfuscation methods. This extensive analysis led to many interesting findings, both in terms of the degree of robustness of existing methods and their biases against some gender or racial groups. FairDeFace also visualizes the image regions that obfuscation and verification attacks focus on, showing not only which areas are changed most during obfuscation for some demographics, but also, through a comparison of the obfuscation and verification focus areas, why they failed.


Training a Generally Curious Agent

Tajwar, Fahim, Jiang, Yiding, Thankaraj, Abitha, Rahman, Sumaita Sadia, Kolter, J Zico, Schneider, Jeff, Salakhutdinov, Ruslan

arXiv.org Artificial Intelligence

Efficient exploration is essential for intelligent systems interacting with their environment, but existing language models often fall short in scenarios that require strategic information gathering. In this paper, we present PAPRIKA, a fine-tuning approach that enables language models to develop general decision-making capabilities that are not confined to particular environments. By training on synthetic interaction data from different tasks that require diverse strategies, PAPRIKA teaches models to explore and adapt their behavior on a new task based on environment feedback in context, without further gradient updates. Experimental results show that models fine-tuned with PAPRIKA can effectively transfer their learned decision-making capabilities to entirely unseen tasks without additional training. Unlike traditional training, our approach's primary bottleneck lies in sampling useful interaction data rather than in model updates. To improve sample efficiency, we propose a curriculum learning strategy that prioritizes sampling trajectories from tasks with high learning potential. These results suggest a promising path towards AI systems that can autonomously solve novel sequential decision-making problems requiring interaction with the external world.
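The paper's exact curriculum criterion is not reproduced here, but the general idea of prioritized task sampling can be sketched as a softmax over per-task learning-potential scores. All task names and score values below are hypothetical, chosen only to make the behavior visible.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_task(task_scores, rng):
    """Sample one task name with probability proportional to softmax(score)."""
    names = list(task_scores)
    probs = softmax([task_scores[n] for n in names])
    r, acc = rng.random(), 0.0
    for name, p in zip(names, probs):
        acc += p
        if r <= acc:
            return name
    return names[-1]

# Hypothetical learning-potential scores: tasks the model has mastered or
# cannot yet learn from score low; tasks at the learning frontier score high.
scores = {"easy_solved": -2.0, "frontier": 2.0, "too_hard": 0.0}
rng = random.Random(0)
counts = {k: 0 for k in scores}
for _ in range(10000):
    counts[sample_task(scores, rng)] += 1
# counts now heavily favors "frontier" over the other two tasks
```

A sampler like this concentrates the (expensive) trajectory collection on tasks where the model still has something to learn, which is the sample-efficiency motivation the abstract gives.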


Challenging the Performance-Interpretability Trade-off: An Evaluation of Interpretable Machine Learning Models

Kruschel, Sven, Hambauer, Nico, Weinzierl, Sven, Zilker, Sandra, Kraus, Mathias, Zschech, Patrick

arXiv.org Artificial Intelligence

Machine learning is permeating every conceivable domain to promote data-driven decision support. The focus is often on advanced black-box models due to their assumed performance advantages, whereas interpretable models are often associated with inferior predictive qualities. More recently, however, a new generation of generalized additive models (GAMs) has been proposed that offer promising properties for capturing complex, non-linear patterns while remaining fully interpretable. To uncover the merits and limitations of these models, this study examines the predictive performance of seven different GAMs in comparison to seven commonly used machine learning models based on a collection of twenty tabular benchmark datasets. To ensure a fair and robust model comparison, an extensive hyperparameter search combined with cross-validation was performed, resulting in 68,500 model runs. In addition, this study qualitatively examines the visual output of the models to assess their level of interpretability. Based on these results, the paper dispels the misconception that only black-box models can achieve high accuracy by demonstrating that there is no strict trade-off between predictive performance and model interpretability for tabular data. Furthermore, the paper discusses the importance of GAMs as powerful interpretable models for the field of information systems and derives implications for future work from a socio-technical perspective.
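As a rough illustration of the additive structure that makes GAMs interpretable, here is a minimal backfitting sketch with crude binned-mean smoothers standing in for the spline-based smoothers real GAM libraries use. All names and the toy dataset are hypothetical; the point is only that the fitted model decomposes into one inspectable function per feature.

```python
import random

def bin_smoother(x, r, n_bins=10):
    """Crude 1-D smoother: average the residuals r within equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, ri in zip(x, r):
        j = min(int((xi - lo) / width), n_bins - 1)
        sums[j] += ri
        counts[j] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    def f(xi):
        j = min(int((xi - lo) / width), n_bins - 1)
        return means[j]
    return f

def fit_gam(X, y, n_iters=20):
    """Backfitting: y ~ intercept + f_1(x_1) + ... + f_d(x_d)."""
    n, d = len(y), len(X[0])
    intercept = sum(y) / n
    fs = [lambda _: 0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            # Partial residual: remove everything except feature j's effect.
            partial = [y[i] - intercept
                       - sum(fs[k](X[i][k]) for k in range(d) if k != j)
                       for i in range(n)]
            fs[j] = bin_smoother([row[j] for row in X], partial)
    def predict(row):
        return intercept + sum(fs[j](row[j]) for j in range(d))
    return predict

# Toy additive data: a quadratic bump in x1 plus a linear trend in x2.
rng = random.Random(0)
X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(500)]
y = [4 * (x1 - 0.5) ** 2 + 2 * x2 + rng.gauss(0, 0.05) for x1, x2 in X]
predict = fit_gam(X, y)
mse = sum((predict(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)
```

Because each f_j is a plain function of one feature, its shape can be plotted and read directly, which is exactly the visual-output interpretability the study assesses.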


Exploratory Analysis and Augmentation of NSL-KDD Data Using Deep Generative Adversarial Networks to Improve the Performance of the Extreme Gradient Boosting Algorithm in Classifying Cyber Attack Types

Santoso, K. P., Madany, F. A., Suryotrisongko, H.

arXiv.org Artificial Intelligence

This study proposes the implementation of Deep Generative Adversarial Networks (GANs) for augmenting the NSL-KDD dataset. The primary objective is to enhance the efficacy of eXtreme Gradient Boosting (XGBoost) in the classification of cyber attacks on the NSL-KDD dataset. The proposed method achieved an accuracy of 99.53% using the XGBoost model without GAN-based data augmentation, and 99.78% with it.


Gradient Boosting Hyperparameter Tuning: Classifier Example

#artificialintelligence

Many machine learning algorithms end up producing only a weak model. You try other algorithms and still get weak results. What if there were a method to combine all of those weak models into a strong one? Boosting is exactly that: an ensemble method that aggregates many weak models into a single, stronger model. After reading this post you will understand how boosting achieves this and know how to tune the Gradient Boosting hyperparameters.
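To make the weak-to-strong idea concrete, here is a from-scratch sketch (not the scikit-learn API) of gradient boosting for regression, using one-split decision stumps as the weak learners. The function names and toy dataset are illustrative only: each round fits a stump to the residual errors of the current ensemble and adds a shrunken copy of it.

```python
# A from-scratch illustration of boosting: each decision stump alone is a
# weak model, but summing many of them (each fit to the remaining residuals,
# scaled by a learning rate) yields a strong model.

def fit_stump(x, residuals):
    """Find the single threshold split on x that best fits the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gradient_boost(x, y, n_rounds=50, learning_rate=0.3):
    """Sequentially fit stumps to residuals; return the ensemble predictor."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(learning_rate * s(xi) for s in stumps)

x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 1, 3, 3, 4, 4]          # a step-shaped target
model = gradient_boost(x, y)
mse = sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)
```

The two knobs in `gradient_boost` correspond directly to the hyperparameters you tune in real libraries: the number of rounds (`n_estimators`) and the shrinkage (`learning_rate`), which trade off against each other.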


Survival regression with accelerated failure time model in XGBoost

Barnwal, Avinash, Cho, Hyunsu, Hocking, Toby Dylan

arXiv.org Machine Learning

Survival regression is used to estimate the relation between time-to-event and feature variables, and is important in application domains such as medicine, marketing, risk management and sales management. Nonlinear tree based machine learning algorithms as implemented in libraries such as XGBoost, scikit-learn, LightGBM, and CatBoost are often more accurate in practice than linear models. However, existing implementations of tree-based models have offered limited support for survival regression. In this work, we propose and implement loss functions for learning accelerated failure time (AFT) models in XGBoost, to increase the support for survival modeling for different kinds of label censoring. The AFT model assumes effects that directly accelerate or decelerate the survival time for different kinds of censored data sets. We demonstrate with real and simulated experiments the effectiveness of AFT in XGBoost with respect to a number of baselines, in two respects: generalization performance and training speed. Furthermore, we take advantage of the support for NVIDIA GPUs in XGBoost to achieve substantial speedups over multi-core CPUs. To our knowledge, our work is the first implementation of AFT that utilizes the processing power of NVIDIA GPUs.
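The likelihood terms an AFT loss optimizes can be sketched in plain Python. This is an illustrative sketch only, assuming a normal error distribution with unit scale; the function names (`aft_nll`, `norm_pdf`, `norm_cdf`) are my own and the actual XGBoost implementation differs in its details. The AFT model posits ln(T) = pred + σ·Z with Z ~ N(0, 1), and censoring determines which likelihood term applies.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def aft_nll(pred, lower, upper, sigma=1.0):
    """Negative log-likelihood of one AFT observation with normal errors.

    lower/upper are the censoring bounds on the observed time:
      uncensored:      lower == upper == t   (density term)
      right-censored:  upper == inf          (survival term)
      interval:        0 < lower < upper < inf
    """
    if lower == upper:
        # Uncensored: density of t under ln(T) = pred + sigma * Z.
        z = (math.log(lower) - pred) / sigma
        return -math.log(norm_pdf(z) / (sigma * lower))
    # Censored: probability mass between the two bounds.
    z_lo = (math.log(lower) - pred) / sigma if lower > 0 else -math.inf
    z_hi = (math.log(upper) - pred) / sigma if upper < math.inf else math.inf
    cdf_hi = 1.0 if z_hi == math.inf else norm_cdf(z_hi)
    cdf_lo = 0.0 if z_lo == -math.inf else norm_cdf(z_lo)
    return -math.log(cdf_hi - cdf_lo)

# Uncensored at t = 5: the loss is smallest when pred is near ln(5).
# Right-censored at t = 5: raising pred (a longer predicted survival
# time) lowers the loss, since the event is known to occur after t = 5.
```

Treating every censoring pattern as an interval in this way is what lets one loss function cover uncensored, right-, left-, and interval-censored labels uniformly.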