AITopics | Williamson, Robert C.

Collaborating Authors

Williamson, Robert C.

Tailoring to the Tails: Risk Measures for Fine-Grained Tail Sensitivity

Fröhlich, Christian, Williamson, Robert C.

arXiv.org Artificial Intelligence, Jan-23-2023

Expected risk minimization (ERM) is at the core of many machine learning systems. This means that the risk inherent in a loss distribution is summarized using a single number - its average. In this paper, we propose a general approach to construct risk measures which exhibit a desired tail sensitivity and may replace the expectation operator in ERM. Our method relies on the specification of a reference distribution with a desired tail behaviour, which is in a one-to-one correspondence to a coherent upper probability. Any risk measure, which is compatible with this upper probability, displays a tail sensitivity which is finely tuned to the reference distribution. As a concrete example, we focus on divergence risk measures based on f-divergence ambiguity sets, which are a widespread tool used to foster distributional robustness of machine learning systems. For instance, we show how ambiguity sets based on the Kullback-Leibler divergence are intimately tied to the class of subexponential random variables. We elaborate the connection between divergence risk measures and rearrangement invariant Banach norms.
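As a hedged illustration of the kind of object the abstract refers to (the notation here is an assumption and is not taken from the paper): for a loss X under a reference distribution P and a radius ε > 0, the worst-case expected loss over a Kullback-Leibler ambiguity set admits the standard dual form

```latex
\sup_{Q \ll P,\ \mathrm{KL}(Q \,\|\, P) \le \varepsilon} \mathbb{E}_Q[X]
  \;=\; \inf_{t > 0} \Big\{ t\,\varepsilon + t \log \mathbb{E}_P\big[ e^{X/t} \big] \Big\}
```

The right-hand side is finite only if the exponential moment of X is finite for some t > 0, which gives one informal way to see why the tail behaviour of the loss matters for KL-based ambiguity sets.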

artificial intelligence, machine learning, risk measure, (18 more...)

arXiv.org Artificial Intelligence

2208.03066

Country: Europe > Germany (0.28)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Fairness and Randomness in Machine Learning: Statistical Independence and Relativization

Derr, Rabanus, Williamson, Robert C.

arXiv.org Artificial Intelligence, Nov-16-2022

Fair Machine Learning endeavors to prevent unfairness arising in the context of machine learning applications embedded in society. Despite the variety of definitions of fairness and proposed "fair algorithms", there remain unresolved conceptual problems regarding fairness. In this paper, we dissect the role of statistical independence in fairness and randomness notions regularly used in machine learning. Thereby, we are led to a surprising hypothesis: randomness and fairness can be considered equivalent concepts in machine learning. In particular, we obtain a relativized notion of randomness expressed as statistical independence by appealing to Von Mises' century-old foundations for probability. This notion turns out to be "orthogonal" in an abstract sense to the commonly used i.i.d.-randomness. Using standard fairness notions in machine learning, which are defined via statistical independence, we then link the ex ante randomness assumptions about the data to the ex post requirements for fair predictions. This connection proves fruitful: we use it to argue that randomness and fairness are essentially relative and that both concepts should reflect their nature as modeling assumptions in machine learning.

artificial intelligence, independence, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2207.13596

Country:

North America > United States (0.67)
Europe > Germany (0.46)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Government (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

PAC-Bayesian Bound for the Conditional Value at Risk

Mhammedi, Zakaria, Guedj, Benjamin, Williamson, Robert C.

arXiv.org Machine Learning, Jun-25-2020

Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation. Widely used in mathematical finance, it is garnering increasing interest in machine learning, e.g., as an alternate approach to regularization, and as a means for ensuring fairness. This paper presents a generalization bound for learning algorithms that minimize the CVaR of the empirical loss. The bound is of PAC-Bayesian type and is guaranteed to be small when the empirical CVaR is small. We achieve this by reducing the problem of estimating CVaR to that of merely estimating an expectation. This then enables us, as a by-product, to obtain concentration inequalities for CVaR even when the random variable in question is unbounded.
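To make the quantity concrete, here is a minimal sketch of an empirical CVaR estimate. It assumes the Rockafellar-Uryasev variational form, CVaR_alpha[Z] = min over mu of { mu + E[(Z - mu)_+] / alpha }, with alpha in (0, 1] read as the tail fraction; the function name `empirical_cvar` and this convention are illustrative choices, not the paper's code.

```python
import numpy as np

def empirical_cvar(losses, alpha):
    """Empirical CVaR at tail fraction alpha via the Rockafellar-Uryasev form.

    Illustrative sketch: alpha = 1 recovers the sample mean, and small alpha
    approaches the sample maximum.
    """
    z = np.asarray(losses, dtype=float)
    if z.ndim != 1 or z.size == 0:
        raise ValueError("losses must be a non-empty 1-D sequence")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # For an empirical distribution the minimising mu can be taken to be
    # one of the sample points, so it suffices to scan those candidates.
    excess = np.maximum(z[None, :] - z[:, None], 0.0)  # excess[i, j] = (z_j - z_i)_+
    objective = z + excess.mean(axis=1) / alpha
    return float(objective.min())

cvar_half = empirical_cvar([1.0, 2.0, 3.0, 4.0], alpha=0.5)  # mean of the worst half
cvar_full = empirical_cvar([1.0, 2.0, 3.0, 4.0], alpha=1.0)  # ordinary mean
```

The quadratic scan over candidates keeps the sketch short; a sort-based version would be preferable for large samples.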

artificial intelligence, inequality, machine learning, (16 more...)

arXiv.org Machine Learning

2006.14763

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

f-GANs in an Information Geometric Nutshell

Nock, Richard, Cranko, Zac, Menon, Aditya K., Qu, Lizhen, Williamson, Robert C.

Neural Information Processing Systems, Feb-14-2020, 05:43:08 GMT

The approach is elegant but falls short of a full description of the supervised game, and says little about the key player, the generator: for example, what does the generator actually converge to if solving the GAN game means convergence in some space of parameters? How does that provide hints on the generator's design and compare to the flourishing but almost exclusively experimental literature on the subject? In this paper, we unveil a broad class of distributions for which such convergence happens, namely deformed exponential families, a wide superset of exponential families. We show that current deep architectures are able to factorize a very large number of such densities using an especially compact design, hence displaying the power of deep architectures and their concinnity in the f-GAN game. This result holds given a sufficient condition on activation functions, which turns out to be satisfied by popular choices.

artificial intelligence, information geometric nutshell, neural network, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Proper-Composite Loss Functions in Arbitrary Dimensions

Cranko, Zac, Williamson, Robert C., Nock, Richard

arXiv.org Machine Learning, Feb-18-2019

The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used. One avenue of inquiry has been to look at these loss functions in terms of their properties as scoring rules via the proper-composite representation, in which predictions are mapped to probability distributions which are then scored via a scoring rule. However, research so far has primarily been concerned with analysing the (typically) finite-dimensional conditional risk problem on the output space, leaving aside the larger total risk minimisation. We generalise a number of these results to an infinite-dimensional setting and in doing so we are able to exploit the familial resemblance of density and conditional density estimation to provide a simple characterisation of the canonical link.

artificial intelligence, loss function, machine learning, (18 more...)

arXiv.org Machine Learning

1902.06881

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.59)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Adversarial Networks and Autoencoders: The Primal-Dual Relationship and Generalization Bounds

Husain, Hisham, Nock, Richard, Williamson, Robert C.

arXiv.org Machine Learning, Feb-3-2019

Since the introduction of Generative Adversarial Networks (GANs) and Variational Autoencoders (VAE), the literature on generative modelling has witnessed an overwhelming resurgence. The impressive, yet elusive empirical performance of GANs has led to the rise of many GAN-VAE hybrids, in the hope of GAN-level performance and additional benefits of VAE, such as an encoder for feature reduction, which is not offered by GANs. Recently, the Wasserstein Autoencoder (WAE) was proposed, achieving performance similar to that of GANs, yet it is still unclear whether the two are fundamentally different or can be further improved into a unified model. In this work, we study the f-GAN and WAE models and make two main discoveries. First, we find that the f-GAN objective is equivalent to an autoencoder-like objective, which has close links, and is in some cases equivalent, to the WAE objective; we refer to this as the f-WAE. This equivalence allows us to explicate the success of WAE. Second, the equivalence result allows us to, for the first time, prove generalization bounds for Autoencoder models (WAE and f-WAE), which is a pertinent problem when it comes to theoretical analyses of generative models. Furthermore, we show that the f-WAE objective is related to other statistical quantities such as the f-divergence and, in particular, is upper bounded by the Wasserstein distance, which then allows us to tap into existing efficient (regularized) OT solvers to minimize f-WAE. Our findings thus recommend the f-WAE as a tighter alternative to WAE, comment on generalization abilities and make a step towards unifying these models.

artificial intelligence, autoencoder, neural network, (17 more...)

arXiv.org Machine Learning

1902.00985

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fairness risk measures

Williamson, Robert C., Menon, Aditya Krishna

arXiv.org Machine Learning, Jan-24-2019

Ensuring that classifiers are non-discriminatory or fair with respect to a sensitive feature (e.g., race or gender) is a topical problem. Progress in this task requires fixing a definition of fairness, and there have been several proposals in this regard over the past few years. Several of these, however, assume either binary sensitive features (thus precluding categorical or real-valued sensitive groups), or result in non-convex objectives (thus adversely affecting the optimisation landscape). In this paper, we propose a new definition of fairness that generalises some existing proposals, while allowing for generic sensitive features and resulting in a convex objective. The key idea is to enforce that the expected losses (or risks) across each subgroup induced by the sensitive feature are commensurate. We show how this relates to the rich literature on risk measures from mathematical finance. As a special case, this leads to a new convex fairness-aware objective based on minimising the conditional value at risk (CVaR).
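The key idea of comparing subgroup risks can be sketched as follows. This is one possible reading of the abstract rather than the paper's own formulation: the helper names `subgroup_risks` and `cvar_of_subgroup_risks` are hypothetical, and the CVaR is computed with the Rockafellar-Uryasev form over the discrete distribution of group risks weighted by group frequency.

```python
import numpy as np

def subgroup_risks(losses, groups):
    """Average loss within each subgroup induced by a sensitive feature.

    Returns (labels, risks, weights), where weights are group frequencies.
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    labels, inverse, counts = np.unique(
        groups, return_inverse=True, return_counts=True
    )
    sums = np.bincount(inverse.ravel(), weights=losses, minlength=labels.size)
    return labels, sums / counts, counts / counts.sum()

def cvar_of_subgroup_risks(risks, weights, alpha):
    """CVaR at tail fraction alpha of the subgroup risks (illustrative).

    alpha = 1 gives the overall average risk; smaller alpha puts more
    emphasis on the worst-off subgroups.
    """
    risks = np.asarray(risks, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Weighted Rockafellar-Uryasev objective, scanned over candidate thresholds.
    excess = np.maximum(risks[None, :] - risks[:, None], 0.0)
    objective = risks + (excess * weights[None, :]).sum(axis=1) / alpha
    return float(objective.min())

labels, risks, weights = subgroup_risks(
    [0.1, 0.3, 0.8, 1.0], ["a", "a", "b", "b"]
)
worst_half = cvar_of_subgroup_risks(risks, weights, alpha=0.5)
overall = cvar_of_subgroup_risks(risks, weights, alpha=1.0)
```

Because the categorical labels are handled by `np.unique`, the same sketch works for sensitive features with more than two values, which is the kind of generality the abstract emphasises.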

artificial intelligence, fairness, machine learning, (17 more...)

arXiv.org Machine Learning

1901.08665

Country:

Oceania > Australia (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Constant Regret, Generalized Mixability, and Mirror Descent

Mhammedi, Zakaria, Williamson, Robert C.

Neural Information Processing Systems, Dec-31-2018

We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and for the right choice of loss function and "mixing" algorithm, it is possible for the learner to achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret can be achieved for mixable losses using the aggregating algorithm. The Generalized Aggregating Algorithm (GAA) is a name for a family of algorithms parameterized by convex functions on simplices (entropies), which reduce to the aggregating algorithm when using the Shannon entropy S. For a given entropy Φ, losses for which a constant regret is possible using the GAA are called Φ-mixable. Which losses are Φ-mixable was previously left as an open question. We fully characterize Φ-mixability and answer other open questions posed by Reid et al. (2015). We show that the Shannon entropy S is fundamental in nature when it comes to mixability; any Φ-mixable loss is necessarily S-mixable, and the lowest worst-case regret of the GAA is achieved using the Shannon entropy. Finally, by leveraging the connection between the mirror descent algorithm and the update step of the GAA, we suggest a new adaptive generalized aggregating algorithm and analyze its performance in terms of the regret bound.
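The constant-regret phenomenon can be illustrated with a minimal sketch of the classical aggregating algorithm in its simplest case: binary outcomes under log loss with learning rate 1, where the learner's prediction is the weighted mixture of the experts' probabilities. This is not the GAA from the paper; the function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def aggregating_algorithm_log_loss(expert_probs, outcomes):
    """Run the aggregating algorithm for log loss (learning rate 1).

    expert_probs: array of shape (T, N), probability each expert assigns
                  to outcome 1 in each round (strictly between 0 and 1).
    outcomes:     array of shape (T,), entries in {0, 1}.
    Returns (learner_loss, expert_losses) as cumulative log losses.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=int)
    n_rounds, n_experts = expert_probs.shape
    log_w = np.full(n_experts, -np.log(n_experts))  # uniform prior over experts
    learner_loss = 0.0
    expert_losses = np.zeros(n_experts)
    for t in range(n_rounds):
        # Normalise the weights in a numerically stable way.
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # For log loss the aggregated prediction is the mixture probability.
        p = float(w @ expert_probs[t])
        y = outcomes[t]
        learner_loss += -np.log(p if y == 1 else 1.0 - p)
        round_losses = -np.log(np.where(y == 1, expert_probs[t], 1.0 - expert_probs[t]))
        expert_losses += round_losses
        # Exponential weights update: w_i <- w_i * exp(-loss_i).
        log_w -= round_losses
    return learner_loss, expert_losses

rng = np.random.default_rng(0)
n_rounds, n_experts = 500, 8
probs = rng.uniform(0.05, 0.95, size=(n_rounds, n_experts))
ys = rng.integers(0, 2, size=n_rounds)
learner_loss, expert_losses = aggregating_algorithm_log_loss(probs, ys)
regret = learner_loss - expert_losses.min()
```

In this special case the regret against the best expert is at most ln(N) for every number of rounds, which is the sense in which the regret is constant.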

artificial intelligence, machine learning, prediction, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Minimax Lower Bounds for Cost Sensitive Classification

Kamalaruban, Parameswaran, Williamson, Robert C.

arXiv.org Machine Learning, May-20-2018

The cost-sensitive classification problem plays a crucial role in mission-critical machine learning applications, and differs from traditional classification by taking the misclassification costs into consideration. Although it has been studied extensively in the literature, the fundamental limits of this problem are still not well understood. We investigate the hardness of this problem by extending the standard minimax lower bound for the balanced binary classification problem (due to Massart and Nédélec, 2006), and emphasize the impact of cost terms on the hardness.
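For readers unfamiliar with how misclassification costs change the problem, the following sketch shows the standard decision-theoretic rule for binary cost-sensitive classification. It is background only and does not come from the paper: the function name and the parameters `cost_fp` and `cost_fn` are illustrative assumptions.

```python
def cost_sensitive_decision(eta, cost_fp, cost_fn):
    """Predict 1 when doing so has lower expected cost than predicting 0.

    eta:     estimated probability that the true label is 1.
    cost_fp: cost of predicting 1 when the label is 0.
    cost_fn: cost of predicting 0 when the label is 1.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    # Expected cost of predicting 1 is (1 - eta) * cost_fp and of predicting 0
    # is eta * cost_fn, so predicting 1 is optimal when eta >= threshold.
    threshold = cost_fp / (cost_fp + cost_fn)
    return 1 if eta >= threshold else 0

balanced = cost_sensitive_decision(0.3, cost_fp=1.0, cost_fn=1.0)  # threshold 0.5
skewed = cost_sensitive_decision(0.3, cost_fp=1.0, cost_fn=4.0)    # threshold 0.2
```

With equal costs the threshold is 0.5 and the rule reduces to ordinary balanced classification; unequal costs shift the threshold away from 0.5.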

artificial intelligence, bayesian inference, classification problem, (18 more...)

arXiv.org Machine Learning

1805.07723

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)
