AITopics

2306.03387

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Panda, Atasi, Louis, Anand, Nimbhorkar, Prajakta

Individual fairness under Varied Notions of Group Fairness in Bipartite Matching -- One Framework to Approximate Them Al

arXiv.org Artificial Intelligence Jun-6-2023

We consider the problem of assigning items to platforms while satisfying group and individual fairness constraints. Each item is associated with certain groups and has a preference ordering over platforms. Each platform enforces group fairness by specifying an upper and a lower bound on the number of items that can be matched to it from each group. Although there may be multiple optimal solutions that satisfy the group fairness constraints, we aim to achieve 'probabilistic individual fairness' by computing a distribution over 'group fair' matchings such that each item has a reasonable probability of being matched to one of its top choices. When each item can belong to multiple groups, the problem of finding a maximum size group-fair matching is NP-hard even when all the group lower bounds are 0 and there are no individual fairness constraints. Given a total of $n$ items, we give an $O(\Delta \log n)$-approximation algorithm when an item can belong to at most $\Delta$ groups and all the group lower bounds are 0. We also provide two approximation algorithms in terms of the total number of groups that have items in the neighborhood of a platform. When each item belongs to a single group, we provide a polynomial-time algorithm that computes a probabilistic individually fair distribution over group fair matchings. We further extend our model and algorithms to address the following notions of fairness: 'maxmin group fairness', which maximizes the representation of the worst-off groups, and 'mindom group fairness', which minimizes the representation of the most dominant groups.
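As a rough illustration of the group fairness constraint described above (an upper and a lower bound per platform and group), the following sketch checks whether a given matching respects such bounds. The function name, the data layout, and the example instance are assumptions made for illustration; this is not the paper's algorithm, which also handles individual fairness and approximation guarantees.

```python
from collections import Counter

def is_group_fair(matching, groups, lower, upper):
    """Check per-platform group bounds for a matching.

    matching: dict item -> platform (hypothetical layout)
    groups:   dict item -> set of group names
    lower, upper: dict (platform, group) -> integer bound
    """
    counts = Counter()
    for item, platform in matching.items():
        for g in groups[item]:
            counts[(platform, g)] += 1
    # Every bounded (platform, group) pair must fall inside its interval.
    bounded = set(lower) | set(upper)
    return all(
        lower.get(key, 0) <= counts[key] <= upper.get(key, float("inf"))
        for key in bounded
    )

# Toy instance: item "b" belongs to two groups.
groups = {"a": {"g1"}, "b": {"g1", "g2"}, "c": {"g2"}}
upper = {("p1", "g1"): 1, ("p1", "g2"): 2}
lower = {("p1", "g2"): 1}

fair = {"a": "p1", "c": "p1"}      # one g1 item and one g2 item on p1
unfair = {"a": "p1", "b": "p1"}    # two g1 items on p1 exceed the upper bound
```

Here `is_group_fair(fair, groups, lower, upper)` holds, while the second matching violates the upper bound on group `g1`.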

artificial intelligence, constraint, machine learning

2208.09951

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

arXiv.org Artificial Intelligence Jun-6-2023

A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning

Sun, Zhenyu, Wei, Ermin

In this paper, we study a large-scale multi-agent minimax optimization problem, which models many interesting applications in statistical learning and game theory, including Generative Adversarial Networks (GANs). The overall objective is a sum of agents' private local objective functions. We first analyze an important special case, the empirical minimax problem, where the overall objective approximates a true population minimax risk by statistical samples. We provide generalization bounds for learning with this objective through Rademacher complexity analysis. Then, we focus on the federated setting, where agents can perform local computation and communicate with a central server. Most existing federated minimax algorithms either require communication per iteration or lack performance guarantees, with the exception of Local Stochastic Gradient Descent Ascent (SGDA), a multiple-local-update descent ascent algorithm which guarantees convergence under a diminishing stepsize. By analyzing Local SGDA under the ideal condition of no gradient noise, we show that generally it cannot guarantee exact convergence with constant stepsizes and thus suffers from slow rates of convergence. To tackle this issue, we propose FedGDA-GT, an improved Federated (Fed) Gradient Descent Ascent (GDA) method based on Gradient Tracking (GT). When local objectives are Lipschitz smooth and strongly-convex-strongly-concave, we prove that FedGDA-GT converges linearly with a constant stepsize to a global $\epsilon$-approximate solution with $\mathcal{O}(\log (1/\epsilon))$ rounds of communication, which matches the time complexity of the centralized GDA method. Finally, we numerically show that FedGDA-GT outperforms Local SGDA.
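To make the centralized GDA baseline mentioned above concrete, here is a minimal sketch of simultaneous gradient descent ascent with a constant stepsize. The toy objective $f(x, y) = \frac{1}{2}x^2 + xy - \frac{1}{2}y^2$, the stepsize, and the function name are assumptions chosen for illustration; this is plain single-machine GDA, not FedGDA-GT or Local SGDA.

```python
import math

def gda(x, y, eta=0.1, steps=200):
    """Simultaneous GDA on f(x, y) = 0.5*x**2 + x*y - 0.5*y**2.

    f is strongly convex in x and strongly concave in y,
    with its unique saddle point at (0, 0).
    """
    history = []
    for _ in range(steps):
        gx = x + y  # partial derivative of f with respect to x
        gy = x - y  # partial derivative of f with respect to y
        x, y = x - eta * gx, y + eta * gy  # descend in x, ascend in y
        history.append(math.hypot(x, y))   # distance to the saddle point
    return x, y, history

x_final, y_final, dist = gda(1.0, 1.0)
```

On this quadratic the update is a scaled rotation, so the distance to the saddle point shrinks by the constant factor $\sqrt{(1-\eta)^2 + \eta^2}$ at every step, which is what linear convergence with a constant stepsize means.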

artificial intelligence, local sgda, machine learning

2206.01132

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)

Kocák, Tomáš, Carpentier, Alexandra

Online Learning with Feedback Graphs: The True Shape of Regret

Sequential learning with feedback graphs is a natural extension of the multi-armed bandit problem where the problem is equipped with an underlying graph structure that provides additional information: playing an action reveals the losses of all the neighbors of the action. This problem was introduced by Mannor and Shamir (2011) and has received considerable attention in recent years. It is generally stated in the literature that the minimax regret rate for this problem is of order $\sqrt{\alpha T}$, where $\alpha$ is the independence number of the graph, and $T$ is the time horizon. However, this is proven only when the number of rounds $T$ is larger than $\alpha^3$, which poses a significant restriction for the usability of this result in large graphs. In this paper, we define a new quantity $R^*$, called the problem complexity, and prove that the minimax regret is proportional to $R^*$ for any graph and time horizon $T$. Introducing an intricate exploration strategy, we define an algorithm that achieves the minimax optimal regret bound and becomes the first provably optimal algorithm for this setting, even if $T$ is smaller than $\alpha^3$.
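The feedback model and the independence number $\alpha$ can be sketched on a small graph as follows. The 5-cycle, the inclusion of each action in its own neighborhood, and the helper names are assumptions for illustration; the brute-force search is exponential and only meant for tiny graphs, and none of this reproduces the paper's algorithm.

```python
from itertools import combinations

def observed_losses(graph, losses, action):
    """Losses revealed when `action` is played: those of its neighbors."""
    return {v: losses[v] for v in graph[action]}

def independence_number(graph):
    """Brute-force size of a largest independent set (tiny graphs only)."""
    nodes = list(graph)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            # Independent: no two distinct chosen nodes are adjacent.
            if all(v not in graph[u] for u, v in combinations(subset, 2)):
                return size
    return 0

# Undirected 5-cycle; each action also observes its own loss (assumption).
cycle = {i: {i, (i - 1) % 5, (i + 1) % 5} for i in range(5)}
losses = {0: 0.3, 1: 0.9, 2: 0.1, 3: 0.5, 4: 0.7}

revealed = observed_losses(cycle, losses, 0)
alpha = independence_number(cycle)
threshold = alpha ** 3  # the classical rate is proven only for T above this
```

For the 5-cycle, playing action 0 reveals the losses of actions 0, 1, and 4, and the independence number is 2, so the classical $\sqrt{\alpha T}$ rate is only proven for $T > 8$.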

artificial intelligence, data mining, machine learning

2306.02971

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Brandenburg > Potsdam (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.42)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.76)

COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search

Ibrahim, Shibal, Chen, Wenyu, Hazimeh, Hussein, Ponomareva, Natalia, Zhao, Zhe, Mazumder, Rahul

The sparse Mixture-of-Experts (Sparse-MoE) framework efficiently scales up model capacity in various domains, such as natural language processing and vision. Sparse-MoEs select a subset of the "experts" (thus, only a portion of the overall network) for each input sample using a sparse, trainable gate. Existing sparse gates are prone to convergence and performance issues when training with first-order optimization methods. In this paper, we introduce two improvements to current MoE approaches. First, we propose a new sparse gate: COMET, which relies on a novel tree-based mechanism. COMET is differentiable, can exploit sparsity to speed up computation, and outperforms state-of-the-art gates. Second, due to the challenging combinatorial nature of sparse expert selection, first-order methods are typically prone to low-quality solutions. To deal with this challenge, we propose a novel, permutation-based local search method that can complement first-order methods in training any sparse gate, e.g., Hash routing, Top-k, DSelect-k, and COMET. We show that local search can help networks escape bad initializations or solutions. We performed large-scale experiments on various domains, including recommender systems, vision, and natural language processing. On standard vision and recommender systems benchmarks, COMET+ (COMET with local search) achieves up to 13% improvement in ROC AUC over popular gates, e.g., Hash routing and Top-k, and up to 9% over prior differentiable gates, e.g., DSelect-k. When Top-k and Hash gates are combined with local search, we see up to $100\times$ reduction in the budget needed for hyperparameter tuning. Moreover, for language modeling, our approach improves over the state-of-the-art MoEBERT model for distilling BERT on 5/7 GLUE benchmarks as well as the SQuAD dataset.
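The idea of a sparse gate selecting a subset of experts per input can be illustrated with a bare-bones Top-k gate, one of the baseline gates named above. The function names, the fixed gate logits, and the toy experts are assumptions for illustration; in practice the logits come from a trainable function of the input, and this sketch is not the COMET gate.

```python
import math

def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k largest logits, softmax-normalized."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def sparse_moe(x, experts, gate_logits, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy scalar experts and hand-picked gate logits (hypothetical values).
experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: x + 1]
gate_logits = [0.1, 2.0, -1.0, 2.0]

weights = top_k_gate(gate_logits, 2)
output = sparse_moe(3.0, experts, gate_logits, k=2)
```

Only experts 1 and 3 are evaluated here, each with weight 0.5. The hard selection step is what makes the gate non-smooth in its logits, which is the kind of difficulty the abstract attributes to first-order training of sparse gates.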

comet, machine learning, natural language

doi: 10.1145/3580305.3599278

2306.02824

Country:

North America > United States > California > Los Angeles County > Long Beach (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Chowdhury, Animesh Basak, Romanelli, Marco, Tan, Benjamin, Karri, Ramesh, Garg, Siddharth

INVICTUS: Optimizing Boolean Logic Circuit Synthesis via Synergistic Learning and Search

Logic synthesis is the first and most vital step in chip design. This step converts a chip specification written in a hardware description language (such as Verilog) into an optimized implementation using Boolean logic gates. State-of-the-art logic synthesis algorithms have a large number of logic minimization heuristics, typically applied sequentially based on human experience and intuition. The choice of the order greatly impacts the quality (e.g., area and delay) of the synthesized circuit. In this paper, we propose INVICTUS, a model-based offline reinforcement learning (RL) solution that automatically generates a sequence of logic minimization heuristics ("synthesis recipe") based on a training dataset of previously seen designs. A key challenge is that new designs can range from being very similar to past designs (e.g., adders and multipliers) to being completely novel (e.g., new processor instructions). Compared to prior work, INVICTUS is the first solution that uses a mix of RL and search methods together with an online out-of-distribution detector to generate synthesis recipes over a wide range of benchmarks. Our results demonstrate significant improvement in area-delay product (ADP) of synthesized circuits with up to 30% improvement over state-of-the-art techniques. Moreover, INVICTUS achieves up to $6.3\times$ runtime reduction (iso-ADP) compared to the state-of-the-art.

benchmark, machine learning, reinforcement learning

2305.13164

Country:

North America > United States > New York (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Scaling Multi-Objective Security Games Provably via Space Discretization Based Evolutionary Search

Wu, Yu-Peng, Qian, Hong, Qin, Rong-Jun, Chen, Yi, Zhou, Aimin

In the field of security, multi-objective security games (MOSGs) allow defenders to simultaneously protect targets from multiple heterogeneous attackers. MOSGs aim to simultaneously maximize all the heterogeneous payoffs, e.g., life, money, and crime rate, without merging heterogeneous attackers. In real-world scenarios, the number of heterogeneous attackers and targets to be protected may exceed the capability of most existing state-of-the-art methods, i.e., MOSGs are limited by the issue of scalability. To this end, this paper proposes a general framework called SDES based on many-objective evolutionary search to scale up MOSGs to large-scale targets and heterogeneous attackers. SDES consists of four consecutive key components, i.e., discretization, optimization, evaluation, and refinement. Specifically, SDES first discretizes the originally high-dimensional continuous solution space into a low-dimensional discrete one using the maximal indifference property in game theory. This property helps evolutionary algorithms (EAs) bypass the high-dimensional step function and ensure a well-convergent Pareto front. Then, a many-objective EA is used for optimization in the low-dimensional discrete solution space to obtain a well-spaced Pareto front. To evaluate solutions, SDES restores solutions back to the original space by greedily optimizing a novel divergence measurement. Finally, the refinement in SDES boosts the optimization performance with acceptable cost. Theoretically, we prove the optimization consistency and convergence of SDES. Experimental results show that SDES is the first linear-time MOSG algorithm for both large-scale attackers and targets. SDES is able to solve MOSG problems with up to 20 attackers and 100 targets, while the state-of-the-art (SOTA) methods can only solve problems with up to 8 attackers and 25 targets. An ablation study verifies the necessity of all components in SDES.
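The notion of a Pareto front over several payoffs that are all maximized can be sketched with a simple non-dominated filter. The payoff vectors and function names are made up for illustration, and the quadratic-time filter below is a generic textbook construction, not the SDES framework or its many-objective EA.

```python
def dominates(a, b):
    """True if payoff vector a is at least as good as b everywhere and
    strictly better somewhere (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical two-objective payoffs of five candidate defender strategies.
payoffs = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1)]
front = pareto_front(payoffs)
```

Here `(1, 1)` and `(2, 1)` are dominated, so the front consists of the remaining three trade-off points.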

artificial intelligence, evolutionary algorithm, machine learning

2303.15821

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Commercial Services & Supplies > Security & Alarm Services (0.85)
Leisure & Entertainment > Games > Computer Games (0.61)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Abbas, Ahmed, Swoboda, Paul

ClusterFuG: Clustering Fully connected Graphs by Multicut

We propose a graph clustering formulation based on multicut (a.k.a. weighted correlation clustering) on the complete graph. Our formulation does not need specification of the graph topology as in the original sparse formulation of multicut, making our approach simpler and potentially better performing. In contrast to unweighted correlation clustering, we allow for a more expressive weighted cost structure. In dense multicut, the clustering objective is given in a factorized form as inner products of node feature vectors. This allows for an efficient formulation and inference, in contrast to multicut/weighted correlation clustering, which has at least quadratic representation and computation complexity when working on the complete graph. We show how to rewrite classical greedy algorithms for multicut in our dense setting and how to modify them for greater efficiency and solution quality. In particular, our algorithms scale to graphs with tens of thousands of nodes. Empirical evidence on instance segmentation on Cityscapes and clustering of ImageNet datasets shows the merits of our approach.
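One way to see why a factorized objective can avoid quadratic cost is the identity $\sum_{i<j} \langle f_i, f_j \rangle = \frac{1}{2}\left(\lVert \sum_i f_i \rVert^2 - \sum_i \lVert f_i \rVert^2\right)$, which sums all pairwise inner products of a node set in time linear in the number of nodes. The sketch below checks this identity on toy feature vectors; the function names and data are assumptions, and it illustrates the general principle only, not the paper's cost convention or algorithms.

```python
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pairwise_sum_naive(features):
    """Sum of inner products over all node pairs: quadratic in the node count."""
    return sum(dot(u, v) for u, v in combinations(features, 2))

def pairwise_sum_factorized(features):
    """Same quantity from the summed feature vector: linear in the node count."""
    total = [sum(column) for column in zip(*features)]
    return (dot(total, total) - sum(dot(f, f) for f in features)) / 2

# Hypothetical 2-D node features for a three-node cluster.
features = [(1.0, 0.0), (0.5, 0.5), (1.0, 2.0)]
naive = pairwise_sum_naive(features)
fast = pairwise_sum_factorized(features)
```

Both functions return 3.0 on this example; only the second avoids enumerating every edge of the complete graph.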

algorithm, artificial intelligence, machine learning

2301.12159

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Germany > Saarland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

arXiv.org Artificial Intelligence Jun-4-2023

Barriers for the performance of graph neural networks (GNN) in discrete random structures. A comment on \cite{schuetz2022combinatorial}, \cite{angelini2023modern}, \cite{schuetz2023reply}

Gamarnik, David

Recently, graph neural network (GNN) based algorithms were proposed to solve a variety of combinatorial optimization problems, including the Maximum Cut problem, the Maximum Independent Set problem, and similar problems \cite{schuetz2022combinatorial}, \cite{schuetz2022graph}. The publication \cite{schuetz2022combinatorial} stirred a debate over whether the GNN-based method was adequately benchmarked against the best prior methods. In particular, critical commentaries \cite{angelini2023modern} and \cite{boettcher2023inability} point out that a simple greedy algorithm performs better than GNN in the setting of random graphs, and in fact stronger algorithmic performance can be reached with more sophisticated methods. A response from the authors \cite{schuetz2023reply} pointed out that GNN performance can be improved further by tuning the parameters better. We do not intend to discuss the merits of arguments and counter-arguments in \cite{schuetz2022combinatorial}, \cite{angelini2023modern}, \cite{boettcher2023inability}, \cite{schuetz2023reply}. Rather, in this note we establish a fundamental limitation for running GNN on random graphs considered in these references, for a broad range of choices of GNN architecture. These limitations arise from the presence of the Overlap Gap Property (OGP) phase transition, which is a barrier for many algorithms, both classical and quantum. As we demonstrate in this paper, it is also a barrier to GNN due to its local structure. We note that, at the same time, known algorithms, ranging from simple greedy algorithms to more sophisticated algorithms based on message passing, provide the best results for these problems up to the OGP phase transition. This leaves very little space for GNN to outperform the known algorithms, and based on this we side with the conclusions made in \cite{angelini2023modern} and \cite{boettcher2023inability}.

algorithm, artificial intelligence, machine learning

2306.02555

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Sudan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Akhauri, Yash, Abdelfattah, Mohamed S.

Multi-Predict: Few Shot Predictors For Efficient Neural Architecture Search

arXiv.org Artificial Intelligence Jun-4-2023

Many hardware-aware neural architecture search (NAS) methods have been developed to optimize the topology of neural networks (NN) with the joint objectives of higher accuracy and lower latency. Recently, both accuracy and latency predictors have been used in NAS with great success, achieving high sample efficiency and accurate modeling of hardware (HW) device latency, respectively. However, a new accuracy predictor needs to be trained for every new NAS search space or NN task, and a new latency predictor needs to be additionally trained for every new HW device. In this paper, we explore methods to enable multi-task, multi-search-space, and multi-HW adaptation of accuracy and latency predictors to reduce the cost of NAS. We introduce a novel search-space independent NN encoding based on zero-cost proxies that achieves sample-efficient prediction on multiple tasks and NAS search spaces, improving the end-to-end sample efficiency of latency and accuracy predictors by over an order of magnitude in multiple scenarios. For example, our NN encoding enables multi-search-space transfer of latency predictors from NASBench-201 to FBNet (and vice versa) in under 85 HW measurements, a 400$\times$ improvement in sample efficiency compared to a recent meta-learning approach. Our method also improves the total sample efficiency of accuracy predictors by over an order of magnitude. Finally, we demonstrate the effectiveness of our method for multi-search-space and multi-task accuracy prediction on 28 NAS search spaces and tasks.

artificial intelligence, machine learning, predictor

2306.02459

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)