AITopics

2110.06787

Country:

Europe > Finland > Central Finland > Jyväskylä (0.04)
North America > United States > New York > Nassau County > Mineola (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Telecommunications (0.93)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Stohy, Ahmed, Abdelhakam, Heba-Tullah, Ali, Sayed, Elhenawy, Mohammed, Hassan, Abdallah A, Masoud, Mahmoud, Glaser, Sebastien, Rakotonirainy, Andry

Hybrid Pointer Networks for Traveling Salesman Problems Optimization

arXiv.org Artificial IntelligenceOct-13-2021

In this work, a novel idea is presented for combinatorial optimization problems, a hybrid network, which results in a superior outcome. We applied this method to graph pointer networks [1], expanding its capabilities to a higher level. We proposed a hybrid pointer network (HPN) to solve the travelling salesman problem trained by reinforcement learning. Furthermore, HPN builds upon graph pointer networks which is an extension of pointer networks with an additional graph embedding layer. HPN outperforms the graph pointer network in solution quality due to the hybrid encoder, which provides our model with a verity encoding type, allowing our model to converge to a better policy. Our network significantly outperforms the original graph pointer network for small and large-scale problems increasing its performance for TSP50 from 5.959 to 5.706 without utilizing 2opt, Pointer networks, Attention model, and a wide range of models, producing results comparable to highly tuned and specialized algorithms. We make our data, models, and code publicly available [2].

encoder, graph, pointer network, (14 more...)

2110.03104

Country:

Africa > Middle East > Egypt (0.04)
Oceania > Australia > Queensland (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)

arXiv.org Machine LearningOct-13-2021

Boosting the Certified Robustness of L-infinity Distance Nets

Zhang, Bohang, Jiang, Du, He, Di, Wang, Liwei

Recently, Zhang et al. (2021) developed a new neural network architecture based on $\ell_\infty$-distance functions, which naturally possesses certified robustness by its construction. Despite the excellent theoretical properties, the model so far can only achieve comparable performance to conventional networks. In this paper, we significantly boost the certified robustness of $\ell_\infty$-distance nets through a careful analysis of its training process. In particular, we show the $\ell_p$-relaxation, a crucial way to overcome the non-smoothness of the model, leads to an unexpected large Lipschitz constant at the early training stage. This makes the optimization insufficient using hinge loss and produces sub-optimal solutions. Given these findings, we propose a simple approach to address the issues above by using a novel objective function that combines a scaled cross-entropy loss with clipped hinge loss. Our experiments show that using the proposed training strategy, the certified accuracy of $\ell_\infty$-distance net can be dramatically improved from 33.30% to 40.06% on CIFAR-10 ($\epsilon=8/255$), meanwhile significantly outperforming other approaches in this area. Such a result clearly demonstrates the effectiveness and potential of $\ell_\infty$-distance net for certified robustness.

accuracy, international conference, robustness, (14 more...)

2110.0685

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Thomas, Antony, Mastrogiovanni, Fulvio, Baglietto, Marco

Exact and Bounded Collision Probability for Motion Planning under Gaussian Uncertainty

arXiv.org Artificial IntelligenceOct-12-2021

Computing collision-free trajectories is of prime importance for safe navigation. We present an approach for computing the collision probability under Gaussian distributed motion and sensing uncertainty with the robot and static obstacle shapes approximated as ellipsoids. The collision condition is formulated as the distance between ellipsoids and unlike previous approaches we provide a method for computing the exact collision probability. Furthermore, we provide a tight upper bound that can be computed much faster during online planning. Comparison to other state-of-the-art methods is also provided. The proposed method is evaluated in simulation under varying configuration and number of obstacles.

collision probability, obstacle, probability, (14 more...)

2110.06348

Country:

North America > United States > Oregon (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)
Europe > Italy > Liguria > Genoa (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.84)

Michel, Paul, Ruder, Sebastian, Yogatama, Dani

Balancing Average and Worst-case Accuracy in Multitask Learning

arXiv.org Artificial IntelligenceOct-12-2021

When training and evaluating machine learning models on a large number of tasks, it is important to not only look at average task accuracy--which may be biased by easy or redundant tasks--but also worst-case accuracy (i.e. the performance on the task with the lowest accuracy). In this work, we show how to use techniques from the distributionally robust optimization (DRO) literature to improve worst-case performance in multitask learning. We highlight several failure cases of DRO when applied off-the-shelf and present an improved method, Lookahead-DRO (L-DRO), which mitigates these issues. The core idea of L-DRO is to anticipate the interaction between tasks during training in order to choose a dynamic re-weighting of the various task losses, which will (i) lead to minimal worst-case loss and (ii) train on as many tasks as possible. After demonstrating the efficacy of L-DRO on a small controlled synthetic setting, we evaluate it on two realistic benchmarks: a multitask version of the CIFAR-100 image classification dataset and a large-scale multilingual language modeling experiment. Our empirical results show that L-DRO achieves a better trade-off between average and worst-case accuracy with little computational overhead compared to several strong baselines. Multitask learning--the process by which a single model is trained to perform a variety of different tasks--has become a subject of increasing interest with many successful applications in a variety of domains (Ruder, 2017; Yu et al., 2020; Wang et al., 2021).

accuracy, l-dro, proceedings, (13 more...)

2110.05838

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)

arXiv.org Machine LearningOct-12-2021

Information Theoretic Structured Generative Modeling

Hu, Bo, Yu, Shujian, Principe, Jose C.

R\'enyi's information provides a theoretical foundation for tractable and data-efficient non-parametric density estimation, based on pair-wise evaluations in a reproducing kernel Hilbert space (RKHS). This paper extends this framework to parametric probabilistic modeling, motivated by the fact that R\'enyi's information can be estimated in closed-form for Gaussian mixtures. Based on this special connection, a novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible, because costs are scale-invariant, avoiding high gradient variance while imposing less restrictions on absolute continuity, which is a huge advantage in parametric information theoretic optimization. The implementation employs a single neural network driven by an orthonormal input appended to a single white noise source adapted to learn an infinite Gaussian mixture model (IMoG), which provides an empirically tractable model distribution in low dimensions. To train SGM, we provide three novel variational cost functions, based on R\'enyi's second-order entropy and divergence, to implement minimization of cross-entropy, minimization of variational representations of $f$-divergence, and maximization of the evidence lower bound (conditional probability). We test the framework for estimation of mutual information and compare the results with the mutual information neural estimation (MINE), for density estimation, for conditional probability estimation in Markov models as well as for training adversarial networks. Our preliminary results show that SGM significantly improves MINE estimation in terms of data efficiency and variance, conventional and variational Gaussian mixture models, as well as the performance of generative adversarial networks.

estimation, nyi, sgm, (14 more...)

2110.05794

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(2 more...)

Hudson, Benjamin, Li, Qingbiao, Malencia, Matthew, Prorok, Amanda

Graph Neural Network Guided Local Search for the Traveling Salesperson Problem

arXiv.org Machine LearningOct-12-2021

Solutions to the Traveling Salesperson Problem (TSP) have practical applications to processes in transportation, logistics, and automation, yet must be computed with minimal delay to satisfy the real-time nature of the underlying tasks. However, solving large TSP instances quickly without sacrificing solution quality remains challenging for current approximate algorithms. To close this gap, we present a hybrid data-driven approach for solving the TSP based on Graph Neural Networks (GNNs) and Guided Local Search (GLS). Our model predicts the regret of including each edge of the problem graph in the solution; GLS uses these predictions in conjunction with the original problem graph to find solutions. Our experiments demonstrate that this approach converges to optimal solutions at a faster rate than state-of-the-art learning-based approaches and non-learning GLS algorithms for the TSP, notably finding optimal solutions to 96% of the 50-node problem set, 7% more than the next best benchmark, and to 20% of the 100-node problem set, 4.5x more than the next best benchmark. When generalizing from 20-node problems to the 100-node problem set, our approach finds solutions with an average optimality gap of 2.5%, a 10x improvement over the next best learning-based benchmark.

algorithm, computation time, optimal solution, (14 more...)

2110.05291

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.77)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningOct-10-2021

Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

Rajendran, Goutham, Kivva, Bohdan, Gao, Ming, Aragam, Bryon

Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure. In the context of learning directed acyclic graphs, greedy algorithms are popular despite their worst-case exponential runtime. In practice, however, they are very efficient. We provide new insight into this phenomenon by studying a general greedy score-based algorithm for learning DAGs. Unlike edge-greedy algorithms such as the popular GES and hill-climbing algorithms, our approach is vertex-greedy and requires at most a polynomial number of score evaluations. We then show how recent polynomial-time algorithms for learning DAG models are a special case of this algorithm, thereby illustrating how these order-based algorithms can be rigourously interpreted as score-based algorithms. This observation suggests new score functions and optimality conditions based on the duality between Bregman divergences and exponential families, which we explore in detail. Explicit sample and computational complexity bounds are derived. Finally, we provide extensive experiments suggesting that this algorithm indeed optimizes the score in a variety of settings.

algorithm, assumption, score function, (16 more...)

2110.04719

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceOct-10-2021

Multi-objective Optimization by Learning Space Partitions

Zhao, Yiyang, Wang, Linnan, Yang, Kevin, Zhang, Tianjun, Guo, Tian, Tian, Yuandong

In contrast to single-objective optimization (SOO), multi-objective optimization (MOO) requires an optimizer to find the Pareto frontier, a subset of feasible solutions that are not dominated by other feasible solutions. In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier. The partitioning is based on the dominance number, which measures "how close" a data point is to the Pareto frontier among existing samples. To account for possible partition errors due to limited samples and model mismatch, we leverage Monte Carlo Tree Search (MCTS) to exploit promising regions while exploring suboptimal regions that may turn out to contain good solutions later. Theoretically, we prove the efficacy of learning space partitioning via LaMOO under certain assumptions. Empirically, on the HyperVolume (HV) benchmark, a popular MOO metric, LaMOO substantially outperforms strong baselines on multiple real-world MOO tasks, by up to 225% in sample efficiency for neural architecture search on Nasbench201, and up to 10% for molecular design. Multi-objective optimization (MOO) has been extensively used in many practical scenarios involving trade-offs between multiple objectives. For example, in automobile design (Chang, 2015), we must maximize the performance of the engine while simultaneously minimizing emissions and fuel consumption.

algorithm, objective, optimization, (14 more...)

2110.03173

Country: Europe > Portugal > Lisbon > Lisbon (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

arXiv.org Artificial IntelligenceOct-8-2021

Fair Regression under Sample Selection Bias

Du, Wei, Wu, Xintao, Tong, Hanghang

Recent research on fair regression focused on developing new fairness notions and approximation methods as target variables and even the sensitive attribute are continuous in the regression setting. However, all previous fair regression research assumed the training data and testing data are drawn from the same distributions. This assumption is often violated in real world due to the sample selection bias between the training and testing data. In this paper, we develop a framework for fair regression under sample selection bias when dependent variable values of a set of samples from the training data are missing as a result of another hidden process. Our framework adopts the classic Heckman model for bias correction and the Lagrange duality to achieve fairness in regression based on a variety of fairness notions. Heckman model describes the sample selection process and uses a derived variable called the Inverse Mills Ratio (IMR) to correct sample selection bias. We use fairness inequality and equality constraints to describe a variety of fairness notions and apply the Lagrange duality theory to transform the primal problem into the dual convex optimization. For the two popular fairness notions, mean difference and mean squared error difference, we derive explicit formulas without iterative optimization, and for Pearson correlation, we derive its conditions of achieving strong duality. We conduct experiments on three real-world datasets and the experimental results demonstrate the approach's effectiveness in terms of both utility and fairness metrics.

constraint, regression, sample selection bia, (14 more...)

2110.04372

Country:

North America > United States > Arkansas > Washington County > Fayetteville (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.46)
Law (0.46)
Information Technology (0.46)

Technology:

Information Technology > Enterprise Applications > Customer Relationship Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)