AITopics

2209.13139

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.63)

arXiv.org Artificial IntelligenceSep-26-2022

A Contrastive Framework for Neural Text Generation

Su, Yixuan, Lan, Tian, Wang, Yan, Yogatama, Dani, Kong, Lingpeng, Collier, Nigel

Text generation is of great importance to many natural language processing applications. However, maximization-based decoding methods (e.g., beam search) of neural language models often lead to degenerate solutions--the generated text is unnatural and contains undesirable repetitions. Existing approaches introduce stochasticity via sampling or modify training objectives to decrease the probabilities of certain tokens (e.g., unlikelihood training). However, they often lead to solutions that lack coherence. In this work, we show that an underlying reason for model degeneration is the anisotropic distribution of token representations. We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method--contrastive search--to encourage diversity while maintaining coherence in the generated text. Extensive experiments and analyses on three benchmarks from two languages demonstrate that our proposed approach significantly outperforms current state-of-the-art text generation methods as evaluated by both human and automatic metrics.

contrastive search, machine learning, natural language, (18 more...)

2202.06417

Country:

North America > Mexico (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Central America (0.14)
(28 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Games (0.93)
Leisure & Entertainment > Sports (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

arXiv.org Artificial IntelligenceSep-24-2022

Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems

Deng, Yanchen, Kong, Shufeng, Liu, Caihua, An, Bo

Belief Propagation (BP) is an important message-passing algorithm for various reasoning tasks over graphical models, including solving the Constraint Optimization Problems (COPs). It has been shown that BP can achieve state-of-the-art performance on various benchmarks by mixing old and new messages before sending the new one, i.e., damping. However, existing methods of tuning a static damping factor for BP not only are laborious but also harm their performance. Moreover, existing BP algorithms treat each variable node's neighbors equally when composing a new message, which also limits their exploration ability. To address these issues, we seamlessly integrate BP, Gated Recurrent Units (GRUs), and Graph Attention Networks (GATs) within the message-passing framework to reason about dynamic weights and damping factors for composing new BP messages. Our model, Deep Attentive Belief Propagation (DABP), takes the factor graph and the BP messages in each iteration as the input and infers the optimal weights and damping factors through GRUs and GATs, followed by a multi-head attention layer. Furthermore, unlike existing neural-based BP variants, we propose a novel self-supervised learning algorithm for DABP with a smoothed solution cost, which does not require expensive training labels and also avoids the common out-of-distribution issue through efficient online learning. Extensive experiments show that our model significantly outperforms state-of-the-art baselines.

artificial intelligence, belief revision, machine learning, (19 more...)

2209.12

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(4 more...)

arXiv.org Artificial IntelligenceSep-24-2022

Efficient Speed Planning for Autonomous Driving in Dynamic Environment with Interaction Point Model

Chen, Yingbing, Xin, Ren, Cheng, Jie, Zhang, Qingwen, Mei, Xiaodong, Liu, Ming, Wang, Lujia

Safely interacting with other traffic participants is one of the core requirements for autonomous driving, especially in intersections and occlusions. Most existing approaches are designed for particular scenarios and require significant human labor in parameter tuning to be applied to different situations. To solve this problem, we first propose a learning-based Interaction Point Model (IPM), which describes the interaction between agents with the protection time and interaction priority in a unified manner. We further integrate the proposed IPM into a novel planning framework, demonstrating its effectiveness and robustness through comprehensive simulations in highly dynamic environments.

artificial intelligence, interaction point, machine learning, (15 more...)

doi: 10.1109/LRA.2022.3207555

2209.09013

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)

#artificialintelligenceSep-23-2022, 00:12:34 GMT

How Good Is Neural Combinatorial Optimization?

Traditional solvers for tackling combinatorial optimization (CO) problems are usually designed by human experts. Recently, there has been a surge of interest in utilizing Deep Learning, especially Deep Reinforcement Learning, to automatically learn effective solvers for CO. The resultant new paradigm is termed Neural Combinatorial Optimization (NCO). However, the advantages and disadvantages of NCO over other approaches have not been well studied empirically or theoretically. In this work, we present a comprehensive comparative study of NCO solvers and alternative solvers. Specifically, taking the Traveling Salesman Problem as the testbed problem, we assess the performance of the solvers in terms of five aspects, i.e., effectiveness, efficiency, stability, scalability and generalization ability. Our results show that in general the solvers learned by NCO approaches still fall short of traditional solvers in nearly all these aspects. A potential benefit of the former would be their superior time and energy efficiency on small-size problem instances when sufficient training instances are available. We hope this work would help better understand the strengths and weakness of NCO, and provide a comprehensive evaluation protocol for further benchmarking NCO approaches against other approaches.

neural combinatorial optimization

#artificialintelligence

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Nayak, Sharan, Otte, Michael W.

Bidirectional Sampling Based Search Without Two Point Boundary Value Solution

Bidirectional motion planning approaches decrease planning time, on average, compared to their unidirectional counterparts. In single-query feasible motion planning, using bidirectional search to find a continuous motion plan requires an edge connection between the forward and reverse search trees. Such a tree-tree connection requires solving a two-point Boundary Value Problem (BVP). However, a two-point BVP solution can be difficult or impossible to calculate for many systems. We present a novel bidirectional search strategy that does not require solving the two-point BVP. Instead of connecting the forward and reverse trees directly, the reverse tree's cost information is used as a guiding heuristic for the forward search. This enables the forward search to quickly converge to a feasible solution without solving the two-point BVP. We propose two new algorithms (GBRRT and GABRRT) that use this strategy and run multiple software simulations using multiple dynamical systems and real-world hardware experiments to show that our algorithms perform on-par or better than existing state-of-the-art methods in quickly finding an initial feasible solution.

algorithm, artificial intelligence, reverse tree, (15 more...)

2010.14692

Genre:

Research Report > Experimental Study (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Optimization

Liao, Luofeng, Shen, Li, Duan, Jia, Kolar, Mladen, Tao, Dacheng

Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks. Despite their wide applicability, solving such problems efficiently and effectively is challenging in the presence of large amounts of data using existing stochastic minimax methods. We study a class of stochastic minimax methods and develop a communication-efficient distributed stochastic extragradient algorithm, LocalAdaSEG, with an adaptive learning rate suitable for solving convex-concave minimax problems in the Parameter-Server model. LocalAdaSEG has three main features: (i) a periodic communication strategy that reduces the communication cost between workers and the server; (ii) an adaptive learning rate that is computed locally and allows for tuning-free implementation; and (iii) theoretically, a nearly linear speed-up with respect to the dominant variance term, arising from the estimation of the stochastic gradient, is proven in both the smooth and nonsmooth convex-concave settings. LocalAdaSEG is used to solve a stochastic bilinear game, and train a generative adversarial network. We compare LocalAdaSEG against several existing optimizers for minimax problems and demonstrate its efficacy through several experiments in both homogeneous and heterogeneous settings.

artificial intelligence, localadaseg, machine learning, (16 more...)

2106.10022

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.63)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Grunau, Christoph, Rozhoň, Václav

Adapting $k$-means algorithms for outliers

This paper shows how to adapt several simple and classical sampling-based algorithms for the $k$-means problem to the setting with outliers. Recently, Bhaskara et al. (NeurIPS 2019) showed how to adapt the classical $k$-means++ algorithm to the setting with outliers. However, their algorithm needs to output $O(\log (k) \cdot z)$ outliers, where $z$ is the number of true outliers, to match the $O(\log k)$-approximation guarantee of $k$-means++. In this paper, we build on their ideas and show how to adapt several sequential and distributed $k$-means algorithms to the setting with outliers, but with substantially stronger theoretical guarantees: our algorithms output $(1+\varepsilon)z$ outliers while achieving an $O(1 / \varepsilon)$-approximation to the objective function. In the sequential world, we achieve this by adapting a recent algorithm of Lattanzi and Sohler (ICML 2019). In the distributed setting, we adapt a simple algorithm of Guha et al. (IEEE Trans. Know. and Data Engineering 2003) and the popular $k$-means$\|$ of Bahmani et al. (PVLDB 2012). A theoretical application of our techniques is an algorithm with running time $\tilde{O}(nk^2/z)$ that achieves an $O(1)$-approximation to the objective function while outputting $O(z)$ outliers, assuming $k \ll z \ll n$. This is complemented with a matching lower bound of $\Omega(nk^2/z)$ for this problem in the oracle model.

artificial intelligence, data mining, machine learning, (21 more...)

2007.01118

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Information Management (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Data Science > Data Mining (0.67)

Hu, Jiaheng, Whiman, Julian, Choset, Howie

GLSO: Grammar-guided Latent Space Optimization for Sample-efficient Robot Design Automation

Robots have been used in all sorts of automation, and yet the design of robots remains mainly a manual task. We seek to provide design tools to automate the design of robots themselves. An important challenge in robot design automation is the large and complex design search space which grows exponentially with the number of components, making optimization difficult and sample inefficient. In this work, we present Grammar-guided Latent Space Optimization (GLSO), a framework that transforms design automation into a low-dimensional continuous optimization problem by training a graph variational autoencoder (VAE) to learn a mapping between the graph-structured design space and a continuous latent space. This transformation allows optimization to be conducted in a continuous latent space, where sample efficiency can be significantly boosted by applying algorithms such as Bayesian Optimization. GLSO guides training of the VAE using graph grammar rules and robot world space features, such that the learned latent space focus on valid robots and is easier for the optimization algorithm to explore. Importantly, the trained VAE can be reused to search for designs specialized to multiple different tasks without retraining. We evaluate GLSO by designing robots for a set of locomotion tasks in simulation, and demonstrate that our method outperforms related state-of-the-art robot design automation methods.

artificial intelligence, machine learning, optimization problem, (17 more...)

2209.11748

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

Virgolin, Marco, Fracaros, Saverio

On the Robustness of Sparse Counterfactual Explanations to Adverse Perturbations

Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algorithms can be changed. Researchers have proposed a number of desiderata that CEs should meet to be practically useful, such as requiring minimal effort to enact, or complying with causal models. We consider a further aspect to improve the usability of CEs: robustness to adverse perturbations, which may naturally happen due to unfortunate circumstances. Since CEs typically prescribe a sparse form of intervention (i.e., only a subset of the features should be changed), we study the effect of addressing robustness separately for the features that are recommended to be changed and those that are not. Our definitions are workable in that they can be incorporated as penalty terms in the loss functions that are used for discovering CEs. To experiment with robustness, we create and release code where five data sets (commonly used in the field of fair and explainable machine learning) have been enriched with feature-specific annotations that can be used to sample meaningful perturbations. Our experiments show that CEs are often not robust and, if adverse perturbations take place (even if not worst-case), the intervention they prescribe may require a much larger cost than anticipated, or even become impossible. However, accounting for robustness in the search process, which can be done rather easily, allows discovering robust CEs systematically. Robust CEs make additional intervention to contrast perturbations much less costly than non-robust CEs. We also find that robustness is easier to achieve for the features to change, posing an important point of consideration for the choice of what counterfactual explanation is best for the user. Our code is available at: https://github.com/marcovirgolin/robust-counterfactuals.

artificial intelligence, machine learning, natural language, (15 more...)

2201.09051

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.93)
Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)