AITopics | perp

Collaborating Authors

perp

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Replay-Guided Adversarial Environment Design

Neural Information Processing SystemsDec-23-2025, 18:34:30 GMT

Deep reinforcement learning (RL) agents may successfully generalize to new settings if trained on an appropriately diverse set of environment and task configurations. Unsupervised Environment Design (UED) is a promising self-supervised RL paradigm, wherein the free parameters of an underspecified environment are automatically adapted during training to the agent's capabilities, leading to the emergence of diverse training environments. Here, we cast Prioritized Level Replay (PLR), an empirically successful but theoretically unmotivated method that selectively samples randomly-generated training levels, as UED. We argue that by curating completely random levels, PLR, too, can generate novel and complex levels for effective training. This insight reveals a natural class of UED methods we call Dual Curriculum Design (DCD). Crucially, DCD includes both PLR and a popular UED algorithm, PAIRED, as special cases and inherits similar theoretical guarantees. This connection allows us to develop novel theory for PLR, providing a version with a robustness guarantee at Nash equilibria. Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria. Indeed, our experiments confirm that our new method, PLR$^{\perp}$, obtains better results on a suite of out-of-distribution, zero-shot transfer tasks, in addition to demonstrating that PLR$^{\perp}$ improves the performance of PAIRED, from which it inherited its theoretical framework.

name change, plr, replay-guided adversarial environment design, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Add feedback

Replay-Guided Adversarial Environment Design

Neural Information Processing SystemsOct-9-2024, 12:48:50 GMT

artificial intelligence, machine learning, reinforcement learning, (6 more...)

Neural Information Processing Systems

Industry: Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback

Reviews: On Neuronal Capacity

Neural Information Processing SystemsOct-7-2024, 22:12:38 GMT

The main result of the paper is giving tight bounds on the number of n-variable boolean functions that can be expressed as degree d-polynomial threshold functions for any fixed d (d growing very mildly with n) also works. To me the rest of the results while interesting seem to be mostly easy applications of the key result to other more fashionable neural network models. However, the tightness of the main result does not translate to tight bounds when other neuroidal models are considered, because of the kind of non-linearities or weight constraints involved. The main result is highly non-trivial, the proof quite lengthy though elegant, and resolves a 25 year old open problem. Although the proof uses a lot of heavy mathematics, the key contribution seems to be generalizing random matrix theory to a random tensor theory -- the key result being that a large number of stochastically independent random vectors in low-dimension (and hence clearly not linearly indpendent) still yield a high degree of linear independence when tensorized.

main result, neuronal capacity, review, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)

Add feedback

Convergence analysis of t-SNE as a gradient flow for point cloud on a manifold

Jeong, Seonghyeon, Wu, Hau-Tieng

arXiv.org Artificial IntelligenceJan-31-2024

We present a theoretical foundation regarding the boundedness of the t-SNE algorithm. t-SNE employs gradient descent iteration with Kullback-Leibler (KL) divergence as the objective function, aiming to identify a set of points that closely resemble the original data points in a high-dimensional space, minimizing KL divergence. Investigating t-SNE properties such as perplexity and affinity under a weak convergence assumption on the sampled dataset, we examine the behavior of points generated by t-SNE under continuous gradient flow. Demonstrating that points generated by t-SNE remain bounded, we leverage this insight to establish the existence of a minimizer for KL divergence.

exp, inull, lemma 6, (16 more...)

arXiv.org Artificial Intelligence

2401.17675

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Towards Automated Readable Proofs of Ruler and Compass Constructions

Marinković, Vesna, Šukilović, Tijana, Marić, Filip

arXiv.org Artificial IntelligenceJan-22-2024

Although there are several systems that successfully generate construction steps for ruler and compass construction problems, none of them provides readable synthetic correctness proofs for generated constructions. In the present work, we demonstrate how our triangle construction solver ArgoTriCS can cooperate with automated theorem provers for first order logic and coherent logic so that it generates construction correctness proofs, that are both human-readable and formal (can be checked by interactive theorem provers such as Coq or Isabelle/HOL). These proofs currently rely on many high-level lemmas and our goal is to have them all formally shown from the basic axioms of geometry.

construction, instantiation, perp, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.4204/EPTCS.398.5

2401.137

Country: Europe > Serbia > Central Serbia > Belgrade (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

PeRP: Personalized Residual Policies For Congestion Mitigation Through Co-operative Advisory Systems

Hasan, Aamir, Chakraborty, Neeloy, Chen, Haonan, Cho, Jung-Hoon, Wu, Cathy, Driggs-Campbell, Katherine

arXiv.org Artificial IntelligenceAug-15-2023

Intelligent driving systems can be used to mitigate congestion through simple actions, thus improving many socioeconomic factors such as commute time and gas costs. However, these systems assume precise control over autonomous vehicle fleets, and are hence limited in practice as they fail to account for uncertainty in human behavior. Piecewise Constant (PC) Policies address these issues by structurally modeling the likeness of human driving to reduce traffic congestion in dense scenarios to provide action advice to be followed by human drivers. However, PC policies assume that all drivers behave similarly. To this end, we develop a co-operative advisory system based on PC policies with a novel driver trait conditioned Personalized Residual Policy, PeRP. PeRP advises drivers to behave in ways that mitigate traffic congestion. We first infer the driver's intrinsic traits on how they follow instructions in an unsupervised manner with a variational autoencoder. Then, a policy conditioned on the inferred trait adapts the action of the PC policy to provide the driver with a personalized recommendation. Our system is trained in simulation with novel driver modeling of instruction adherence. We show that our approach successfully mitigates congestion while adapting to different driver behaviors, with 4 to 22% improvement in average speed over baselines.

artificial intelligence, machine learning, pc policy, (16 more...)

arXiv.org Artificial Intelligence

2308.00864

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback