AITopics | Search


Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).
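The idea of systematically exploring a space of problem states can be illustrated with a minimal breadth-first search sketch. This is not taken from Luger's textbook; the function name `breadth_first_search`, the callback parameters `is_goal` and `successors`, and the toy number puzzle are all hypothetical choices made for illustration.

```python
from collections import deque


def breadth_first_search(start, is_goal, successors):
    """Return a list of states from start to a goal state, or None if none is reachable."""
    frontier = deque([start])   # states waiting to be expanded, oldest first
    parent = {start: None}      # records how each visited state was reached
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            # Walk the parent links backwards to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:       # skip states that were already reached
                parent[nxt] = state
                frontier.append(nxt)
    return None


# Hypothetical toy problem: reach 10 from 1 using the moves "+1" and "*2".
def toy_successors(n):
    return [m for m in (n + 1, n * 2) if m <= 10]


path = breadth_first_search(1, lambda n: n == 10, toy_successors)
print(path)
```

Here each integer plays the role of a problem state and each move produces an alternative successor state. Breadth-first order is only one possible strategy; it is used here because it finds a path with the fewest moves.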


Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Erwan Lecarpentier, Emmanuel Rachelson

Neural Information Processing Systems | Oct-3-2025, 03:37:24 GMT

Our contribution can be presented in four points.

artificial intelligence, machine learning, reinforcement learning, (14 more...)


Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.97)


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems | Oct-3-2025, 03:27:02 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Summary: The paper presents a sample-efficient policy search algorithm for large, continuous reinforcement learning problems. In contrast to existing model-based policy search algorithms, the approach presented in this paper learns local models in the form of linear Gaussian controllers. Given the information (rollouts) from these linear local models, a global, nonlinear policy can then be learned using an arbitrary parametrization scheme. The so-called Guided Policy Search approach alternates between (local) trajectory optimization and (global) policy search in an iterative fashion. In their experiments, the authors show that the approach outperforms various state-of-the-art policy search methods, e.g., REPS and PILCO. Experiments were conducted in (mostly 2D) dynamics simulations involving the continuous control of multi-linked agents.

algorithm, constraint, local model, (13 more...)


Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)


Minimax

Neural Information Processing Systems | Oct-3-2025, 02:59:20 GMT

We thank reviewers for appreciating the originality of our work and providing constructive feedback. We address specific concerns below. Random selection in Alg. 1 means sampling uniformly The intuition behind Thm. 2 is explained But to interpret Thm. 2 alone: for any algorithm considered, if There is no missing factor of 2 in Eq.(28) and Eq.(26) Thm. 3 is as follows: for any Pareto optimal rate Alg. 1 is thus Pareto optimal. Eq. after line 115 defines the hardness level of a given problem, Alg. 1 is different from the Distilled Note that we are also comparing to an algorithm, i.e., QRM2, that allows the reuse of statistics [12]. The lower bound in Section 2 is in the minimax sense, so it suffices to reduce to the single-best arm case.

algorithm, artificial intelligence, optimal, (13 more...)


Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.64)


Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases

Xiyang Liu, Sewoong Oh

Neural Information Processing Systems | Oct-3-2025, 01:34:20 GMT

Neural Information Processing Systems http://nips.cc/

approximate differential privacy, artificial intelligence, minimax optimal estimation, (1 more...)


Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems | Oct-3-2025, 00:47:47 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper introduces a new approach to sampling from continuous probability distributions. The method extends prior work on using a combination of Gumbel perturbations and optimization to the continuous case. This is technically challenging, and they devise several interesting ideas to deal with continuous spaces, e.g. to produce an exponentially large or even infinite number of random variables (one per point of the continuous/discrete space) with the right distribution in an implicit way. Finally, they highlight an interesting connection with adaptive rejection sampling.
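For context on the "Gumbel perturbations and optimization" combination mentioned in the review, the following is a minimal sketch of the well-known discrete Gumbel-max trick: add independent Gumbel noise to each log-weight and take the argmax. It is not the continuous-space method of the reviewed paper, which is not reproduced here; the function name `gumbel_max_sample` and the example weights are hypothetical.

```python
import math
import random


def gumbel_max_sample(log_weights, rng):
    """Draw an index i with probability proportional to exp(log_weights[i])."""
    best_index, best_value = None, -math.inf
    for i, log_w in enumerate(log_weights):
        u = rng.random()
        while u == 0.0:                      # avoid log(0); random() lies in [0, 1)
            u = rng.random()
        gumbel = -math.log(-math.log(u))     # standard Gumbel(0, 1) noise
        perturbed = log_w + gumbel
        if perturbed > best_value:           # the "optimization" step: keep the argmax
            best_index, best_value = i, perturbed
    return best_index


# Hypothetical example: unnormalized weights 1, 2, 7 (probabilities 0.1, 0.2, 0.7).
rng = random.Random(0)
log_weights = [math.log(w) for w in (1.0, 2.0, 7.0)]
n_draws = 20000
counts = [0, 0, 0]
for _ in range(n_draws):
    counts[gumbel_max_sample(log_weights, rng)] += 1
frequencies = [c / n_draws for c in counts]
print(frequencies)
```

In the discrete case one perturbation is drawn per outcome, which is exactly what becomes infeasible when the space is exponentially large or continuous; that is the difficulty the review says the paper addresses implicitly.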

algorithm, continuous space, dimension, (11 more...)


Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)


743c41a921516b04afde48bb48e28ce6-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-3-2025, 00:26:25 GMT

HOOF is robust to settings within this range. We could not present results for Ant and Walker due to space constraints. Thus we are restricted to zero order optimisers. For natural gradients like TNPG, HOOF does not add any new hyperparameters beyond those used by grid search - i.e. Other methods like PBT introduce more hyperparameters than these.

artificial intelligence, constraint, hyperparameter, (16 more...)


Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.32)


Response to Reviewer 1: 3

Neural Information Processing Systems | Oct-3-2025, 00:17:17 GMT

We thank all reviewers for their comments and acknowledgement of our contribution. Below we address each reviewer's comments separately. The reviewer raised a very good point. We will add this clarification in the revised version. Our gradient-based method is much more efficient but only finds a stationary point.

artificial intelligence, machine learning, reviewer, (8 more...)


Technology: