AITopics | softmax value

The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation.

artificial intelligence, machine learning, sft, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)

7ffb4e0ece07869880d51662a2234143-AuthorFeedback.pdf

Neural Information Processing Systems Feb-12-2026, 17:47:43 GMT

reviewer, softmax value, theorem, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.32)

Maximum Entropy Monte-Carlo Planning

Neural Information Processing Systems Dec-25-2025, 15:35:33 GMT

We develop a new algorithm for online planning in large scale sequential decision problems that improves upon the worst case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the effectiveness of this approach, we first investigate the single-step decision problem, stochastic softmax bandits, and show that softmax values can be estimated at an optimal convergence rate in terms of mean squared error. We then extend this approach to general sequential decision making by developing a general MCTS algorithm, Maximum Entropy for Tree Search (MENTS). We prove that the probability of MENTS failing to identify the best decision at the root decays exponentially, which fundamentally improves the polynomial convergence rate of UCT. Our experimental results also demonstrate that MENTS is more sample efficient than UCT in both synthetic problems and Atari 2600 games.
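The softmax value mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual log-sum-exp definition, tau * log(sum_a exp(Q(a) / tau)), with a temperature parameter tau. The function names, the list-based node representation, and the simplified backup loop are hypothetical and chosen for illustration only; this is not the authors' implementation, and any exploration component of the full tree policy is omitted.

```python
import math


def softmax_value(q_values, tau=1.0):
    """Softmax (log-sum-exp) value: tau * log(sum_a exp(q_a / tau)).

    The maximum is subtracted first for numerical stability.
    """
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def softmax_policy(q_values, tau=1.0):
    """Boltzmann policy induced by the softmax value: exp((q_a - V) / tau)."""
    v = softmax_value(q_values, tau)
    return [math.exp((q - v) / tau) for q in q_values]


def softmax_backup(path, leaf_value, tau=1.0):
    """Back-propagate softmax values from a simulated leaf to the root.

    path is a list of (q_values, action, reward) tuples ordered from the
    root down to the parent of the leaf (a hypothetical representation).
    Each visited Q entry becomes reward + softmax value of the child.
    """
    value = leaf_value
    for q_values, action, reward in reversed(path):
        q_values[action] = reward + value
        value = softmax_value(q_values, tau)
    return value


# Usage: a two-level path with two actions per node.
root_q = [0.0, 0.0]
child_q = [0.0, 0.0]
root_value = softmax_backup([(root_q, 0, 0.0), (child_q, 1, 1.0)], leaf_value=0.0)
```

As tau approaches zero the softmax value approaches the maximum Q-value, so the backup behaves like a hard max; larger tau weights all actions more evenly.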

electronic proceedings, maximum entropy monte-carlo planning, name change, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Maximum Entropy Monte-Carlo Planning

Chenjun Xiao, Ruitong Huang, Jincheng Mei, Dale Schuurmans, Martin Müller

Neural Information Processing Systems Oct-3-2025, 02:22:01 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.46)

Industry: Leisure & Entertainment > Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.43)

7ffb4e0ece07869880d51662a2234143-AuthorFeedback.pdf

Neural Information Processing Systems Oct-3-2025, 02:21:45 GMT

artificial intelligence, machine learning, theorem, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.38)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.32)

Maximum Entropy Monte-Carlo Planning

Neural Information Processing Systems Oct-10-2024, 09:15:38 GMT

We develop a new algorithm for online planning in large scale sequential decision problems that improves upon the worst case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the effectiveness of this approach, we first investigate the single-step decision problem, stochastic softmax bandits, and show that softmax values can be estimated at an optimal convergence rate in terms of mean squared error. We then extend this approach to general sequential decision making by developing a general MCTS algorithm, Maximum Entropy for Tree Search (MENTS). We prove that the probability of MENTS failing to identify the best decision at the root decays exponentially, which fundamentally improves the polynomial convergence rate of UCT.

convergence rate, decision problem, maximum entropy monte-carlo planning, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.91)

Detecting Rumor Veracity with Only Textual Information by Double-Channel Structure

Kim, Alex, Yoon, Sangwon

arXiv.org Artificial Intelligence Dec-5-2023

Kyle (1985) proposes two types of rumors: informed rumors, which are based on some private information, and uninformed rumors, which are not based on any information (i.e., bluffing). Also, prior studies find that when people have a credible source of information, they are likely to use a more confident textual tone when spreading rumors. Motivated by these theoretical findings, we propose a double-channel structure to determine the ex-ante veracity of rumors on social media. Our ultimate goal is to classify each rumor as true, false, or unverifiable. We first assign each text to either the certain (informed rumor) or the uncertain (uninformed rumor) category. Then, we apply a lie detection algorithm to informed rumors and a thread-reply agreement detection algorithm to uninformed rumors. Using the dataset of SemEval 2019 Task 7, which requires ex-ante threefold classification (true, false, or unverifiable) of social media rumors, our model yields a macro-F1 score of 0.4027, outperforming all the baseline models and the second-place winner (Gorrell et al., 2019). Furthermore, we empirically validate that the double-channel structure outperforms single-channel structures, which apply either the lie detection or the agreement detection algorithm to all posts.

algorithm, information, veracity, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2022.socialnlp-1.3

2312.03195

Country:

North America > United States > Massachusetts (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Self-Training with Purpose Preserving Augmentation Improves Few-shot Generative Dialogue State Tracking

Lee, Jihyun, Lee, Chaebin, Kim, Yunsu, Lee, Gary Geunbae

arXiv.org Artificial Intelligence Nov-17-2022

In dialogue state tracking (DST), labeling the dataset involves considerable human labor. We propose a new self-training framework for few-shot generative DST that utilizes unlabeled data. Our self-training method iteratively improves the model by pseudo labeling and employs Purpose Preserving augmentation (PPaug) to prevent overfitting. We increase the few-shot (10%) performance by approximately 4% on MultiWOZ 2.1 (Eric et al., 2019) and enhance the slot-recall by 8.34% for unseen values compared to baseline.

Figure 1: Dialogue example of DST dataset and its belief state. The underlined part of the dialogue is the value of the belief state and has specific information

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.09379

Genre: Research Report (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)

Confident AI

Davis, Jim

arXiv.org Artificial Intelligence Feb-11-2022

In this paper, we propose "Confident AI" as a means of designing Artificial Intelligence (AI) and Machine Learning (ML) systems with both algorithm and user confidence in model predictions and reported results. The four basic tenets of Confident AI are Repeatability, Believability, Sufficiency, and Adaptability. Each tenet is used to explore fundamental issues in current AI/ML systems, and together they provide an overall approach to Confident AI.

confident ai, prediction, softmax value, (16 more...)

arXiv.org Artificial Intelligence

2202.05957

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Automobiles & Trucks > Manufacturer (0.48)
Transportation > Passenger (0.31)
Transportation > Ground > Road (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Maximum Entropy Monte-Carlo Planning

Xiao, Chenjun, Huang, Ruitong, Mei, Jincheng, Schuurmans, Dale, Müller, Martin

Neural Information Processing Systems Mar-19-2020, 00:30:53 GMT

We develop a new algorithm for online planning in large scale sequential decision problems that improves upon the worst case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the effectiveness of this approach, we first investigate the single-step decision problem, stochastic softmax bandits, and show that softmax values can be estimated at an optimal convergence rate in terms of mean squared error. We then extend this approach to general sequential decision making by developing a general MCTS algorithm, Maximum Entropy for Tree Search (MENTS). We prove that the probability of MENTS failing to identify the best decision at the root decays exponentially, which fundamentally improves the polynomial convergence rate of UCT.

convergence rate, decision problem, maximum entropy monte-carlo planning, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.91)
