AITopics | Search

Collaborating Authors

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

News Overviews Instructional Materials AI-Alerts Classics

Evolving Interpretable Visual Classifiers with Large Language Models

Chiquier, Mia, Mall, Utkarsh, Vondrick, Carl

arXiv.org Artificial IntelligenceApr-15-2024

Multimodal pre-trained models, such as CLIP, are popular for zero-shot classification due to their open-vocabulary flexibility and high performance. However, vision-language models, which compute similarity scores between images and class labels, are largely black-box, with limited interpretability, risk for bias, and inability to discover new visual concepts not written down. Moreover, in practical settings, the vocabulary for class names and attributes of specialized concepts will not be known, preventing these methods from performing well on images uncommon in large-scale vision-language datasets. To address these limitations, we present a novel method that discovers interpretable yet discriminative sets of attributes for visual recognition. We introduce an evolutionary search algorithm that uses a large language model and its in-context learning abilities to iteratively mutate a concept bottleneck of attributes for classification. Our method produces state-of-the-art, interpretable fine-grained classifiers. We outperform the latest baselines by 18.4% on five fine-grained iNaturalist datasets and by 22.2% on two KikiBouba datasets, despite the baselines having access to privileged information about class names.

classification, dataset, evolving interpretable visual classifier, (11 more...)

arXiv.org Artificial Intelligence

2404.09941

Country:

North America > United States > Virginia (0.04)
North America > Canada (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Transportation > Air (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information

Chae, Kyubyung, Choi, Jaepill, Jo, Yohan, Kim, Taesup

arXiv.org Artificial IntelligenceApr-15-2024

A primary challenge in abstractive summarization is hallucination -- the phenomenon where a model generates plausible text that is absent in the source text. We hypothesize that the domain (or topic) of the source text triggers the model to generate text that is highly probable in the domain, neglecting the details of the source text. To alleviate this model bias, we introduce a decoding strategy based on domain-conditional pointwise mutual information. This strategy adjusts the generation probability of each token by comparing it with the token's marginal probability within the domain of the source text. According to evaluation on the XSUM dataset, our method demonstrates improvement in terms of faithfulness and source relevance. The code is publicly available at \url{https://github.com/qqplot/dcpmi}.

hallucination, source text, summarization, (16 more...)

arXiv.org Artificial Intelligence

2404.0948

Country:

Asia > Middle East > Saudi Arabia (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Canada > Ontario > Toronto (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Add feedback

TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models

Sun, Haojun, Tang, Chen, Wang, Zhi, Meng, Yuan, jiang, Jingyan, Ma, Xinzhu, Zhu, Wenwu

arXiv.org Artificial IntelligenceApr-15-2024

Diffusion models have emerged as preeminent contenders in the realm of generative models. Distinguished by their distinctive sequential generative processes, characterized by hundreds or even thousands of timesteps, diffusion models progressively reconstruct images from pure Gaussian noise, with each timestep necessitating full inference of the entire model. However, the substantial computational demands inherent to these models present challenges for deployment, quantization is thus widely used to lower the bit-width for reducing the storage and computing overheads. Current quantization methodologies primarily focus on model-side optimization, disregarding the temporal dimension, such as the length of the timestep sequence, thereby allowing redundant timesteps to continue consuming computational resources, leaving substantial scope for accelerating the generative process. In this paper, we introduce TMPQ-DM, which jointly optimizes timestep reduction and quantization to achieve a superior performance-efficiency trade-off, addressing both temporal and model optimization aspects. For timestep reduction, we devise a non-uniform grouping scheme tailored to the non-uniform nature of the denoising process, thereby mitigating the explosive combinations of timesteps. In terms of quantization, we adopt a fine-grained layer-wise approach to allocate varying bit-widths to different layers based on their respective contributions to the final generative performance, thus rectifying performance degradation observed in prior studies. To expedite the evaluation of fine-grained quantization, we further devise a super-network to serve as a precision solver by leveraging shared quantization results. These two design components are seamlessly integrated within our framework, enabling rapid joint exploration of the exponentially large decision space via a gradient-free evolutionary search algorithm.

diffusion model, quantization, timestep, (15 more...)

arXiv.org Artificial Intelligence

2404.09532

Country:

Asia > Indonesia > Bali (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Add feedback

Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms

Cazenave, Tristan

arXiv.org Artificial IntelligenceApr-14-2024

Monte Carlo Tree Search and Monte Carlo Search have good results for many combinatorial problems. In this paper we propose to use Monte Carlo Search to design mathematical expressions that are used as exploration terms for Monte Carlo Tree Search algorithms. The optimized Monte Carlo Tree Search algorithms are PUCT and SHUSS. We automatically design the PUCT and the SHUSS root exploration terms. For small search budgets of 32 evaluations the discovered root exploration terms make both algorithms competitive with usual PUCT.

algorithm, exploration term, expression, (13 more...)

arXiv.org Artificial Intelligence

2404.09304

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Go (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Integrating Hyperparameter Search into Model-Free AutoML with Context-Free Grammars

Vázquez, Hernán Ceferino, Sanchez, Jorge, Carrascosa, Rafael

arXiv.org Artificial IntelligenceApr-13-2024

Automated Machine Learning (AutoML) has become increasingly popular in recent years due to its ability to reduce the amount of time and expertise required to design and develop machine learning systems. This is very important for the practice of machine learning, as it allows building strong baselines quickly, improving the efficiency of the data scientists, and reducing the time to production. However, despite the advantages of AutoML, it faces several challenges, such as defining the solutions space and exploring it efficiently. Recently, some approaches have been shown to be able to do it using tree-based search algorithms and context-free grammars. In particular, GramML presents a model-free reinforcement learning approach that leverages pipeline configuration grammars and operates using Monte Carlo tree search. However, one of the limitations of GramML is that it uses default hyperparameters, limiting the search problem to finding optimal pipeline structures for the available data preprocessors and models. In this work, we propose an extension to GramML that supports larger search spaces including hyperparameter search. We evaluated the approach using an OpenML benchmark and found significant improvements compared to other state-of-the-art techniques.

algorithm, configuration, grammar, (13 more...)

arXiv.org Artificial Intelligence

2404.03419

Country:

South America > Colombia (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Hierarchical Experience-informed Navigation for Multi-modal Quadrupedal Rebar Grid Traversal

Asselmeier, Max, Ivanova, Jane, Zhou, Ziyi, Vela, Patricio A., Zhao, Ye

arXiv.org Artificial IntelligenceApr-13-2024

This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.

international conference, sequence, transition, (14 more...)

arXiv.org Artificial Intelligence

2311.08354

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.47)

Add feedback

Box Facets and Cut Facets of Lifted Multicut Polytopes

Naumann, Lucas Fabian, Irmai, Jannik, Zhao, Shengxian, Andres, Bjoern

arXiv.org Artificial IntelligenceApr-12-2024

The lifted multicut problem is a combinatorial optimization problem whose feasible solutions relate one-to-one to the decompositions of a graph $G = (V, E)$. Given an augmentation $\widehat{G} = (V, E \cup F)$ of $G$ and given costs $c \in \mathbb{R}^{E \cup F}$, the objective is to minimize the sum of those $c_{uw}$ with $uw \in E \cup F$ for which $u$ and $w$ are in distinct components. For $F = \emptyset$, the problem specializes to the multicut problem, and for $E = \tbinom{V}{2}$ to the clique partitioning problem. We study a binary linear program formulation of the lifted multicut problem. More specifically, we contribute to the analysis of the associated lifted multicut polytopes: Firstly, we establish a necessary, sufficient and efficiently decidable condition for a lower box inequality to define a facet. Secondly, we show that deciding whether a cut inequality of the binary linear program defines a facet is NP-hard.

construction, inequality, node, (16 more...)

arXiv.org Artificial Intelligence

2402.16814

Country: Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)

Add feedback

Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy

Hagrass, Omar, Sriperumbudur, Bharath, Balasubramanian, Krishnakumar

arXiv.org Machine LearningApr-12-2024

We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD). The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions, accommodating diverse data structures beyond Euclidean spaces, and relying only on partial knowledge of the reference distribution, while maintaining computational efficiency. We establish a general framework and an operator-theoretic representation of the KSD, encompassing many existing KSD tests in the literature, which vary depending on the domain. We reveal the characteristics and limitations of KSD and demonstrate its non-optimality under a certain alternative space, defined over general domains when considering $\chi^2$-divergence as the separation metric. To address this issue of non-optimality, we propose a modified, minimax optimal test by incorporating a spectral regularizer, thereby overcoming the shortcomings of standard KSD tests. Our results are established under a weak moment condition on the Stein kernel, which relaxes the bounded kernel assumption required by prior work in the analysis of kernel-based hypothesis testing. Additionally, we introduce an adaptive test capable of achieving minimax optimality up to a logarithmic factor by adapting to unknown parameters. Through numerical experiments, we illustrate the superior performance of our proposed tests across various domains compared to their unregularized counterparts.

goodness-of-fit test, ksd test, nulla 1 2, (14 more...)

arXiv.org Machine Learning

2404.08278

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
North America > United States > Pennsylvania (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

RoT: Enhancing Large Language Models with Reflection on Search Trees

Hui, Wenyang, Jiang, Chengyue, Wang, Yan, Tu, Kewei

arXiv.org Artificial IntelligenceApr-11-2024

Large language models (LLMs) have demonstrated impressive capability in reasoning and planning when integrated with tree-search-based prompting methods. However, since these methods ignore the previous search experiences, they often make the same mistakes in the search process. To address this issue, we introduce Reflection on search Trees (RoT), an LLM reflection framework designed to improve the performance of tree-search-based prompting methods. It uses a strong LLM to summarize guidelines from previous tree search experiences to enhance the ability of a weak LLM. The guidelines are instructions about solving this task through tree search which can prevent the weak LLMs from making similar mistakes in the past search process. In addition, we proposed a novel state selection method, which identifies the critical information from historical search processes to help RoT generate more specific and meaningful guidelines. In our extensive experiments, we find that RoT significantly improves the performance of LLMs in reasoning or planning tasks with various tree-search-based prompting methods (e.g., BFS and MCTS). Non-tree-search-based prompting methods such as Chain-of-Thought (CoT) can also benefit from RoT guidelines since RoT can provide task-specific knowledge collected from the search experience.

guideline, llm, mct, (15 more...)

arXiv.org Artificial Intelligence

2404.05449

Country:

Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Monte Carlo Tree Search with Boltzmann Exploration

Painter, Michael, Baioumy, Mohamed, Hawes, Nick, Lacerda, Bruno

arXiv.org Artificial IntelligenceApr-11-2024

Monte-Carlo Tree Search (MCTS) methods, such as Upper Confidence Bound applied to Trees (UCT), are instrumental to automated planning techniques. However, UCT can be slow to explore an optimal action when it initially appears inferior to other actions. Maximum ENtropy Tree-Search (MENTS) incorporates the maximum entropy principle into an MCTS approach, utilising Boltzmann policies to sample actions, naturally encouraging more exploration. In this paper, we highlight a major limitation of MENTS: optimal actions for the maximum entropy objective do not necessarily correspond to optimal actions for the original objective. We introduce two algorithms, Boltzmann Tree Search (BTS) and Decaying ENtropy Tree-Search (DENTS), that address these limitations and preserve the benefits of Boltzmann policies, such as allowing actions to be sampled faster by using the Alias method. Our empirical analysis shows that our algorithms show consistent high performance across several benchmark domains, including the game of Go.

algorithm, dent, ment, (15 more...)

arXiv.org Artificial Intelligence

2404.07732

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Genre: Research Report (0.63)

Industry: Leisure & Entertainment > Games > Go (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback