input number
Seemingly Simple Planning Problems are Computationally Challenging: The Countdown Game
Katz, Michael, Kokel, Harsha, Sreedharan, Sarath
There is broad consensus that the inability to form long-term plans is one of the key limitations of current foundation models and agents. However, existing planning benchmarks remain woefully inadequate for truly measuring their planning capabilities. Most either focus on loosely defined tasks like travel planning or reuse domains and problems from the international planning competitions. The former tasks are hard to formalize and verify, while the latter were specifically designed to probe the weaknesses of existing automated planners. To address these shortcomings, we propose a procedure for creating a planning benchmark centered on the game Countdown, in which a player must form a target number from a list of input numbers through arithmetic operations. We discuss how this problem meets many of the desiderata for an ideal benchmark of planning capability: the domain admits an intuitive natural-language description for each problem instance, it is computationally challenging (NP-complete), and the instance space is rich enough that memorization is not a concern. We perform an extensive theoretical analysis, establishing the computational complexity result and demonstrating the advantage of our instance-generation procedure over public benchmarks. We evaluate a variety of existing LLM-assisted planning methods on instances generated with our procedure. Our results show that, unlike domains such as the 24 Game (a special case of Countdown), our proposed dynamic benchmark remains extremely challenging for existing LLM-based approaches.
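To make the task concrete, here is a minimal brute-force Countdown solver in Python. It is an illustrative sketch of the game itself, not the paper's instance-generation procedure, and it simplifies the usual rules by allowing absolute differences and only exact division; not every input number has to be used.

```python
from itertools import combinations

def countdown(numbers, target):
    """Brute-force search for a sequence of arithmetic steps that reaches `target`.

    Each step combines two remaining numbers with +, *, absolute difference,
    or exact division, replaces them with the result, and recurses.
    """
    def search(nums, trace):
        if target in nums:
            return trace
        if len(nums) < 2:
            return None
        for (i, a), (j, b) in combinations(enumerate(nums), 2):
            rest = [n for k, n in enumerate(nums) if k not in (i, j)]
            steps = [(a + b, f"{a}+{b}"), (a * b, f"{a}*{b}"),
                     (abs(a - b), f"|{a}-{b}|")]
            if b != 0 and a % b == 0:
                steps.append((a // b, f"{a}/{b}"))
            if a != 0 and b % a == 0:
                steps.append((b // a, f"{b}/{a}"))
            for value, step in steps:
                found = search(rest + [value], trace + [f"{step}={value}"])
                if found is not None:
                    return found
        return None

    return search(list(numbers), [])

# A 24-Game-style instance (the 24 Game is a special case of Countdown).
print(countdown([4, 7, 8, 8], 24))   # prints one sequence of steps that reaches 24
```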
A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems
Liu, Wei, Liu, Xin, Ng, Michael K., Zhang, Zaikun
Semi-supervised clustering is a basic problem in various applications. Most existing methods require knowledge of the ideal cluster number, which is often difficult to obtain in practice. Besides, satisfying the must-link constraints is another major challenge for these methods. In this work, we view the semi-supervised clustering task as a partitioning problem on a graph associated with the given dataset, where the similarity matrix includes a scaling parameter to reflect the must-link constraints. Utilizing a relaxation technique, we formulate the graph partitioning problem as a continuous optimization model that does not require the exact cluster number, but only an overestimate of it. We then propose a block coordinate descent algorithm to solve this model efficiently, and establish its convergence. Based on the obtained solution, we can construct clusters that theoretically satisfy the must-link constraints under mild assumptions. Furthermore, we verify the effectiveness and efficiency of the proposed method through comprehensive numerical experiments.
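As a rough illustration of the graph-based view only (not the authors' actual model or block coordinate descent solver), the sketch below builds a similarity graph, inflates must-link edges with a hypothetical scaling parameter, and rounds a spectral relaxation computed with an overestimated cluster count; all names and hyperparameters are placeholders.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def semi_supervised_spectral(X, must_links, k_max, sigma=1.0, boost=10.0):
    """Toy spectral-relaxation sketch of the graph-partitioning view.

    `boost` is a hypothetical scaling parameter that inflates similarity on
    must-link pairs, and `k_max` is an overestimate of the cluster number;
    the paper's actual model and solver are not reproduced here.
    """
    # Gaussian similarity graph over the data points.
    W = np.exp(-cdist(X, X) ** 2 / (2 * sigma ** 2))
    for i, j in must_links:                      # strengthen must-link edges
        W[i, j] = W[j, i] = boost * W[i, j]
    L = np.diag(W.sum(axis=1)) - W               # unnormalized graph Laplacian
    # Continuous relaxation: embed points with the k_max smallest eigenvectors.
    _, vecs = np.linalg.eigh(L)
    embedding = vecs[:, :k_max]
    # Round the relaxed solution back to discrete clusters.
    return KMeans(n_clusters=k_max, n_init=10).fit_predict(embedding)
```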
Stream of Search (SoS): Learning to Search in Language
Gandhi, Kanishk, Lee, Denise, Grand, Gabriel, Liu, Muxin, Cheng, Winson, Sharma, Archit, Goodman, Noah D.
Language models are rarely shown fruitful mistakes while training. They then struggle to look beyond the next token, suffering from a snowballing of errors and struggling to predict the consequence of their actions several steps ahead. In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string -- a stream of search (SoS). We propose a unified language for search that captures an array of different symbolic search strategies. We demonstrate our approach using the simple yet difficult game of Countdown, where the goal is to combine input numbers with arithmetic operations to reach a target number. We pretrain a transformer-based language model from scratch on a dataset of streams of search generated by heuristic solvers. We find that SoS pretraining increases search accuracy by 25% over models trained to predict only the optimal search trajectory. We further finetune this model with two policy improvement methods: Advantage-Induced Policy Alignment (APA) and Self-Taught Reasoner (STaR). The finetuned SoS models solve 36% of previously unsolved problems, including problems that cannot be solved by any of the heuristic solvers. Our results indicate that language models can learn to solve problems via search, self-improve to flexibly use different search strategies, and potentially discover new ones.
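The sketch below gives one way to picture "search as a flattened string" for Countdown: it serializes a depth-first search, including its dead ends and backtracking, into a single line of text. The tokens and layout are invented for illustration and are not the paper's SoS vocabulary or its heuristic solvers.

```python
def stream_of_search(numbers, target):
    """Serialize a depth-first Countdown search as one flat string."""
    lines = [f"Goal: {target} from {numbers}"]

    def dfs(nums):
        lines.append(f"Explore {nums}")
        if target in nums:
            lines.append("Found target")
            return True
        if len(nums) < 2:
            lines.append("Dead end, backtrack")
            return False
        for i in range(len(nums)):
            for j in range(len(nums)):
                if i == j:
                    continue
                a, b = nums[i], nums[j]
                rest = [n for k, n in enumerate(nums) if k not in (i, j)]
                ops = [(a + b, "+"), (a - b, "-"), (a * b, "*")]
                if b != 0 and a % b == 0:
                    ops.append((a // b, "/"))
                for value, op in ops:
                    lines.append(f"Try {a}{op}{b}={value}")
                    if dfs(rest + [value]):
                        return True
        lines.append("Dead end, backtrack")
        return False

    dfs(list(numbers))
    return " | ".join(lines)   # the flattened training string

print(stream_of_search([3, 5, 2], 16))   # includes dead ends, backtracking, and the solution
```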
Reverse That Number! Decoding Order Matters in Arithmetic Learning
Zhang-Li, Daniel, Lin, Nianyi, Yu, Jifan, Zhang, Zheyuan, Yao, Zijun, Zhang, Xiaokang, Hou, Lei, Zhang, Jing, Li, Juanzi
Recent advancements in pretraining have demonstrated that modern Large Language Models (LLMs) can effectively learn arithmetic operations. However, despite acknowledging the significance of digit order in arithmetic computation, current methodologies predominantly rely on sequential, step-by-step approaches for teaching LLMs arithmetic, leading to the conclusion that better performance requires ever finer-grained step-by-step procedures. Diverging from this conventional path, our work introduces a novel strategy that not only reevaluates digit order by prioritizing output from the least significant digit but also incorporates a step-by-step methodology to substantially reduce complexity. We have developed and applied this method in a comprehensive set of experiments. Compared to the previous state-of-the-art (SOTA) method, our findings reveal an overall improvement in accuracy while requiring only a third of the tokens typically used during training. To facilitate replication and further research, we have made our code and dataset publicly available at https://anonymous.4open.science/r/RAIT-9FB7/.
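A minimal sketch of the least-significant-digit-first idea, assuming a simple addition task: the target string is reversed so the model emits digits in the same order column-wise addition produces them (units first, with carries propagating leftward). The prompt format and field names are made up for illustration, not the paper's training format.

```python
def lsd_first_addition_example(a: int, b: int) -> dict:
    """Format one addition example with the answer written least-significant
    digit first; field names and prompt format are illustrative only."""
    answer = a + b
    return {
        "prompt": f"{a} + {b} =",
        # Reversed target: '936' becomes '639', matching the order in which
        # column-wise addition actually computes the digits.
        "target_reversed": str(answer)[::-1],
        "target_standard": str(answer),
    }

print(lsd_first_addition_example(347, 589))
# {'prompt': '347 + 589 =', 'target_reversed': '639', 'target_standard': '936'}
```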
Language Models Understand Numbers, at Least Partially
Zhu, Fangwei, Dai, Damai, Sui, Zhifang
Large language models (LLMs) have exhibited impressive competence in various tasks, but their opaque internal mechanisms hinder their use in mathematical problems. In this paper, we study a fundamental question: whether language models understand numbers, a basic element in math. Based on an assumption that LLMs should be capable of compressing numbers in their hidden states to solve mathematical problems, we construct a synthetic dataset comprising addition problems and utilize linear probes to read out input numbers from the hidden states. Experimental results support the existence of compressed numbers in LLMs. However, it is difficult to precisely reconstruct the original numbers, indicating that the compression process may not be lossless. Further experiments show that LLMs can utilize encoded numbers to perform arithmetic computations, and the computational ability scales up with the model size. Our preliminary research suggests that LLMs exhibit a partial understanding of numbers, offering insights for future investigations about the models' mathematical capability.
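A generic linear-probing recipe along the lines described above (not the paper's exact setup): hidden states would be collected from some layer of an LLM while it reads addition problems, and a ridge-regression probe is fit to recover the operand. The synthetic activations below only demonstrate the call pattern.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def fit_number_probe(hidden_states, input_numbers):
    """Fit a linear probe that tries to read an input number back out of
    hidden states; high held-out R^2 suggests the number is approximately
    linearly decodable, i.e. "compressed", in those states."""
    X_train, X_test, y_train, y_test = train_test_split(
        hidden_states, input_numbers, test_size=0.2, random_state=0)
    probe = Ridge(alpha=1.0).fit(X_train, y_train)
    return probe, probe.score(X_test, y_test)

# Synthetic stand-in for real LLM activations, just to show the call pattern.
rng = np.random.default_rng(0)
numbers = rng.integers(0, 1000, size=500)
states = np.outer(numbers, rng.normal(size=64)) + rng.normal(size=(500, 64))
probe, r2 = fit_number_probe(states, numbers)
print(f"held-out R^2 of the probe: {r2:.3f}")
```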
Understanding uncertainty and the value of visualisation in AI
Maths PhD student Alex Terenin recently presented his group's work at the 2021 International Conference on Artificial Intelligence and Statistics. AISTATS is a prestigious event that brings together researchers from the machine learning and statistics communities. One of the group's papers, Matérn Gaussian Processes on Graphs, won the Best Student Paper award at the event – congratulations! We caught up with Alex to find out more about his experience at the conference and as a PhD student at Imperial, why he's fascinated by research into uncertainty, and to get his thoughts on why the visual aspect of machine learning is vital. My research focuses on artificial intelligence, particularly on learning-based decision-making systems.
Generating Pusheen with AI
The BEGAN model uses a loss based on the Wasserstein distance, except it applies it to the autoencoder reconstruction losses on real and fake images rather than to the images themselves. In practice it drops the absolute value and minimizes the reconstruction loss on real images minus the reconstruction loss on fake images. Additionally, it introduces a weighting term on the fake reconstruction loss that changes in proportion to the difference between the fake and real reconstruction losses; this serves to maintain a balance between the discriminator and generator so that one does not easily win over the other. I'll be releasing the code soon, but to summarize, certain architectures and hyperparameters typically worked well for the discriminator and generator. I did a fair amount of hyperparameter searching to get a model that worked (keep scrolling for some failures), but I think it ended up being a pretty decent cat generator. Below is a training video of one of the models that I use in the demos, where every 250 steps I take 16 samples from the generator.
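For reference, here is a minimal sketch of the BEGAN objective and balance update described above, written as a plain Python function; gamma and lambda_k follow the BEGAN paper's conventions, and the values here are placeholders rather than the settings used for the cat generator.

```python
def began_losses(recon_real, recon_fake, k, gamma=0.5, lambda_k=0.001):
    """One step of the BEGAN objective and balance update.

    `recon_real` / `recon_fake` are the discriminator's (autoencoder's)
    reconstruction losses on a real batch and a generated batch.
    """
    d_loss = recon_real - k * recon_fake   # discriminator: real loss minus weighted fake loss
    g_loss = recon_fake                    # generator: minimize fake reconstruction loss
    # Update the weighting term k in proportion to how far the balance
    # gamma * L(real) - L(fake) has drifted, then clamp it to [0, 1].
    balance = gamma * recon_real - recon_fake
    k = min(max(k + lambda_k * balance, 0.0), 1.0)
    # Convergence measure commonly tracked during BEGAN training.
    convergence = recon_real + abs(balance)
    return d_loss, g_loss, k, convergence
```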