

Common issue: the importance of category prediction

Neural Information Processing Systems

We thank all reviewers for their constructive comments. We added an ablation study on the SCAN length split to demonstrate the importance of category prediction. For example, the test set contains a new pattern, "jump around right thrice", that does not appear in the training set. The recursion and sequence manipulation supported by NeSS are critical for learning the parsing rules needed to generalize to it. NeSS achieves 100% accuracy in 2 runs and 62.5% in 3 runs. When the model predicts the alternative translation, the exact-match accuracy becomes lower.
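The "jump around right thrice" pattern above can be made concrete with a minimal interpreter for a simplified subset of SCAN semantics (this toy grammar is my own illustration, not the NeSS model): "around right" repeats a turn-right/verb pair four times, and "thrice" repeats the enclosed phrase three times.

```python
# Toy interpreter for a simplified subset of SCAN-style commands.
# Assumed grammar (illustrative only): "<verb> [around right] [twice|thrice]".
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def interpret(command: str) -> list[str]:
    tokens = command.split()
    verb = PRIMITIVES[tokens[0]]
    if tokens[1:3] == ["around", "right"]:
        # "around right" = four (turn right, act) pairs, one per quarter turn
        actions, i = ["RTURN", verb] * 4, 3
    else:
        actions, i = [verb], 1
    if i < len(tokens):
        # repetition suffix applies to the whole phrase built so far
        actions = actions * {"twice": 2, "thrice": 3}[tokens[i]]
    return actions
```

Under this grammar, "jump around right thrice" expands to twelve (RTURN, JUMP) pairs, which is exactly the kind of nested repetition that makes the length split hard for models without recursion.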


Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

Zhou, Denny, Schärli, Nathanael, Hou, Le, Wei, Jason, Scales, Nathan, Wang, Xuezhi, Schuurmans, Dale, Cui, Claire, Bousquet, Olivier, Le, Quoc, Chi, Ed

arXiv.org Artificial Intelligence

Chain-of-thought prompting has demonstrated remarkable performance on various natural language reasoning tasks. However, it tends to perform poorly on tasks that require solving problems harder than the exemplars shown in the prompts. To overcome this challenge of easy-to-hard generalization, we propose a novel prompting strategy, least-to-most prompting. The key idea in this strategy is to break down a complex problem into a series of simpler subproblems and then solve them in sequence. Solving each subproblem is facilitated by the answers to previously solved subproblems. Our experimental results on tasks related to symbolic manipulation, compositional generalization, and math reasoning reveal that least-to-most prompting is capable of generalizing to more difficult problems than those seen in the prompts. A notable finding is that when the GPT-3 code-davinci-002 model is used with least-to-most prompting, it can solve the compositional generalization benchmark SCAN in any split (including length split) with an accuracy of at least 99% using just 14 exemplars, compared to only 16% accuracy with chain-of-thought prompting. This is particularly noteworthy because neural-symbolic models in the literature that specialize in solving SCAN are trained on the entire training set containing over 15,000 examples. We have included prompts for all the tasks in the Appendix.
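The decompose-then-solve loop described above can be sketched as plain control flow. In this sketch, `decompose` and `solve` stand in for LLM calls (they are hypothetical stubs, not an API from the paper), instantiated here on the last-letter-concatenation task so the loop runs end to end:

```python
def least_to_most(problem, decompose, solve):
    """Break a problem into subproblems, then solve them in order,
    feeding each answer into the context available to the next step."""
    context = []
    answer = ""
    for sub in decompose(problem):
        answer = solve(sub, context)   # sees all earlier (question, answer) pairs
        context.append((sub, answer))
    return answer

# Toy instantiation: concatenate the last letter of each word,
# growing the prefix one word at a time (stubs for real LLM calls).
def decompose(problem):
    words = problem.split()
    return [" ".join(words[:i + 1]) for i in range(len(words))]

def solve(sub, context):
    prev = context[-1][1] if context else ""
    return prev + sub.split()[-1][-1]
```

The point of the structure is that each subproblem is strictly easier than the full problem, and the accumulated answers let the final step stay short regardless of input length.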


Induced Natural Language Rationales and Interleaved Markup Tokens Enable Extrapolation in Large Language Models

Bueno, Mirelle, Gemmell, Carlos, Dalton, Jeffrey, Lotufo, Roberto, Nogueira, Rodrigo

arXiv.org Artificial Intelligence

The ability to extrapolate, i.e., to make predictions on sequences that are longer than those presented as training examples, is a challenging problem for current deep learning models. Recent work shows that this limitation persists in state-of-the-art Transformer-based models. Most solutions to this problem use specific architectures or training methods that do not generalize to other tasks. We demonstrate that large language models can succeed in extrapolation without modifying their architecture or training procedure. Our experimental results show that generating step-by-step rationales and introducing marker tokens are both required for effective extrapolation. First, we induce a language model to produce step-by-step rationales before outputting the answer to effectively communicate the task to the model. However, as sequences become longer, we find that current models struggle to keep track of token positions. To address this issue, we interleave output tokens with markup tokens that act as explicit positional and counting symbols. Our findings show how these two complementary approaches enable remarkable sequence extrapolation and highlight a limitation of current architectures to effectively generalize without explicit surface form guidance. Code available at https://github.com/MirelleB/induced-rationales-markup-tokens
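The interleaving idea above is mechanically simple: pair each output token with an explicit positional marker so the model does not have to track positions implicitly. The `<k>` marker format below is an assumption for illustration, not necessarily the paper's exact scheme:

```python
def interleave_markup(tokens):
    """Interleave output tokens with explicit positional markup tokens,
    e.g. ["7", "3", "9"] -> "<1> 7 <2> 3 <3> 9" (marker format assumed)."""
    return " ".join(f"<{i}> {tok}" for i, tok in enumerate(tokens, 1))
```

On long sequences, these surface-form counters give the model something concrete to copy and increment, rather than an implicit position it must infer.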


Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks

Jiang, Yichen, Bansal, Mohit

arXiv.org Artificial Intelligence

Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. However, existing neural models have been shown to lack this basic ability in learning symbolic structures. Motivated by the failure of a Transformer model on the SCAN compositionality challenge (Lake and Baroni, 2018), which requires parsing a command into actions, we propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics, as additional training supervision. These automatically-generated sequences are more representative of the underlying compositional symbolic structures of the input data. During inference, the model jointly predicts the next action and the next tokens in the auxiliary sequences at each step. Experiments on the SCAN dataset show that our method encourages the Transformer to understand compositional structures of the command, improving its accuracy on multiple challenging splits from <= 10% to 100%. With only 418 (5%) training instances, our approach still achieves 97.8% accuracy on the MCD1 split. Therefore, we argue that compositionality can be induced in Transformers given minimal but proper guidance. We also show that a better result is achieved using less contextualized vectors as the attention's query, providing insights into architecture choices in achieving systematic compositionality. Finally, we show positive generalization results on the groundedSCAN task (Ruis et al., 2020). Our code is publicly available at: https://github.com/jiangycTarheel/compositional-auxseq
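One way to picture the auxiliary supervision described above: alongside each predicted action, emit a token tracking how much of the current function's work remains. The countdown scheme below is a deliberate simplification for illustration, not the paper's exact auxiliary sequences:

```python
def with_auxiliary(verb_action, reps):
    """Emit an action sequence plus an auxiliary sequence that counts down
    remaining repetitions (illustrative stand-in for the paper's
    automatically generated auxiliary supervision)."""
    actions, auxiliary = [], []
    for remaining in range(reps, 0, -1):
        actions.append(verb_action)
        auxiliary.append(str(remaining))
    return actions, auxiliary
```

Predicting the countdown jointly with the actions forces the model to represent the repetition structure explicitly, which is the intuition behind the auxiliary tasks.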


Evangelion's Final Finale Does What Its Other Endings Couldn't

Slate

For being one of the most iconic and influential anime series of all time, Neon Genesis Evangelion is also one of the most confusing; as of the franchise's most recent film, released on Amazon Prime earlier this month, the series has officially ended four times. But the new (and truly final) movie, Evangelion: 3.0+1.0 Thrice Upon a Time, delivers a real capstone to the series, as well as a new argument for how to watch the series as a whole. In case you're totally unfamiliar with the series, the gist is as such: Three teenagers, Shinji Ikari (Megumi Ogata), Asuka Langley Shikinami (Yūko Miyamura), and Rei Ayanami (Megumi Hayashibara), serve as the pilots of giant robots known as Evangelions. Though their initial function was to fight against mysterious beings known as Angels, they now serve as pawns in the conflict between the organization NERV, led by Shinji's father Gendo (Fumihiko Tachiki), who seeks to cause a mass extinction in order to reunite with his late wife, and WILLE, a group of former NERV employees who are now NERV's only opponents. The Rebuild of Evangelion tetralogy, of which Thrice Upon a Time is the last, serves as a sort of re-telling of the events of the original TV series.


Discovering the Compositional Structure of Vector Representations with Role Learning Networks

Soulos, Paul, McCoy, Tom, Linzen, Tal, Smolensky, Paul

arXiv.org Machine Learning

Neural networks (NNs) are able to perform tasks that rely on compositional structure even though they lack obvious mechanisms for representing this structure. To analyze the internal representations that enable such success, we propose ROLE, a technique that detects whether these representations implicitly encode symbolic structure. ROLE learns to approximate the representations of a target encoder E by learning a symbolic constituent structure and an embedding of that structure into E's representational vector space. The constituents of the approximating symbol structure are defined by structural positions --- roles --- that can be filled by symbols. We show that when E is constructed to explicitly embed a particular type of structure (string or tree), ROLE successfully extracts the ground-truth roles defining that structure. We then analyze a GRU seq2seq network trained to perform a more complex compositional task (SCAN), where there is no ground truth role scheme available. For this model, ROLE successfully discovers an interpretable symbolic structure that the model implicitly uses to perform the SCAN task, providing a comprehensive account of the representations that drive the behavior of a frequently-used but hard-to-interpret type of model. We verify the causal importance of the discovered symbolic structure by showing that, when we systematically manipulate hidden embeddings based on this symbolic structure, the model's resulting output is changed in the way predicted by our analysis. Finally, we use ROLE to explore whether popular sentence embedding models are capturing compositional structure and find evidence that they are not; we conclude by discussing how insights from ROLE can be used to impart new inductive biases to improve the compositional abilities of such models.
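The symbolic scheme ROLE searches for is a tensor-product representation: an encoding approximated as the sum of filler vectors bound to role vectors. The toy one-hot vectors and outer-product binding below are illustrative assumptions, not ROLE's learned embeddings:

```python
def outer(f, r):
    # Outer product binds a filler vector to a role vector.
    return [[fi * rj for rj in r] for fi in f]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Toy embeddings (assumed): one-hot fillers for symbols, one-hot roles
# for structural positions.
FILLERS = {"jump": [1.0, 0.0], "left": [0.0, 1.0]}
ROLES = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

def encode(symbols):
    """Superpose filler-role bindings: encoding = sum_k filler(s_k) (x) role(k)."""
    total = outer(FILLERS[symbols[0]], ROLES[0])
    for k, s in enumerate(symbols[1:], 1):
        total = mat_add(total, outer(FILLERS[s], ROLES[k]))
    return total
```

Because the bindings are superposed linearly, systematically swapping a filler changes the encoding in a predictable direction, which is what makes the causal manipulation experiments described above possible.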


Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks

Lake, Brenden M., Baroni, Marco

arXiv.org Artificial Intelligence

Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.
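The "dax" test above can be sketched directly: once a symbolic system learns a new primitive's meaning, known modifiers compose with it immediately. The tiny lexicon and grammar below are my own illustration of the property SCAN probes, not the benchmark's full grammar:

```python
# Illustrative lexicon and modifiers; SCAN's actual grammar is richer.
LEXICON = {"jump": "JUMP", "sing": "SING"}
MODIFIERS = {"twice": 2, "thrice": 3}

def translate(command):
    """Map "<verb> [twice|thrice]" to its action sequence."""
    words = command.split()
    actions = [LEXICON[words[0]]]
    if len(words) > 1:
        actions = actions * MODIFIERS[words[1]]
    return actions

# Systematic compositionality: a newly learned verb composes with known
# modifiers at once, with no further training.
LEXICON["dax"] = "DAX"
```

For a symbolic system this transfer is free; the paper's finding is that seq2seq RNNs trained on SCAN fail exactly this kind of recombination.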