inductive generalization
Inductive Gradient Adjustment For Spectral Bias In Implicit Neural Representations
Shi, Kexuan, Chen, Hai, Zhang, Leheng, Gu, Shuhang
Implicit Neural Representations (INRs), as a versatile representation paradigm, have achieved success in various computer vision tasks. Due to the spectral bias of the vanilla multi-layer perceptrons (MLPs), existing methods focus on designing MLPs with sophisticated architectures or repurposing training techniques for highly accurate INRs. In this paper, we delve into the linear dynamics model of MLPs and theoretically identify the empirical Neural Tangent Kernel (eNTK) matrix as a reliable link between spectral bias and training dynamics. Based on eNTK matrix, we propose a practical inductive gradient adjustment method, which could purposefully improve the spectral bias via inductive generalization of eNTK-based gradient transformation matrix. We evaluate our method on different INRs tasks with various INR architectures and compare to existing training techniques. The superior representation performance clearly validates the advantage of our proposed method. Armed with our gradient adjustment method, better INRs with more enhanced texture details and sharpened edges can be learned from data by tailored improvements on spectral bias.
Bayesian Models of Inductive Generalization
We argue that human inductive generalization is best explained in a Bayesian framework, rather than by traditional models based on simi- larity computations. We go beyond previous work on Bayesian concept learning by introducing an unsupervised method for constructing flex- ible hypothesis spaces, and we propose a version of the Bayesian Oc- cam's razor that trades off priors and likelihoods to prevent under- or over-generalization in these flexible spaces. We analyze two published data sets on inductive reasoning as well as the results of a new behavioral study that we have carried out.
On Semantic Cognition, Inductive Generalization, and Language Models
My doctoral research focuses on understanding semantic knowledge in neural network models trained solely to predict natural language (referred to as language models, or LMs), by drawing on insights from the study of concepts and categories grounded in cognitive science. I propose a framework inspired by 'inductive reasoning,' a phenomenon that sheds light on how humans utilize background knowledge to make inductive leaps and generalize from new pieces of information about concepts and their properties. Drawing from experiments that study inductive reasoning, I propose to analyze semantic inductive generalization in LMs using phenomena observed in human-induction literature, investigate inductive behavior on tasks such as implicit reasoning and emergent feature recognition, and analyze and relate induction dynamics to the learned conceptual representation space.
I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths
Nam, Hyoungwook, Seo, Seung Byum, Mailthody, Vikram Sharma, Michael, Noor, Li, Lan
Self-attention has emerged as a vital component of state-of-the-art sequence-to-sequence models for natural language processing in recent years, brought to the forefront by pre-trained bi-directional Transformer models. Its effectiveness is partly due to its non-sequential architecture, which promotes scalability and parallelism but limits the model to inputs of a bounded length. In particular, such architectures perform poorly on algorithmic tasks, where the model must learn a procedure which generalizes to input lengths unseen in training, a capability we refer to as inductive generalization. Identifying the computational limits of existing self-attention mechanisms, we propose I-BERT, a bi-directional Transformer that replaces positional encodings with a recurrent layer. The model inductively generalizes on a variety of algorithmic tasks where state-of-the-art Transformer models fail to do so. We also test our method on masked language modeling tasks where training and validation sets are partitioned to verify inductive generalization. Out of three algorithmic and two natural language inductive generalization tasks, I-BERT achieves state-of-the-art results on four tasks.
Blocksworld Revisited: Learning and Reasoning to Generate Event-Sequences from Image Pairs
Gokhale, Tejas, Sampat, Shailaja, Fang, Zhiyuan, Yang, Yezhou, Baral, Chitta
The process of identifying changes or transformations in a scene along with the ability of reasoning about their causes and effects, is a key aspect of intelligence. In this work we go beyond recent advances in computational perception, and introduce a more challenging task, Image-based Event-Sequencing (IES). In IES, the task is to predict a sequence of actions required to rearrange objects from the configuration in an input source image to the one in the target image. IES also requires systems to possess inductive generalizability. Motivated from evidence in cognitive development, we compile the first IES dataset, the Blocksworld Image Reasoning Dataset (BIRD) which contains images of wooden blocks in different configurations, and the sequence of moves to rearrange one configuration to the other. We first explore the use of existing deep learning architectures and show that these end-to-end methods under-perform in inferring temporal event-sequences and fail at inductive generalization. We then propose a modular two-step approach: Visual Perception followed by Event-Sequencing, and demonstrate improved performance by combining learning and reasoning. Finally, by showing an extension of our approach on natural images, we seek to pave the way for future research on event sequencing for real world scenes.
A Further Note on Inductive Generalization
In this paper, we develop the algorithm, given in Plotkin (1970), for findingthe least generalization of two clauses, into a theory of inductive generalization.The types of hypothesis which can be formed are very simple. They allhave the form: (x)Px --> Qx.We have been guided by ideas from the philosophy of science, followingBuchanan (1966). There is no search for infallible methods of generatingtrue hypotheses. Instead we define (in terms of first-order predicate calculus)the notions of data and evidence for the data. Next, some formal criteria areset up for a sentence to be a descriptive hypothesis which is a good explanationof the data, given the evidence. We can then look for the best such hypothesis.Machine Intelligence 6