The Order Is The Message
In a controlled experiment on modular arithmetic ($p = 9973$), we vary only the ordering of training examples while holding all else constant. Two fixed-ordering strategies reach 99.5% test accuracy by epochs 487 and 659, respectively, from a training set comprising only 0.3% of the input space, well below established sample-complexity lower bounds for this task under IID ordering; the IID baseline reaches 0.30% accuracy after 5,000 epochs on identical data, and an adversarially structured ordering suppresses learning entirely. The generalizing model reliably constructs a Fourier representation whose fundamental frequency is the Fourier dual of the ordering structure, encoding information present in no individual training example; the same fundamental emerges across all seeds tested, regardless of initialization or training-set composition. We discuss implications for training efficiency, the reinterpretation of grokking, and the safety risks of a channel that evades all content-level auditing.
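To make the experimental variable concrete, here is a minimal sketch of the setup under stated assumptions: the task is taken to be modular addition $(a + b) \bmod p$ (the abstract says only "modular arithmetic"), and the fixed ordering is illustrated with a hypothetical coprime-stride permutation, contrasted with a fresh IID shuffle each epoch. None of this is the paper's actual code; it only shows what "varying only example ordering while holding all else constant" means operationally.

```python
# Illustrative sketch, not the paper's code. Task assumed to be
# (a + b) mod p; the "stride" ordering rule below is a hypothetical
# stand-in for the paper's fixed-ordering strategies.
import random
from math import gcd

p = 97  # small prime for illustration; the paper uses p = 9973

# Full input space of (a, b) -> (a + b) mod p, then a ~0.3% training subset.
examples = [(a, b, (a + b) % p) for a in range(p) for b in range(p)]
random.seed(0)
train = random.sample(examples, max(1, int(0.003 * len(examples))))

def iid_order(data, epoch):
    """IID baseline: a fresh shuffle of the same examples every epoch."""
    rng = random.Random(epoch)
    return rng.sample(data, len(data))

def stride_order(data, stride=5):
    """Fixed permutation reused every epoch: visit indices 0, s, 2s, ... mod n."""
    n = len(data)
    assert gcd(stride, n) == 1  # coprime stride hits every example exactly once
    return [data[(i * stride) % n] for i in range(n)]

fixed = stride_order(train)
assert sorted(fixed) == sorted(train)  # same multiset of examples,
assert fixed != train                  # presented in a different order
```

The point of the contrast is that both regimes see identical examples; only the presentation order carries the extra structure.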
c39e1a03859f9ee215bc49131d0caf33-Supplemental.pdf
Additionally, we show the generalization performance of our proposed method across different visual domains. Given a problem category (task), a subset for learning can be sampled (via the domain episode module in Figure 4 of the main text). By replacing class with task, a K-shot, N-task reasoning framework can be defined. We then show analogical learning with the existing meta-learning framework for fast adaptation from the source domain to the target domain.
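The episode construction described above can be sketched as follows. This is an illustrative stand-in, not the paper's actual domain episode module: each episode draws N tasks from the source domain and K examples per task, exactly mirroring N-way, K-shot class sampling with "class" replaced by "task".

```python
# Hypothetical sketch of N-task, K-shot episode sampling; function and
# variable names are illustrative, not the paper's API.
import random

def sample_episode(domain_data, n_tasks, k_shot, rng=random):
    """domain_data maps task name -> list of examples.
    Returns a dict of n_tasks tasks, each with k_shot sampled examples."""
    tasks = rng.sample(sorted(domain_data), n_tasks)
    return {t: rng.sample(domain_data[t], k_shot) for t in tasks}

# Toy source domain: 5 tasks with 10 examples each.
source = {f"task{i}": list(range(10)) for i in range(5)}
episode = sample_episode(source, n_tasks=3, k_shot=2)
assert len(episode) == 3 and all(len(v) == 2 for v in episode.values())
```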
How Sparse Can We Prune A Deep Network: A Fundamental Limit Perspective
Network pruning is a commonly used technique for reducing the storage and computational burden of deep neural networks, yet a characterization of its fundamental limit is still lacking. To close this gap, we take a first-principles approach: we impose the sparsity constraint directly on the loss function and leverage the framework of statistical dimension from convex geometry, enabling us to characterize the sharp phase-transition point, which can be regarded as the fundamental limit of the pruning ratio. Through this limit, we identify two key factors that determine the pruning-ratio limit: weight magnitude and network sharpness.
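For readers unfamiliar with the pruning-ratio notion, here is a simple concrete instance of imposing a sparsity constraint: global magnitude pruning at a target ratio. This is only an illustration of what "pruning ratio" measures; the paper's phase-transition analysis rests on statistical dimension in convex geometry, not on this heuristic.

```python
# Illustrative magnitude pruning at a target sparsity ratio (not the
# paper's method): zero out the `ratio` fraction of smallest-|w| entries.
import numpy as np

def magnitude_prune(weights, ratio):
    """Return a copy of `weights` with the `ratio` fraction of
    smallest-magnitude entries set to zero."""
    flat = np.abs(weights).ravel()
    k = int(ratio * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest |w|
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))
pruned = magnitude_prune(w, 0.9)
sparsity = 1 - np.count_nonzero(pruned) / pruned.size
assert sparsity >= 0.9  # at least 90% of entries are now zero
```

The question the paper asks is, in effect, how large `ratio` can be pushed before recovery of a good solution becomes impossible, and where that sharp transition sits.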