nested iteration
Multilevel Initialization for Layer-Parallel Deep Neural Network Training
Cyr, Eric C., Günther, Stefanie, Schroder, Jacob B.
This paper investigates multilevel initialization strategies for training very deep neural networks with a layer-parallel multigrid solver. The scheme is based on the continuous interpretation of the training problem as a problem of optimal control, in which neural networks are represented as discretizations of time-dependent ordinary differential equations. A key goal is to develop a method able to intelligently initialize the network parameters for the very deep networks enabled by scalable layer-parallel training. To do this, we apply a refinement strategy across the time domain, that is equivalent to refining in the layer dimension. The resulting refinements create deep networks, with good initializations for the network parameters coming from the coarser trained networks. We investigate the effectiveness of such multilevel "nested iteration" strategies for network training, showing supporting numerical evidence of reduced run time for equivalent accuracy. In addition, we study whether the initialization strategies provide a regularizing effect on the overall training process and reduce sensitivity to hyperparameters and randomness in initial network parameters.
A Decidable Class of Nested Iterated Schemata (extended version)
Aravantinos, Vincent, Caferra, Ricardo, Peltier, Nicolas
Many problems can be specified by patterns of propositional formulae depending on a parameter, e.g. the specification of a circuit usually depends on the number of bits of its input. We define a logic whose formulae, called "iterated schemata", allow to express such patterns. Schemata extend propositional logic with indexed propositions, e.g. P_i, P_i+1, P_1, and with generalized connectives, e.g. /\i=1..n or i=1..n (called "iterations") where n is an (unbound) integer variable called a "parameter". The expressive power of iterated schemata is strictly greater than propositional logic: it is even out of the scope of first-order logic. We define a proof procedure, called DPLL*, that can prove that a schema is satisfiable for at least one value of its parameter, in the spirit of the DPLL procedure. However the converse problem, i.e. proving that a schema is unsatisfiable for every value of the parameter, is undecidable so DPLL* does not terminate in general. Still, we prove that it terminates for schemata of a syntactic subclass called "regularly nested". This is the first non trivial class for which DPLL* is proved to terminate. Furthermore the class of regularly nested schemata is the first decidable class to allow nesting of iterations, i.e. to allow schemata of the form /\i=1..n (/\j=1..n ...).