\emph{Lifted} RDT based capacity analysis of the 1-hidden layer treelike \emph{sign} perceptrons neural networks
We consider the memorization capabilities of multilayered \emph{sign} perceptrons neural networks (SPNNs). A recent rigorous upper-bounding capacity characterization, obtained in \cite{Stojnictcmspnncaprdt23} utilizing the Random Duality Theory (RDT), demonstrated that adding neurons in a network configuration may indeed be very beneficial. Moreover, for particular \emph{treelike committee machines} (TCM) architectures with $d\leq 5$ neurons in the hidden layer, \cite{Stojnictcmspnncaprdt23} made a very first mathematically rigorous progress in over 30 years by lowering the previously best known capacity bounds of \cite{MitchDurb89}. Here, we first establish that the RDT bounds from \cite{Stojnictcmspnncaprdt23} scale as $\sim \sqrt{d}$ and can not on their own \emph{universally} (over the entire range of $d$) beat the best known $\sim \log(d)$ scaling of the bounds from \cite{MitchDurb89}. After recognizing that the progress from \cite{Stojnictcmspnncaprdt23} is therefore promising, but yet without a complete concretization, we then proceed by considering the recently developed fully lifted RDT (fl RDT) as an alternative. While the fl RDT is indeed a powerful juggernaut, it typically relies on heavy numerical evaluations. To avoid such heavy numerics, we here focus on a simplified, \emph{partially lifted}, variant and show that it allows for very neat, closed form, analytical capacity characterizations. Moreover, we obtain the concrete capacity bounds that \emph{universally} improve for \emph{any} $d$ over the best known ones of \cite{MitchDurb89}.
Dec-13-2023
- Country:
- Asia > Middle East
- Israel (0.04)
- Europe > Switzerland
- Basel-City > Basel (0.04)
- North America > United States
- New York > Erie County
- Buffalo (0.04)
- Ohio > Montgomery County
- Dayton (0.04)
- Texas
- Dallas County > Dallas (0.04)
- Travis County > Austin (0.04)
- New York > Erie County
- Asia > Middle East
- Genre:
- Research Report (0.52)
- Technology: