schoenberg
SchoenbAt: Rethinking Attention with Polynomial basis
Guo, Yuhan, Ding, Lizhong, Yang, Yuwan, Guo, Xuewei
Kernelized attention extends the attention mechanism by modeling sequence correlations through kernel functions, making significant progress in optimizing attention. Under the guarantee of harmonic analysis theory, kernel functions can be expanded with basis functions, inspiring random feature-based approaches to enhance the efficiency of kernelized attention while maintaining predictive performance. However, current random feature-based works are limited to the Fourier basis expansions under Bochner's theorem. We propose Schoenberg's theorem-based attention (SchoenbAt), which approximates dot-product kernelized attention with the polynomial basis under Schoenberg's theorem via random Maclaurin features and applies a two-stage regularization to constrain the input space and restore the output scale, acting as a drop-in replacement for dot-product kernelized attention. Our theoretical proof of the unbiasedness and concentration error bound of SchoenbAt supports its efficiency and accuracy as a kernelized attention approximation, which is also empirically validated under various random feature dimensions. Evaluations on real-world datasets demonstrate that SchoenbAt significantly enhances computational speed while preserving competitive performance in terms of precision, outperforming several efficient attention methods.
- North America > Canada > Ontario > Toronto (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > China > Beijing > Beijing (0.04)
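The efficiency argument behind feature-based kernelized attention can be illustrated with a small sketch. It is not the SchoenbAt method: it swaps in the exact feature map of the degree-2 dot-product kernel (q . k)^2 instead of random Maclaurin features, omits the two-stage regularization, and uses hypothetical function names (`phi`, `quadratic_attention`, `linear_attention`). It only shows that once a kernel factors through a feature map, the key/value sums can be computed once and reused for every query.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    # Exact feature map of the kernel (x . y)^2: phi(x) . phi(y) == (x . y)^2.
    # Assumption for illustration; SchoenbAt uses random features instead.
    return [a * b for a in x for b in x]

def quadratic_attention(Q, K, V):
    # Reference version: one kernel evaluation per (query, key) pair.
    out = []
    for q in Q:
        weights = [dot(q, k) ** 2 for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out

def linear_attention(Q, K, V):
    # Feature-map version: accumulate key/value statistics once,
    # then each query costs time independent of the number of keys.
    m, d_v = len(phi(K[0])), len(V[0])
    S = [[0.0] * d_v for _ in range(m)]  # S[r][c] = sum_j phi(k_j)[r] * v_j[c]
    z = [0.0] * m                        # z[r]    = sum_j phi(k_j)[r]
    for k, v in zip(K, V):
        fk = phi(k)
        for r in range(m):
            z[r] += fk[r]
            for c in range(d_v):
                S[r][c] += fk[r] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        denom = dot(fq, z)
        out.append([sum(fq[r] * S[r][c] for r in range(m)) / denom
                    for c in range(d_v)])
    return out

Q = [[1.0, 2.0], [0.5, -1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(quadratic_attention(Q, K, V))
print(linear_attention(Q, K, V))
```

Both functions return the same values up to floating-point rounding; the second never forms the full query-by-key weight matrix.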
Flexible Parametric Inference for Space-Time Hawkes Processes
Siviero, Emilia, Staerman, Guillaume, Clémençon, Stephan, Moreau, Thomas
Many modern spatio-temporal data sets, in sociology, epidemiology or seismology, for example, exhibit self-exciting characteristics, with simultaneous triggering and clustering behaviors that a suitable space-time Hawkes process can accurately capture. This paper aims to develop a fast and flexible parametric inference technique to recover the parameters of the kernel functions involved in the intensity function of a space-time Hawkes process based on such data. Our statistical approach combines three key ingredients: 1) kernels with finite support are considered, 2) the space-time domain is appropriately discretized, and 3) (approximate) precomputations are used. The inference technique we propose then consists of an $\ell_2$ gradient-based solver that is fast and statistically accurate. In addition to describing the algorithmic aspects, numerical experiments have been carried out on synthetic and real spatio-temporal data, providing solid empirical evidence of the relevance of the proposed methodology.
- Energy > Oil & Gas > Upstream (0.48)
- Health & Medicine > Epidemiology (0.34)
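The first two ingredients listed in the abstract above (finite-support kernels and a discretized domain) can be sketched in a simplified setting. The example below is an assumption-laden illustration, not the paper's algorithm: it is purely temporal rather than space-time, uses a uniform kernel on (0, support], and the function name `discretized_intensity` and all parameter names are hypothetical.

```python
def discretized_intensity(event_times, horizon, delta, baseline, alpha, support):
    """Hawkes intensity on a regular time grid with step `delta`.

    Assumed kernel (illustration only): uniform density on (0, support],
    scaled so that it integrates to `alpha`.
    """
    n_bins = int(round(horizon / delta))
    # Project the events onto the grid: one count per bin.
    counts = [0] * n_bins
    for t in event_times:
        counts[min(int(t / delta), n_bins - 1)] += 1

    # Finite support means only `n_lags` past bins can excite a given bin.
    n_lags = int(round(support / delta))
    kernel = [alpha / support] * n_lags

    intensity = []
    for b in range(n_bins):
        excitation = 0.0
        for lag in range(1, n_lags + 1):
            if b - lag >= 0:
                excitation += counts[b - lag] * kernel[lag - 1]
        intensity.append(baseline + excitation)
    return intensity

lam = discretized_intensity(
    event_times=[0.05], horizon=1.0, delta=0.1,
    baseline=0.5, alpha=0.8, support=0.2,
)
print(lam)
```

With a single event in the first bin, only the next two bins sit above the baseline, which is the locality that makes precomputation on the grid cheap.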
The Sonic Revolutions of George Lewis
The piece seems to conjure a prehistoric avant-garde musical workshop, a sonic analogue of the visual culture that can be glimpsed in the cave. Fully notated passages--scampering runs, precisely hammering chords, ghostly arpeggios--are interspersed with opportunities for improvisation. The first twenty-four bars indicate rhythms, dynamics, and registers but not precise pitches. The ending, too, is left open. Cory Smythe, himself a composer and improviser of note, proved an ideal conduit, making the distinction between Lewis's ideas and his own elaborations inconsequential.
- North America > United States > Illinois > Cook County > Chicago (0.06)
- North America > United States > New York (0.05)
- North America > United States > California > San Diego County > San Diego (0.05)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Functions that Preserve Manhattan Distances
What functions, when applied to the pairwise Manhattan distances between any n points, result in the Manhattan distances between another set of n points? In this paper, we show that a function has this property if and only if it is Bernstein. This class of functions admits several classical analytic characterizations and includes $f(x) = x^s$ for $0 \le s \le 1$ as well as $f(x) = 1 - e^{-xt}$ for any $t \ge 0$. While it was previously known that Bernstein functions had this property, it was not known that these were the only such functions. Our results are a natural extension of the work of Schoenberg from 1938, who addressed this question for Euclidean distances. Schoenberg's work has been applied in probability theory, harmonic analysis, machine learning, theoretical computer science, and more.
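As a worked check, the two example families named in the abstract can be tested against one standard analytic characterization of Bernstein functions (assumed here; the paper may state an equivalent form): $f \ge 0$ on $(0, \infty)$ and $(-1)^{n-1} f^{(n)}(x) \ge 0$ for every $n \ge 1$.

```latex
% Power family, 0 \le s \le 1, x > 0:
f(x) = x^{s}, \qquad
f^{(n)}(x) = s(s-1)\cdots(s-n+1)\, x^{s-n}.
% The factor s is nonnegative and the remaining n-1 factors are nonpositive, so
(-1)^{n-1} f^{(n)}(x) \ge 0 \quad \text{for all } n \ge 1.

% Exponential family, t \ge 0, x > 0:
g(x) = 1 - e^{-xt}, \qquad
g^{(n)}(x) = (-1)^{n-1}\, t^{n} e^{-xt},
% hence
(-1)^{n-1} g^{(n)}(x) = t^{n} e^{-xt} \ge 0 \quad \text{for all } n \ge 1.
```

Both families are nonnegative on $(0, \infty)$ and have derivatives of alternating sign, so they satisfy this characterization.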
Automated Diagram Generation to Build Understanding and Usability
Causal loop and stock and flow diagrams are broadly used in System Dynamics because they help organize relationships and convey meaning. Using the analytical work of Schoenberg (2019) to select what to include in a compressed model, this paper demonstrates how that information can be clearly presented in an automatically generated causal loop diagram. The diagrams are generated using tools developed by people working in graph theory and the generated diagrams are clear and aesthetically pleasing. This approach can also be built upon to generate stock and flow diagrams. Automated stock and flow diagram generation opens the door to representing models developed using only equations, regardless of origin, in a clear and easy to understand way. Because models can be large, the application of grouping techniques, again developed for graph theory, can help structure the resulting diagrams in the most usable form. This paper describes the algorithms developed for automated diagram generation and shows a number of examples of their uses in large models. The application of these techniques to existing, but inaccessible, equation-based models can help broaden the knowledge base for System Dynamics modeling. The techniques can also be used to improve layout in all, or part, of existing models with diagrammatic information.
- Research Report (0.64)
- Workflow (0.47)
Semiparametric Bayesian Forecasting of Spatial Earthquake Occurrences
Kolev, Aleksandar A., Ross, Gordon J.
Self-exciting Hawkes processes are used to model events which cluster in time and space, and have been widely studied in seismology under the name of the Epidemic Type Aftershock Sequence (ETAS) model. In the ETAS framework, the occurrence of the mainshock earthquakes in a geographical region is assumed to follow an inhomogeneous spatial point process, and aftershock events are then modelled via a separate triggering kernel. Most previous studies of the ETAS model have relied on point estimates of the model parameters due to the complexity of the likelihood function, and the difficulty in estimating an appropriate mainshock distribution. In order to take estimation uncertainty into account, we instead propose a fully Bayesian formulation of the ETAS model which uses a nonparametric Dirichlet process mixture prior to capture the spatial mainshock process. Direct inference for the resulting model is problematic due to the strong correlation of the parameters for the mainshock and triggering processes, so we instead use an auxiliary latent variable routine to perform efficient inference.
- North America > United States > California (0.14)
- Europe > Italy (0.05)
- Europe > Romania (0.04)
- Law Enforcement & Public Safety (1.00)
- Government > Regional Government (1.00)
A philosopher argues that an AI can never be an artist
On March 31, 1913, in the Great Hall of the Musikverein concert house in Vienna, a riot broke out in the middle of a performance of an orchestral song by Alban Berg. Police arrested the concert's organizer for punching Oscar Straus, a little-remembered composer of operettas. Later, at the trial, Straus quipped about the audience's frustration. The punch, he insisted, was the most harmonious sound of the entire evening. History has rendered a different verdict: the concert's conductor, Arnold Schoenberg, has gone down as perhaps the most creative and influential composer of the 20th century. You may not enjoy Schoenberg's dissonant music, which rejects conventional tonality to arrange the 12 notes of the scale according to rules that don't let any predominate. But he changed what humans understand music to be. This is what makes him a genuinely creative and innovative artist.
- Europe > Austria > Vienna (0.25)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Illinois (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Random Feature Maps for Dot Product Kernels
Kar, Purushottam, Karnick, Harish
Approximating non-linear kernels using feature maps has gained a lot of interest in recent years due to applications in reducing training and testing times of SVM classifiers and other kernel based learning algorithms. We extend this line of work and present low distortion embeddings for dot product kernels into linear Euclidean spaces. We base our results on a classical result in harmonic analysis characterizing all dot product kernels and use it to define randomized feature maps into explicit low dimensional Euclidean spaces in which the native dot product provides an approximation to the dot product kernel with high confidence.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- Europe > Spain > Canary Islands (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
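The randomized feature maps described in the abstract above can be sketched as follows. This is a minimal illustration of the random Maclaurin construction under stated assumptions, not the authors' reference code: the kernel is taken to be exp(<x, y>) with Maclaurin coefficients 1/n!, the degree is drawn with probability 2^-(n+1), the projections are Rademacher (+1/-1) vectors, and the helper names (`sample_degree`, `make_maclaurin_map`, `exp_coeff`) are hypothetical.

```python
import math
import random

def sample_degree(rng):
    """Draw a degree N >= 0 with P[N = n] = 2 ** -(n + 1)."""
    n = 0
    while rng.random() < 0.5:
        n += 1
    return n

def make_maclaurin_map(dim, num_features, coeff, seed=0):
    """Build a random feature map z with E[z(x) . z(y)] = sum_n coeff(n) <x, y>^n.

    `coeff(n)` must be nonnegative, which holds for positive definite
    dot-product kernels.
    """
    rng = random.Random(seed)
    components = []
    for _ in range(num_features):
        n = sample_degree(rng)
        # n independent Rademacher vectors, one per factor of the monomial.
        signs = [[rng.choice((-1.0, 1.0)) for _ in range(dim)] for _ in range(n)]
        # Importance weight: divide coeff(n) by the sampling probability 2^-(n+1).
        scale = math.sqrt(coeff(n) * 2.0 ** (n + 1))
        components.append((scale, signs))
    norm = math.sqrt(num_features)

    def feature_map(x):
        out = []
        for scale, signs in components:
            value = scale
            for w in signs:
                value *= sum(wi * xi for wi, xi in zip(w, x))
            out.append(value / norm)
        return out

    return feature_map

def exp_coeff(n):
    # Maclaurin coefficients of exp(t): 1 / n!
    return 1.0 / math.factorial(n)

x = [0.3, 0.1, -0.2]
y = [0.2, -0.1, 0.4]
z = make_maclaurin_map(dim=3, num_features=50000, coeff=exp_coeff)
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = math.exp(sum(a * b for a, b in zip(x, y)))
print(approx, exact)
```

The plain dot product of the two feature vectors is an unbiased estimate of the kernel value; its error shrinks as the number of features grows, and small-norm inputs keep the variance low.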