Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion
Gavin Zhang, University of Illinois at Urbana–Champaign, jialun2@illinois.edu; Hong-Ming Chiu, University of Illinois at Urbana–Champaign, hmchiu2@illinois.edu; Richard Y. Zhang, University of Illinois at Urbana–Champaign, ryz@illinois.edu
Stable Diffusion Benchmarked: Which GPU Runs AI Fastest
Artificial intelligence and deep learning are constantly in the headlines these days, whether it's ChatGPT giving poor advice, self-driving cars, artists being accused of using AI, or medical advice from AI. Most of these tools rely on complex servers with lots of hardware for training, but running inference on a trained network can be done on your PC, using its graphics card. But how fast are consumer GPUs at AI inference? We've benchmarked Stable Diffusion, a popular AI image creator, on the latest Nvidia, AMD, and even Intel GPUs to see how they stack up. If you've by chance tried to get Stable Diffusion up and running on your own PC, you may have some inkling of how complex -- or simple! -- the process can be.
Variational Gram Functions: Convex Analysis and Optimization
Jalali, Amin, Fazel, Maryam, Xiao, Lin
We propose a new class of convex penalty functions, called \emph{variational Gram functions} (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space. These functions can serve as regularizers in convex optimization problems arising from hierarchical classification, multitask learning, and estimating vectors with disjoint supports, among other applications. We study convexity for VGFs, and give efficient characterizations for their convex conjugates, subdifferentials, and proximal operators. We discuss efficient optimization algorithms for regularized loss minimization problems where the loss admits a common, yet simple, variational representation and the regularizer is a VGF. These algorithms enjoy a simple kernel trick, an efficient line search, as well as computational advantages over first order methods based on the subdifferential or proximal maps. We also establish a general representer theorem for such learning problems. Lastly, numerical experiments on a hierarchical classification problem are presented to demonstrate the effectiveness of VGFs and the associated optimization algorithms.
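The abstract's orthogonality-promoting use case can be made concrete with one simple (hypothetical, for illustration only) member of the VGF class: the sum of absolute inner products between distinct columns, which vanishes exactly when the vectors are pairwise orthogonal. A minimal NumPy sketch, with the function name `vgf_pairwise_abs` chosen here for illustration:

```python
import numpy as np

def vgf_pairwise_abs(X):
    """Illustrative VGF instance: sum of |<x_i, x_j>| over all ordered
    pairs of distinct columns of X. Depends on X only through the
    Gram matrix X'X, and is zero iff the columns are orthogonal.
    (The paper studies a general class; this is just one example.)"""
    G = X.T @ X                      # Gram matrix of the columns
    off_diag = np.abs(G) - np.diag(np.abs(np.diag(G)))
    return off_diag.sum()

# Orthogonal columns incur zero penalty:
Q = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
print(vgf_pairwise_abs(Q))  # 0.0
```

Used as a regularizer, such a penalty pushes the learned vectors (e.g. per-task classifiers in multitask learning) toward disjoint or orthogonal directions.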
This Bank-Beating Trading Powerhouse Doesn't Use Human Traders
One of the world's fastest-growing trading shops doesn't have any traders. XTX Markets Ltd. has emerged as a foreign-exchange powerhouse, relying on programmers and mathematicians to fuel its rise into the global top five earlier this year. Now, after becoming a formidable player in currencies, XTX has its sights set on growing in stocks, commodities and bonds markets. But in a world where the difference between profit and loss can be tiny fractions of a second, XTX says it relies more on smarts than speed. Instead of building microwave networks to ferret out prices a microsecond before anyone else, XTX uses mathematical models that are tuned with massive data sets.
Simple one-pass algorithm for penalized linear regression with cross-validation on MapReduce
In this paper, we propose a one-pass MapReduce algorithm for penalized linear regression \[f_\lambda(\alpha, \beta) = \|Y - \alpha\mathbf{1} - X\beta\|_2^2 + p_{\lambda}(\beta),\] where $\alpha$ is the intercept, which can be omitted depending on the application; $\beta$ is the coefficient vector; and $p_{\lambda}$ is the penalty function with regularization parameter $\lambda$. $f_\lambda(\alpha, \beta)$ covers interesting classes such as the Lasso, ridge regression, and the elastic net. Compared to the latest iterative distributed algorithms, which require multiple MapReduce jobs, our algorithm achieves a large performance improvement; moreover, it is exact, unlike approximate algorithms such as parallel stochastic gradient descent. What further distinguishes our algorithm from others is that it trains the model with cross-validation to choose the optimal $\lambda$ rather than relying on a user-specified one. Key words: penalized linear regression, lasso, elastic-net, ridge, MapReduce
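The one-pass idea is easiest to see in the ridge special case $p_\lambda(\beta) = \lambda\|\beta\|_2^2$: a single scan accumulates the sufficient statistics $X'X$ and $X'Y$, after which solutions for every candidate $\lambda$ (and hence cross-validation) cost no further passes over the data. A minimal sketch under that assumption (function names are illustrative, and the map/reduce stages are mimicked with an in-memory loop; the Lasso and elastic net need an extra solver step on the same statistics):

```python
import numpy as np

def one_pass_stats(chunks):
    """'Map' stage sketch: accumulate X'X and X'y over data chunks
    in a single pass. `chunks` yields (X_block, y_block) pairs,
    standing in for mapper inputs."""
    XtX, Xty, n = None, None, 0
    for Xb, yb in chunks:
        if XtX is None:
            d = Xb.shape[1]
            XtX, Xty = np.zeros((d, d)), np.zeros(d)
        XtX += Xb.T @ Xb
        Xty += Xb.T @ yb
        n += Xb.shape[0]
    return XtX, Xty, n

def ridge_path(XtX, Xty, lambdas):
    """'Reduce' stage sketch: solve the ridge normal equations
    (X'X + lam*I) beta = X'y for each candidate lambda, reusing
    the one-pass statistics."""
    d = XtX.shape[0]
    return [np.linalg.solve(XtX + lam * np.eye(d), Xty) for lam in lambdas]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 2.0, 3.0])
chunks = [(X[:50], y[:50]), (X[50:], y[50:])]   # two "mapper" blocks
XtX, Xty, n = one_pass_stats(chunks)
betas = ridge_path(XtX, Xty, [0.0, 1.0])        # whole path, no extra passes
```

Because the statistics are additive across blocks, they combine trivially across mappers, which is what makes a single MapReduce job sufficient.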