AITopics | belkin

belkin

Feature maps for the Laplacian kernel and its generalizations

Ahir, Sudhendu, Pandit, Parthe

arXiv.org Machine Learning, Feb-21-2025

Recent applications of kernel methods in machine learning have seen a renewed interest in the Laplacian kernel, due to its stability to the bandwidth hyperparameter in comparison to the Gaussian kernel, as well as its expressivity being equivalent to that of the neural tangent kernel of deep fully connected networks. However, unlike the Gaussian kernel, the Laplacian kernel is not separable. This poses challenges for techniques to approximate it, especially via the random Fourier features (RFF) methodology and its variants. In this work, we provide random features for the Laplacian kernel and its two generalizations: the Matérn kernel and the exponential power kernel. We provide efficiently implementable schemes to sample weight matrices so that random features approximate these kernels. These weight matrices have a weakly coupled heavy-tailed randomness. Via numerical experiments on real datasets we demonstrate the efficacy of these random feature maps.
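
To make the idea of random features for a non-separable Laplacian kernel concrete, here is a minimal sketch. It is not the authors' sampling scheme; it assumes the kernel has the form exp(-||x - y||_2 / bandwidth) and relies on the standard fact that this function is the characteristic function of the multivariate Cauchy distribution. The function names (`sample_cauchy_weights`, `laplacian_rff`) are hypothetical. Each weight row is a Gaussian vector divided by one shared scalar, which is one simple way to obtain heavy-tailed rows whose coordinates are coupled.

```python
import numpy as np

def laplacian_kernel(X, Y, bandwidth=1.0):
    # Exact kernel exp(-||x - y||_2 / bandwidth); assumed form, Euclidean norm.
    diff = X[:, None, :] - Y[None, :, :]
    return np.exp(-np.linalg.norm(diff, axis=2) / bandwidth)

def sample_cauchy_weights(num_features, dim, rng):
    # Multivariate Cauchy rows: a Gaussian vector divided by |z| for one
    # shared standard normal z per row (multivariate t with 1 degree of freedom).
    g = rng.standard_normal((num_features, dim))
    z = rng.standard_normal((num_features, 1))
    return g / np.abs(z)

def laplacian_rff(X, W, b, bandwidth=1.0):
    # Standard random Fourier feature map: sqrt(2/D) * cos(W x / bandwidth + b).
    num_features = W.shape[0]
    return np.sqrt(2.0 / num_features) * np.cos(X @ W.T / bandwidth + b)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
W = sample_cauchy_weights(20000, 3, rng)
b = rng.uniform(0.0, 2.0 * np.pi, size=20000)

Phi = laplacian_rff(X, W, b)
approx = Phi @ Phi.T              # feature-space inner products
exact = laplacian_kernel(X, X)    # exact kernel matrix
max_err = np.abs(approx - exact).max()
```

The inner products of the features estimate the kernel matrix, with an error that shrinks roughly like one over the square root of the number of features.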

kernel, laplacian kernel, random feature, (15 more...)

arXiv.org Machine Learning

2502.15575

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Canary Islands (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.34)

Fast training of large kernel models with delayed projections

Abedsoltan, Amirhesam, Ma, Siyuan, Pandit, Parthe, Belkin, Mikhail

arXiv.org Machine Learning, Nov-25-2024

Classical kernel machines have historically faced significant challenges in scaling to large datasets and model sizes--a key ingredient that has driven the success of neural networks. In this paper, we present a new methodology for building kernel machines that can scale efficiently with both data size and model size. Our algorithm introduces delayed projections to Preconditioned Stochastic Gradient Descent (PSGD), allowing the training of much larger models than was previously feasible, pushing the practical limits of kernel-based learning. They have also served as the foundation for understanding many significant phenomena in ... Despite these advantages, the scalability of kernel methods has remained a persistent challenge, particularly when applied to large datasets. ... this limitation is critical for expanding the utility of kernel-based techniques in modern machine learning applications. ... 2024) leverage the Nyström Approximation (NA) in combination with other strategies to enhance performance. ASkotch combines it with block coordinate descent, whereas Falkon combines it with the Conjugate Gradient ... However, these strategies are limited by model size due to memory ...

eigenpro 3, eigenpro 4, model size, (17 more...)

arXiv.org Machine Learning

2411.16658

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

On the Nystrom Approximation for Preconditioning in Kernel Machines

Abedsoltan, Amirhesam, Belkin, Mikhail, Pandit, Parthe, Rademacher, Luis

arXiv.org Machine Learning, Dec-6-2023

Kernel methods are a popular class of nonlinear predictive models in machine learning. Scalable algorithms for learning kernel models need to be iterative in nature, but convergence can be slow due to poor conditioning. Spectral preconditioning is an important tool to speed up the convergence of such iterative algorithms for training kernel models. However, computing and storing a spectral preconditioner can be expensive, which can lead to large computational and storage overheads, precluding the application of kernel methods to problems with large datasets. A Nystrom approximation of the spectral preconditioner is often cheaper to compute and store, and has demonstrated success in practical applications. In this paper we analyze the trade-offs of using such an approximated preconditioner. Specifically, we show that a sample of logarithmic size (as a function of the size of the dataset) enables the Nystrom-based approximated preconditioner to accelerate gradient descent nearly as well as the exact preconditioner, while also reducing the computational and storage overheads.
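
The core ingredient of a Nystrom-approximated spectral preconditioner is an estimate of the top eigenpairs of the kernel matrix computed from a small subsample. The sketch below illustrates only that ingredient under stated assumptions: a Gaussian kernel, uniform subsampling, and hypothetical function names (`gaussian_kernel`, `nystrom_top_eigs`). It is not the paper's algorithm and does not build the full preconditioner.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # Pairwise Gaussian kernel exp(-||a - b||^2 / (2 * bandwidth^2)).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def nystrom_top_eigs(X, s, q, rng):
    # Estimate the top-q eigenvalues of K/n from an s-point subsample, and
    # extend the subsample eigenvectors to all n points (Nystrom extension).
    n = X.shape[0]
    idx = rng.choice(n, size=s, replace=False)
    K_ss = gaussian_kernel(X[idx], X[idx])
    evals, evecs = np.linalg.eigh(K_ss / s)          # ascending order
    evals, evecs = evals[::-1][:q], evecs[:, ::-1][:, :q]
    U = gaussian_kernel(X, X[idx]) @ evecs            # shape (n, q)
    U /= np.linalg.norm(U, axis=0, keepdims=True)     # unit-norm columns
    return evals, U

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(size=(n, 2))

approx_evals, U = nystrom_top_eigs(X, s=100, q=3, rng=rng)

# Exact quantities, computed here only to check the approximation.
K = gaussian_kernel(X, X)
exact_evals = np.linalg.eigvalsh(K / n)[::-1][:3]
rayleigh = U[:, 0] @ (K / n) @ U[:, 0]   # quality of the extended top eigenvector
```

Only an s-by-s eigenproblem is solved, which is where the computational and storage savings over the exact spectral preconditioner come from.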

artificial intelligence, machine learning, preconditioner, (17 more...)

arXiv.org Machine Learning

2312.03311

Country: Asia > Japan (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

Toward Large Kernel Models

Abedsoltan, Amirhesam, Belkin, Mikhail, Pandit, Parthe

arXiv.org Artificial Intelligence, Jun-19-2023

Recent studies indicate that kernel machines can often perform similarly to or better than deep neural networks (DNNs) on small datasets. The interest in kernel machines has been additionally bolstered by the discovery of their equivalence to wide neural networks in certain regimes. However, a key feature of DNNs is their ability to scale the model size and training data size independently, whereas in traditional kernel machines model size is tied to data size. Because of this coupling, scaling kernel machines to large data has been computationally challenging. In this paper, we provide a way forward for constructing large-scale general kernel models, which are a generalization of kernel machines that decouples the model and data, allowing training on large datasets. Specifically, we introduce EigenPro 3.0, an algorithm based on projected dual preconditioned SGD, and show scaling to model and data sizes which have not been possible with existing kernel methods.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.02605

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Cut your Losses with Squentropy

Hui, Like, Belkin, Mikhail, Wright, Stephen

arXiv.org Artificial Intelligence, Feb-8-2023

Nearly all practical neural models for classification are trained using cross-entropy loss. Yet this ubiquitous choice is supported by little theoretical or empirical evidence. Recent work (Hui & Belkin, 2020) suggests that training using the (rescaled) square loss is often superior in terms of the classification accuracy. In this paper we propose the "squentropy" loss, which is the sum of two terms: the cross-entropy loss and the average square loss over the incorrect classes. We provide an extensive set of experiments on multi-class classification problems showing that the squentropy loss outperforms both the pure cross-entropy and rescaled square losses in terms of the classification accuracy. We also demonstrate that it provides significantly better model calibration than either of these alternative losses and, furthermore, has less variance with respect to the random initialization. Additionally, in contrast to the square loss, squentropy loss can typically be trained using exactly the same optimization parameters, including the learning rate, as the standard cross-entropy loss, making it a true "plug-and-play" replacement. Finally, unlike the rescaled square loss, multiclass squentropy contains no parameters that need to be adjusted.
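
The two-term definition above can be written out in a few lines. This is a minimal NumPy sketch, not the authors' code. It assumes the square term is applied to the raw logits of the incorrect classes with a target of zero and averaged over those C - 1 classes, which is one reading of "the average square loss over the incorrect classes". The function name `squentropy` and its signature are hypothetical.

```python
import numpy as np

def squentropy(logits, labels):
    # logits: (n, C) array of raw scores; labels: (n,) integer class indices.
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    n, c = logits.shape

    # Cross-entropy term, using a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(n), labels]

    # Square term: mean of squared logits over the C - 1 incorrect classes
    # (assumption: target value 0 for every incorrect class).
    mask = np.ones_like(logits, dtype=bool)
    mask[np.arange(n), labels] = False
    sq = (logits**2 * mask).sum(axis=1) / (c - 1)

    # Average the per-example loss over the batch.
    return float((ce + sq).mean())

loss_uniform = squentropy([[0.0, 0.0, 0.0, 0.0]], [0])
loss_mixed = squentropy([[0.0, 1.0, -1.0]], [0])
```

With all-zero logits the square term vanishes and the loss reduces to the cross-entropy of a uniform prediction. Note that no extra hyperparameter appears, which matches the abstract's remark that multiclass squentropy has no parameters to adjust.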

artificial intelligence, machine learning, squentropy, (18 more...)

arXiv.org Artificial Intelligence

2302.03952

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Massachusetts (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

A Deeper Understanding of Deep Learning

Communications of the ACM, May-21-2022, 06:10:18 GMT

Deep learning should not work as well as it seems to: according to traditional statistics and machine learning, any analysis that has too many adjustable parameters will overfit noisy training data, and then fail when faced with novel test data. In clear violation of this principle, modern neural networks often use vastly more parameters than data points, but they nonetheless generalize to new data quite well. The shaky theoretical basis for generalization has been noted for many years. One proposal was that neural networks implicitly perform some sort of regularization--a statistical tool that penalizes the use of extra parameters. Yet efforts to formally characterize such an "implicit bias" toward smoother solutions have failed, said Roi Livni, an advanced lecturer in the department of electrical engineering of Israel's Tel Aviv University.

generalization, learning, neural network, (15 more...)

Communications of the ACM

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.25)
North America > United States > Massachusetts > Suffolk County > Boston (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

A New Link to an Old Model Could Crack the Mystery of Deep Learning

#artificialintelligence, Oct-11-2021, 16:25:52 GMT

In the machine learning world, the sizes of artificial neural networks -- and their outsize successes -- are creating conceptual conundrums. When a network named AlexNet won an annual image recognition competition in 2012, it had about 60 million parameters. These parameters, fine-tuned during training, allowed AlexNet to recognize images that it had never seen before. Two years later, a network named VGG wowed the competition with more than 130 million such parameters. Some artificial neural networks, or ANNs, now have billions of parameters. These massive networks -- astoundingly successful at tasks such as classifying images, recognizing speech and translating text from one language to another -- have begun to dominate machine learning and artificial intelligence.

artificial intelligence, machine learning, neural network, (18 more...)

#artificialintelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Belkin's $100 Soundform Connect dongle adds AirPlay 2 to any speaker

Engadget, May-20-2021, 18:26:58 GMT

Some smart home aficionados still eulogize Google's Chromecast Audio, but Belkin's new Soundform Connect aims to fulfill a similar role -- for iOS users, anyway. The $100 dongle can connect to any traditional home speaker and turn it into an AirPlay 2-compatible smart speaker you can cast audio to from iPhones and iPads running iOS 11.4 and iPadOS 11.4 or newer, plus Macs running Catalina and Apple TVs with tvOS 11.4. And when we say "any" home speaker, we really mean it. The Soundform has at least one nice touch the Chromecast doesn't -- beyond still existing, that is. In addition to the classic 3.5mm jack, there's also a port for standard optical connections -- the Chromecast Audio required audiophiles to own or purchase a TOSLINK-to-3.5mm ... According to Belkin, users will be able to ask Siri to play their music or podcasts on the speaker in question, as well as ask the virtual assistant what's playing in each room and remotely control the speaker's volume.

airplay 2, belkin, chromecast audio, (1 more...)

Engadget

Industry:

Appliances & Durable Goods (0.61)
Information Technology > Smart Houses & Appliances (0.41)

Technology:

Information Technology > Communications > Mobile (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.99)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.61)

Belkin SoundForm Elite Hi-Fi smart speaker review: The case of the missing midrange

PCWorld, May-29-2020, 01:50:48 GMT

My thoughts about the Belkin SoundForm Elite Hi-Fi Smart Speaker Wireless Charging can be distilled in a single word: boring. Listening to a $300 speaker should be exciting. Belkin doesn't have a track record of building great audio equipment, but its partner on this project--the French audiophile company Devialet--most certainly does. The Devialet Phantom blew my mind when I reviewed it five years ago. So, I had high hopes when I learned Belkin had enlisted that company's expertise to develop something more mainstream.

artificial intelligence, natural language, soundform elite, (14 more...)

PCWorld

Industry: Appliances & Durable Goods (0.75)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.75)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.75)

Tech can help trim your utility bills, which may be on the rise amid coronavirus shutdown

USATODAY - Tech Top Stories, Apr-25-2020, 22:18:26 GMT

Now that most of the country is hunkered down at home to quell the spread of COVID-19, chances are you're spending more on electricity and other utilities. After all, more lights are on and for a longer period of time. You may be turning up the heat to stay warm (especially for northern states). Appliances – ovens, stoves, dishwashers, and washers and dryers – are getting more use than ever before. And then there's laptops and desktops constantly on for doing work or attending virtual classes at school, or perhaps binging TV shows or playing video games.

coronavirus shutdown, electricity, utility bill, (14 more...)

USATODAY - Tech Top Stories

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.71)
Health & Medicine > Therapeutic Area > Immunology (0.71)

Technology:

Information Technology > Communications > Social Media (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.73)
Information Technology > Communications > Mobile (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
