How squirrels actually find all their buried nuts

Popular Science

Every fall, squirrels hide hundreds of acorns--and use smell, memory, and even theft to get them back. As someone who routinely "hides" things from myself--car keys, receipts, even my phone while I'm actively talking on it--I felt instantly validated by Sarah Silverman's joke that squirrels forget where they bury 80% of their nuts. "And that's how trees are planted!"


Density estimation with LLMs: a geometric investigation of in-context learning trajectories

Liu, Toni J. B., Boullé, Nicolas, Sarfati, Raphaël, Earls, Christopher J.

arXiv.org Machine Learning

Large language models (LLMs) demonstrate remarkable emergent abilities to perform in-context learning across various tasks, including time series forecasting. This work investigates LLMs' ability to estimate probability density functions (PDFs) from data observed in-context; such density estimation (DE) is a fundamental task underlying many probabilistic modeling problems. We leverage Intensive Principal Component Analysis (InPCA) to visualize and analyze the in-context learning dynamics of LLaMA-2 models. Our main finding is that these LLMs all follow similar learning trajectories in a low-dimensional InPCA space, which are distinct from those of traditional density estimation methods like histograms and Gaussian kernel density estimation (KDE). We interpret the LLaMA in-context DE process as a KDE with an adaptive kernel width and shape. This custom kernel model captures a significant portion of LLaMA's behavior despite having only two parameters. We further speculate on why LLaMA's kernel width and shape differ from those of classical algorithms, providing insights into the mechanism of in-context probabilistic reasoning in LLMs.
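The abstract describes LLaMA's in-context density estimation as behaving like a KDE with an adaptive kernel width and shape. As a minimal sketch (not the paper's implementation), a two-parameter kernel family of that kind can be written as a generalized Gaussian K(u) ∝ exp(-|u/h|^β), with width h and shape β; the names `width` and `shape` below are our own:

```python
import numpy as np
from math import gamma

def generalized_kde(x_eval, samples, width, shape):
    """Two-parameter KDE with a generalized Gaussian kernel.

    K(u) = c * exp(-|u/width|**shape), with c chosen so K integrates to 1.
    shape=2.0 gives a Gaussian-type kernel; shape=1.0 gives a Laplace kernel.
    """
    # Normalizing constant: integral of exp(-|u/h|^b) over R is 2*h*Gamma(1/b)/b.
    c = shape / (2.0 * width * gamma(1.0 / shape))
    u = (x_eval[:, None] - samples[None, :]) / width
    return c * np.exp(-np.abs(u) ** shape).mean(axis=1)

# Example: estimate a standard-normal density from 200 samples.
rng = np.random.default_rng(0)
samples = rng.standard_normal(200)
grid = np.linspace(-3, 3, 61)
pdf = generalized_kde(grid, samples, width=0.5, shape=2.0)
```

Classical KDE fixes the shape and picks the width by a bandwidth rule; the paper's point is that LLaMA's behavior is well captured by letting both parameters adapt.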


Neural Conditional Probability for Inference

Kostic, Vladimir R., Lounici, Karim, Pacreau, Gregoire, Novelli, Pietro, Turri, Giacomo, Pontil, Massimiliano

arXiv.org Machine Learning

We introduce NCP (Neural Conditional Probability), a novel operator-theoretic approach for learning conditional distributions with a particular focus on inference tasks. NCP can be used to build conditional confidence regions and extract important statistics like conditional quantiles, mean, and covariance. It offers streamlined learning through a single unconditional training phase, facilitating efficient inference without the need for retraining even when conditioning changes. By tapping into the powerful approximation capabilities of neural networks, our method efficiently handles a wide variety of complex probability distributions, effectively dealing with nonlinear relationships between input and output variables. Theoretical guarantees ensure both optimization consistency and statistical accuracy of the NCP method. Our experiments show that our approach matches or beats leading methods using a simple Multi-Layer Perceptron (MLP) with two hidden layers and GELU activations. This demonstrates that a minimalistic architecture with a theoretically grounded loss function can achieve competitive results without sacrificing performance, even in the face of more complex architectures.
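The abstract reports that NCP's results are obtained with a simple MLP with two hidden layers and GELU activations. A minimal NumPy sketch of that forward pass (the paper's operator-theoretic loss and training procedure are not shown here; all dimensions below are illustrative assumptions) might look like:

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_forward(x, params):
    """Forward pass of a two-hidden-layer MLP with GELU activations,
    the minimalistic architecture the abstract describes."""
    w1, b1, w2, b2, w3, b3 = params
    h = gelu(x @ w1 + b1)
    h = gelu(h @ w2 + b2)
    return h @ w3 + b3

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 64, 8  # illustrative sizes
params = (rng.normal(0, 0.1, (d_in, d_hidden)), np.zeros(d_hidden),
          rng.normal(0, 0.1, (d_hidden, d_hidden)), np.zeros(d_hidden),
          rng.normal(0, 0.1, (d_hidden, d_out)), np.zeros(d_out))
features = mlp_forward(rng.standard_normal((16, d_in)), params)
```

The point the abstract makes is that this architecture, paired with a theoretically grounded loss, is enough to match more complex models.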


As Employers Embrace AI, Workers Fret--and Seek Input

TIME - Tech

The Swedish buy-now-pay-later company Klarna has become something of a poster child for the potential benefits of generative artificial intelligence. The company relies on AI to create and tailor promotional images and to draft marketing copy, saving millions of dollars. Earlier this year it said an AI chatbot assistant was doing the work of 700 human customer-service agents, which it forecast would boost profits by $40 million this year. Klarna's approach highlights generative AI's promise for powering businesswide systems, like customer service. U.S. businesses are investing in AI, and they're eager to see such gains.


Sarah Silverman's copyright infringement suit against OpenAI will advance in pared-down form

Engadget

Sarah Silverman's lawsuit against OpenAI will advance with some of her legal team's claims dismissed. The comedian sued OpenAI and Meta in July 2023, claiming they trained their AI models on her books and other work without consent. US District Judge Araceli Martínez-Olguín threw out portions of the complaint from Silverman's legal team Monday, including negligence, unjust enrichment, DMCA violations and accusations of vicarious infringement. Bloomberg reported on Tuesday that the unfair competition portion of the lawsuit will proceed. Judge Martínez-Olguín gave the plaintiffs until March 13 to amend the suit.


Robust Multi-Modal Density Estimation

Mészáros, Anna, Schumann, Julian F., Alonso-Mora, Javier, Zgonnikov, Arkady, Kober, Jens

arXiv.org Artificial Intelligence

Development of multi-modal, probabilistic prediction models has led to a need for comprehensive evaluation metrics. While several metrics can characterize the accuracy of machine-learned models (e.g., negative log-likelihood, Jensen-Shannon divergence), these metrics typically operate on probability densities. Applying them to purely sample-based prediction models thus requires that the underlying density function is estimated. However, common methods such as kernel density estimation (KDE) have been demonstrated to lack robustness, while more complex methods have not been evaluated in multi-modal estimation problems. In this paper, we present ROME (RObust Multi-modal density Estimator), a non-parametric approach for density estimation which addresses the challenge of estimating multi-modal, non-normal, and highly correlated distributions. ROME utilizes clustering to segment a multi-modal set of samples into multiple uni-modal ones and then combines simple KDE estimates obtained for individual clusters into a single multi-modal estimate. We compared our approach to state-of-the-art methods for density estimation as well as ablations of ROME, showing that it not only outperforms established methods but is also more robust to a variety of distributions. Our results demonstrate that ROME can overcome the issues of over-fitting and over-smoothing exhibited by other estimators, promising a more robust evaluation of probabilistic machine learning models.
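The cluster-then-KDE idea the abstract describes can be sketched in a few lines. This is a simplified 1-D illustration, not ROME itself: it uses a bare-bones k-means (ROME's clustering step is more sophisticated) and Silverman's rule-of-thumb bandwidth, then mixes the per-cluster KDEs weighted by cluster size:

```python
import numpy as np

def kmeans_1d(x, k, iters=50, seed=0):
    """Minimal 1-D k-means; a stand-in for ROME's clustering step."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(x, k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

def gaussian_kde(grid, x):
    """Gaussian KDE with Silverman's rule-of-thumb bandwidth."""
    h = 1.06 * x.std() * len(x) ** (-0.2)
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))

def cluster_then_kde(grid, samples, k=2):
    # Segment into (approximately) uni-modal clusters, estimate each
    # separately, then recombine weighted by cluster size.
    labels = kmeans_1d(samples, k)
    parts = [gaussian_kde(grid, samples[labels == j]) for j in range(k)]
    weights = [np.mean(labels == j) for j in range(k)]
    return sum(w * p for w, p in zip(weights, parts))

rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(-3, 0.5, 150), rng.normal(3, 0.5, 150)])
grid = np.linspace(-6, 6, 121)
density = cluster_then_kde(grid, samples, k=2)
```

Running a single KDE over the bimodal sample forces one bandwidth to serve both modes; splitting first lets each cluster get a bandwidth suited to its own spread, which is the robustness gain the abstract claims.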


Bandwidth Selection for Gaussian Kernel Ridge Regression via Jacobian Control

Allerbo, Oskar, Jörnsten, Rebecka

arXiv.org Machine Learning

Most machine learning methods require tuning of hyper-parameters. For kernel ridge regression with the Gaussian kernel, the hyper-parameter is the bandwidth. The bandwidth specifies the length scale of the kernel and has to be carefully selected to obtain a model with good generalization. The default methods for bandwidth selection, cross-validation and marginal likelihood maximization, often yield good results, albeit at high computational costs. Inspired by Jacobian regularization, we formulate an approximate expression for how the derivatives of the functions inferred by kernel ridge regression with the Gaussian kernel depend on the kernel bandwidth. We use this expression to propose a closed-form, computationally feather-light, bandwidth selection heuristic, based on controlling the Jacobian. In addition, the Jacobian expression illuminates how the bandwidth selection is a trade-off between the smoothness of the inferred function and the conditioning of the training data kernel matrix. We show on real and synthetic data that compared to cross-validation and marginal likelihood maximization, our method is on par in terms of model performance, but up to six orders of magnitude faster.
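For context, here is a minimal Gaussian-kernel ridge regression where the bandwidth enters. The abstract does not give the paper's Jacobian-control formula, so the closed-form choice below is the standard median heuristic, used purely as a stand-in to show where such a rule would plug in; it is NOT the paper's method:

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    # Squared-exponential kernel matrix between row-sample sets a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth**2))

def krr_fit_predict(x_train, y_train, x_test, bandwidth, ridge=1e-3):
    """Kernel ridge regression with a Gaussian kernel."""
    K = gaussian_kernel(x_train, x_train, bandwidth)
    alpha = np.linalg.solve(K + ridge * np.eye(len(x_train)), y_train)
    return gaussian_kernel(x_test, x_train, bandwidth) @ alpha

def median_heuristic(x):
    """Closed-form bandwidth stand-in: median pairwise distance.
    The paper instead derives the bandwidth by controlling the Jacobian."""
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    return np.median(d[d > 0])

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, (80, 1))
y = np.sin(x[:, 0]) + rng.normal(0, 0.1, 80)
grid = np.linspace(-3, 3, 50)[:, None]
pred = krr_fit_predict(x, y, grid, bandwidth=median_heuristic(x))
```

Any closed-form rule like this costs one pass over the data, versus refitting the model many times for cross-validation, which is where the claimed speedup comes from.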


What I Found in a Database Meta Uses to Train Generative AI

The Atlantic - Technology

Editor's note: This article is part of The Atlantic's series on Books3. You can search the database for yourself here, and read about its origins here. This summer, I reported on a data set of more than 191,000 books that were used without permission to train generative-AI systems by Meta, Bloomberg, and others. "Books3," as it's called, was based on a collection of pirated ebooks that includes travel guides, self-published erotic fiction, novels by Stephen King and Margaret Atwood, and a lot more. Books play a crucial role in the training of generative-AI systems.


Revealed: The Authors Whose Pirated Books Are Powering Generative AI

The Atlantic - Technology

One of the most troubling issues around generative AI is simple: It's being made in secret. To produce humanlike answers to questions, systems such as ChatGPT process huge quantities of written material. But few people outside of companies such as Meta and OpenAI know the full extent of the texts these programs have been trained on. Some training text comes from Wikipedia and other online writing, but high-quality generative AI requires higher-quality input than is usually found on the internet--that is, it requires the kind found in books. But neither the lawsuit itself nor the commentary surrounding it has offered a look under the hood: We have not previously known for certain whether LLaMA was trained on Silverman's, Kadrey's, or Golden's books, or any others, for that matter.


Want agency in the AI age? Get ready to fight

MIT Technology Review

Writers are protesting against studios' use of AI language models to write scripts. Actors are on strike after rejecting a proposal from companies seeking to use AI technology to scan people's faces and bodies, and own the right to use these deepfake-style digital copies without consent or compensation in perpetuity. What connects these cases is a fear that humans will be replaced by computer programs, and a feeling that there's very little we can do about it. Our lax approach to regulating the excesses of the previous tech boom means AI companies have felt safe building and launching products that are exploitative and harmful. But that is about to change.