
Collaborating Authors

Lange




Convex Clustering

Chi, Eric C., Molstad, Aaron J., Gao, Zheming

arXiv.org Machine Learning

This survey reviews a clustering method based on solving a convex optimization problem. Despite the plethora of existing clustering methods, convex clustering has several uncommon features that distinguish it from prior art. The optimization problem is free of spurious local minima, and its unique global minimizer is stable with respect to all its inputs, including the data, a tuning parameter, and weight hyperparameters. Its single tuning parameter controls the number of clusters and can be chosen using standard techniques from penalized regression. We give intuition into the behavior and theory for convex clustering as well as practical guidance. We highlight important algorithms and give insight into how their computational costs scale with the problem size. Finally, we highlight the breadth of its uses and flexibility to be combined and integrated with other inferential methods.
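
To make the optimization problem concrete: given data points x_1, ..., x_n, convex clustering fits one centroid u_i per point by minimizing a squared-error fidelity term plus a weighted sum of pairwise centroid-difference norms; fused centroids define clusters. Below is a minimal sketch of this objective in CVXPY with uniform weights; the toy data and the tuning value gamma are illustrative choices, not the survey's.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.2, (5, 2)),     # two well-separated blobs
               rng.normal(3.0, 0.2, (5, 2))])
n, gamma = X.shape[0], 0.5

U = cp.Variable(X.shape)                         # one centroid per point
fidelity = 0.5 * cp.sum_squares(U - X)           # keep centroids near data
fusion = sum(cp.norm(U[i] - U[j], 2)             # uniform weights w_ij = 1
             for i in range(n) for j in range(i + 1, n))
cp.Problem(cp.Minimize(fidelity + gamma * fusion)).solve()

# Points whose fitted centroids (nearly) coincide belong to one cluster.
print(np.round(U.value, 2))
```

As gamma grows, more centroids fuse and the number of distinct rows of U shrinks, which is how the single tuning parameter controls the number of clusters.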


Stein Variational Evolution Strategies

Braun, Cornelius V., Lange, Robert T., Toussaint, Marc

arXiv.org Artificial Intelligence

Stein Variational Gradient Descent (SVGD) is a highly efficient method to sample from an unnormalized probability distribution. However, the SVGD update relies on gradients of the log-density, which may not always be available. Existing gradient-free versions of SVGD make use of simple Monte Carlo approximations or gradients from surrogate distributions, both with limitations. To improve gradient-free Stein variational inference, we combine SVGD steps with evolution strategy (ES) updates. Our results demonstrate that the resulting algorithm generates high-quality samples from unnormalized target densities without requiring gradient information. Compared to prior gradient-free SVGD methods, we find that the integration of the ES update in SVGD significantly improves the performance on multiple challenging benchmark problems.
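
One simple way to realize this idea is to replace the unavailable score, grad log p, in the SVGD update with an antithetic evolution-strategy estimate built from log-density evaluations only. The sketch below shows that substitution under illustrative hyperparameters (smoothing scale sigma, bandwidth h, step size); it is a minimal illustration of the gradient-free idea, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_p(x):
    """Unnormalized log-density of an example target (standard Gaussian)."""
    return -0.5 * np.sum(x**2, axis=-1)

def es_score(x, sigma=0.1, n_dir=32):
    """Antithetic ES estimate of grad log p(x) from log-density values only."""
    eps = rng.normal(size=(n_dir, x.shape[-1]))
    diff = log_p(x[:, None, :] + sigma * eps) - log_p(x[:, None, :] - sigma * eps)
    return (diff[..., None] * eps).mean(axis=1) / (2 * sigma)

def svgd_step(x, step=0.05, h=0.5):
    """One SVGD update with an RBF kernel, using the ES score estimate."""
    diffs = x[:, None, :] - x[None, :, :]          # diffs[i, j] = x_i - x_j
    k = np.exp(-np.sum(diffs**2, axis=-1) / (2 * h))
    g = es_score(x)
    attract = k[..., None] * g[None, :, :]         # k(x_j, x_i) * score(x_j)
    repulse = (diffs / h) * k[..., None]           # grad of k wrt x_j
    return x + step * (attract + repulse).mean(axis=1)

particles = rng.normal(size=(50, 2)) * 3.0
for _ in range(300):
    particles = svgd_step(particles)               # drifts toward the target
```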


Optimal lower bounds for logistic log-likelihoods

Anceschi, Niccolò, Rigon, Tommaso, Zanella, Giacomo, Durante, Daniele

arXiv.org Machine Learning

The logit transform is arguably the most widely employed link function beyond linear settings. This transformation routinely appears in regression models for binary data and provides, either explicitly or implicitly, a core building block within state-of-the-art methodologies for both classification and regression. Its widespread use, combined with the lack of analytical solutions for the optimization of general losses involving the logit transform, still motivates active research in computational statistics. Among the directions explored, a central one has focused on the design of tangent lower bounds for logistic log-likelihoods that can be tractably optimized, while providing a tight approximation of these log-likelihoods. Although progress along these lines has led to the development of effective minorize-maximize (MM) algorithms for point estimation and coordinate ascent variational inference schemes for approximate Bayesian inference under several logit models, the overarching focus in the literature has been on tangent quadratic minorizers. In fact, it is still unclear whether tangent lower bounds sharper than quadratic ones can be derived without undermining the tractability of the resulting minorizer. This article addresses this challenging question through the design and study of a novel piecewise quadratic lower bound that uniformly improves any tangent quadratic minorizer, including the sharpest ones, while admitting a direct interpretation in terms of the classical generalized lasso problem. As illustrated in a ridge logistic regression example, this unique connection facilitates more effective implementations than those provided by available piecewise bounds, while improving the convergence speed of quadratic ones.
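
For reference, a classical tangent quadratic minorizer of this kind is the Jaakkola-Jordan bound: log sigma(x) >= log sigma(xi) + (x - xi)/2 - lambda(xi) * (x^2 - xi^2), with lambda(xi) = tanh(xi/2) / (4 * xi), tight at x = +/- xi. The snippet below numerically verifies these minorizer properties; the article's piecewise quadratic bound uniformly improves bounds of this type.

```python
import numpy as np

def log_sigmoid(x):
    return -np.logaddexp(0.0, -x)                 # log sigma(x), stable

def jj_bound(x, xi):
    """Jaakkola-Jordan quadratic minorizer of log_sigmoid, tight at +/- xi."""
    lam = np.tanh(xi / 2.0) / (4.0 * xi)
    return log_sigmoid(xi) + (x - xi) / 2.0 - lam * (x**2 - xi**2)

x = np.linspace(-6.0, 6.0, 201)
xi = 1.5
assert np.all(jj_bound(x, xi) <= log_sigmoid(x) + 1e-12)   # global minorizer
assert np.isclose(jj_bound(xi, xi), log_sigmoid(xi))       # tangency at xi
assert np.isclose(jj_bound(-xi, xi), log_sigmoid(-xi))     # ... and at -xi
```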


Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Griggs, Tyler, Liu, Xiaoxuan, Yu, Jiaxiang, Kim, Doyoung, Chiang, Wei-Lin, Cheung, Alvin, Stoica, Ion

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly integrated into many online services, yet they remain cost-prohibitive to deploy due to the requirement of expensive GPU instances. Prior work has addressed the high cost of LLM serving by improving the inference engine, but less attention has been given to selecting the most cost-efficient GPU type(s) for a specific LLM service. There is a large and growing landscape of GPU types and, within these options, higher cost does not always lead to increased performance. Instead, through a comprehensive investigation, we find that three key LLM service characteristics (request size, request rate, SLO) strongly influence GPU cost efficiency, and differing GPU types are most cost efficient for differing LLM service settings. As a result, the most cost-efficient allocation for a given service is typically a mix of heterogeneous GPU types. Based on this analysis, we introduce Mélange, a GPU allocation framework that navigates these diverse LLM service characteristics and heterogeneous GPU option space to automatically and efficiently derive the minimal-cost GPU allocation for a given LLM service. We formulate the GPU allocation task as a cost-aware bin packing problem where GPUs are bins and items are slices of the service workload. Our formulation's constraints account for a service's unique characteristics, allowing Mélange to be flexible to support diverse service settings and heterogeneity-aware to adapt the GPU allocation to a specific service. Compared to using only a single GPU type, Mélange reduces deployment costs by up to 77% in conversational settings, 33% in document-based settings, and 51% in a mixed setting.
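
As a toy illustration of the bin-packing framing (GPUs as bins, workload slices as items), the sketch below greedily covers a slice demand with the best cost-per-slice GPU and patches the remainder with the cheapest GPU that fits. All GPU names, prices, and capacities here are invented, and Mélange's actual method solves an optimization formulation with SLO-aware constraints, not this heuristic.

```python
from dataclasses import dataclass

@dataclass
class GpuType:
    name: str
    hourly_cost: float     # $/hour (invented numbers)
    capacity: int          # workload slices servable within the SLO

def cheapest_allocation(gpu_types, total_slices):
    """Greedy: fill with the best cost-per-slice GPU, patch the remainder."""
    best = min(gpu_types, key=lambda g: g.hourly_cost / g.capacity)
    n_full, remainder = divmod(total_slices, best.capacity)
    allocation = {best.name: n_full} if n_full else {}
    if remainder:
        tail = min((g for g in gpu_types if g.capacity >= remainder),
                   key=lambda g: g.hourly_cost)
        allocation[tail.name] = allocation.get(tail.name, 0) + 1
    return allocation

fleet = [GpuType("A100", 4.10, 100),
         GpuType("A10G", 1.01, 30),
         GpuType("T4", 0.53, 12)]
print(cheapest_allocation(fleet, 130))   # mixed fleet: {'A10G': 4, 'T4': 1}
```

Even this crude heuristic already mixes GPU types; the gains the paper reports come from making that mix SLO- and workload-aware.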


A Generic Approach for Identification of Event Related Brain Potentials via a Competitive Neural Network Structure

Neural Information Processing Systems

We present a novel generic approach to the problem of Event-Related Potential identification and classification, based on a competitive Neural Net architecture. The network weights converge to the embedded signal patterns, resulting in the formation of a matched filter bank. The network performance is analyzed via a simulation study, exploring identification robustness under low SNR conditions and comparing it to the expected performance from an information theoretic perspective. The classifier is applied to real event-related potential data recorded during a classic oddball-type paradigm; for the first time, within-session variable signal patterns are automatically identified, dismissing the strong and limiting requirement of a priori stimulus-related selective grouping of the recorded data.
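
The core mechanism the abstract describes, competitive (winner-take-all) learning whose weights converge to the embedded signal patterns, can be sketched in a few lines; the waveforms and noise level below are illustrative stand-ins for ERP data, not the paper's recordings.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
patterns = np.stack([np.sin(2 * np.pi * t),                 # two "embedded"
                     np.exp(-((t - 0.3) / 0.1)**2)])        # signal shapes

W = rng.normal(scale=0.1, size=patterns.shape)   # one competitive unit each
lr = 0.05
for _ in range(2000):
    x = patterns[rng.integers(2)] + rng.normal(scale=0.5, size=t.shape)
    winner = np.argmax(W @ x)                    # unit with largest response
    W[winner] += lr * (x - W[winner])            # move the winner toward input

# Each row of W now approximates one embedded signal: a matched filter bank.
```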


Why AI and machine learning are drifting away from the cloud

#artificialintelligence

A quick-service restaurant chain is running its AI models on machines inside its stores to localize delivery logistics. At the same time, a global pharma company is training its machine learning models on premises, using servers it manages by itself. Cloud computing isn't going anywhere, but some companies that use machine learning models and the tech vendors supplying the platforms to manage them say machine learning is having an on-premises moment. For many years, cloud providers have argued that the computing infrastructure required for machine learning would be far too expensive and cumbersome for companies to stand up on their own, but the field is maturing. "We still have a ton of customers who want to go on a cloud migration, but we're definitely now seeing -- at least in the past year or so -- a lot more customers who want to repatriate workloads back onto on-premise because of cost," said Thomas Robinson, vice president of strategic partnerships and corporate development at MLOps platform company Domino Data Lab.


Bregman Power k-Means for Clustering Exponential Family Data

Vellal, Adithya, Chakraborty, Saptarshi, Xu, Jason

arXiv.org Machine Learning

Recent progress in center-based clustering algorithms combats poor local minima by implicit annealing, using a family of generalized means. These methods are variations of Lloyd's celebrated $k$-means algorithm, and are most appropriate for spherical clusters such as those arising from Gaussian data. In this paper, we bridge these algorithmic advances to classical work on hard clustering under Bregman divergences, which enjoy a bijection to exponential family distributions and are thus well-suited for clustering objects arising from a breadth of data generating mechanisms. The elegant properties of Bregman divergences allow us to maintain closed form updates in a simple and transparent algorithm, and moreover lead to new theoretical arguments for establishing finite sample bounds that relax the bounded support assumption made in the existing state of the art. Additionally, we conduct thorough empirical analyses on simulated experiments and a case study on rainfall data, finding that the proposed method outperforms existing peer methods in a variety of non-Gaussian data settings.
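
The closed-form property the abstract leans on is that, for any Bregman divergence, the arithmetic mean of the assigned points is the exact centroid update, so Lloyd-style iterations stay simple. The sketch below shows plain Bregman hard clustering with the generalized KL divergence (the divergence matched to Poisson counts); it omits the paper's power-mean annealing, and the data and k are illustrative.

```python
import numpy as np

def gkl(x, mu, eps=1e-12):
    """Generalized KL divergence d(x, mu), summed over coordinates."""
    x, mu = x + eps, mu + eps
    return np.sum(x * np.log(x / mu) - x + mu, axis=-1)

def bregman_kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = np.argmin(np.stack([gkl(X, m) for m in mu], axis=1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                mu[j] = X[labels == j].mean(axis=0)   # mean update is exact
    return labels, mu

rng = np.random.default_rng(1)
X = np.vstack([rng.poisson(3.0, (40, 5)),    # two Poisson-rate groups
               rng.poisson(12.0, (40, 5))])
labels, centers = bregman_kmeans(X, k=2)
```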


How the metaverse will let you simulate everything

#artificialintelligence

This article is part of a VB special issue, "The metaverse: How close are we?" Defining the "metaverse" is a difficult task, but one commonly accepted definition is a digital space populated by representations of people, places, and things. Through a combination of technologies including virtual reality (VR), augmented reality (AR), and AI, the metaverse that some futurists envision is an extension of the real world -- albeit without the physical trappings. Companies like Rockstar and Roblox have pitched the metaverse as the ideal platform for gaming, but there's no limit to the potential applications in the enterprise.