AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
Motivated by the transformative capabilities of large language models (LLMs) across various natural language tasks, there has been a growing demand to deploy these models effectively across diverse real-world applications and platforms. However, the challenge of efficiently deploying LLMs has become increasingly pronounced due to the varying application-specific performance requirements and the rapid evolution of computational platforms, which feature diverse resource constraints and deployment flows. These varying requirements necessitate LLMs that can adapt their structures (depth and width) for optimal efficiency across different platforms and application specifications. To address this critical gap, we propose AmoebaLLM, a novel framework designed to enable the instant derivation of LLM subnets of arbitrary shapes, which achieve the accuracy-efficiency frontier and can be extracted immediately after a one-time fine-tuning. In this way, AmoebaLLM significantly facilitates rapid deployment tailored to various platforms and applications. Specifically, AmoebaLLM integrates three innovative components: (1) a knowledge-preserving subnet selection strategy that features a dynamic-programming approach for depth shrinking and an importance-driven method for width shrinking; (2) a shape-aware mixture of LoRAs to mitigate gradient conflicts among subnets during fine-tuning; and (3) an in-place distillation scheme with loss-magnitude balancing as the fine-tuning objective. Extensive experiments validate that AmoebaLLM not only sets new standards in LLM adaptability but also successfully delivers subnets that achieve state-of-the-art trade-offs between accuracy and efficiency.
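Since the subnet-selection step is described only at a high level, the following is a minimal sketch of one way a dynamic-programming depth-shrinking step can be formulated (an illustrative formulation, not necessarily AmoebaLLM's exact recipe): the layers plus virtual input/output nodes form a chain, cost[i][j] is an assumed calibration-derived penalty for wiring node i directly to node j, and the DP keeps the layer subset of a given size that minimizes total penalty.

```python
from typing import List, Tuple

def select_layers(cost: List[List[float]], keep: int) -> Tuple[float, List[int]]:
    """cost[i][j]: mismatch penalty of wiring node i directly to node j (i < j).
    Node 0 is a virtual input, node n+1 a virtual output, nodes 1..n the layers.
    Returns (minimum total cost, sorted indices of the `keep` retained layers)."""
    n = len(cost) - 2                              # number of real layers
    INF = float("inf")
    # dp[j][k]: best cost of reaching node j having kept k real layers so far
    dp = [[INF] * (keep + 1) for _ in range(n + 2)]
    parent = [[-1] * (keep + 1) for _ in range(n + 2)]
    dp[0][0] = 0.0
    for j in range(1, n + 2):
        for k in range(keep + 1):
            kk = k if j == n + 1 else k - 1        # the output node is not a kept layer
            if kk < 0:
                continue
            for i in range(j):
                cand = dp[i][kk] + cost[i][j]
                if cand < dp[j][k]:
                    dp[j][k], parent[j][k] = cand, i
    # Backtrack the retained layer indices from the output node.
    kept, j, k = [], n + 1, keep
    while j > 0:
        i = parent[j][k]
        if j != n + 1:
            kept.append(j)
            k -= 1
        j = i
    return dp[n + 1][keep], sorted(kept)

# toy usage: 4 layers, keep 2; the penalty grows with the length of the skipped span
C = [[float(j - i - 1) for j in range(6)] for i in range(6)]
print(select_layers(C, keep=2))   # -> (2.0, [1, 2])
```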
Don't plug these 7 appliances (including AC units) into extension cords - here's why
Extension cords are generally a safe solution for running power to electronics that are too far from the nearest wall outlet. But the operative word here is "electronics," which is not as all-encompassing as some people might think. Appliances (like refrigerators and toaster ovens) are obviously electronic devices, but they're in a different class from most electronics because of the amperage they draw to function. Extension cords are manufactured with a maximum current capacity, which is determined by the size, or gauge, of the wire used in the cord. For instance, a 16-gauge extension cord can handle a maximum of 13 amps, while a 14-gauge cord can handle up to 15 amps (or 1,800 watts), the same as a standard wall outlet in the US.
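As a quick sanity check on those numbers, here is a small illustrative snippet (assuming the US figures quoted above: 120 V mains, roughly 13 A for 16-gauge and 15 A for 14-gauge; always defer to the rating printed on your cord):

```python
# Rough gauge -> max amps figures from the article; P = I * V gives the wattage ceiling.
CORD_MAX_AMPS = {16: 13, 14: 15}

def cord_can_handle(gauge: int, appliance_watts: float, volts: float = 120.0) -> bool:
    max_watts = CORD_MAX_AMPS[gauge] * volts   # e.g. 14-gauge: 15 A * 120 V = 1,800 W
    return appliance_watts <= max_watts

print(cord_can_handle(16, 1800))   # a 1,800 W window AC on a 16-gauge cord -> False
```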
TrajCLIP: Pedestrian Trajectory Prediction Method Using Contrastive Learning and Idempotent Networks
The distribution of pedestrian trajectories is highly complex and influenced by the scene, nearby pedestrians, and subjective intentions. This complexity presents challenges for modeling and generalizing trajectory prediction. Previous methods modeled the feature space of future trajectories based on the high-dimensional feature space of historical trajectories, but this approach is suboptimal because it overlooks the similarity between historical and future trajectories. Our proposed method, TrajCLIP, utilizes contrastive learning and idempotent generative networks to address this issue. By pairing historical and future trajectories and applying contrastive learning on the encoded feature space, we enforce same-space consistency constraints. To manage complex distributions, we use idempotent loss and tightness loss to control over-expansion in the latent space. Additionally, we have developed a trajectory interpolation algorithm and synthetic trajectory data to enhance model capacity and improve generalization. Experimental results on public datasets demonstrate that TrajCLIP achieves state-of-the-art performance and excels in scene-to-scene transfer, few-shot transfer, and online learning tasks.
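As a rough illustration of the contrastive pairing idea (not the authors' code), the sketch below encodes histories and futures into a shared space and trains with a CLIP-style InfoNCE loss so that each history is closest to its own future; the encoder architecture and temperature are placeholder choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajEncoder(nn.Module):
    """Toy encoder mapping an (x, y) trajectory to a unit-norm embedding."""
    def __init__(self, steps: int, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(steps * 2, dim),
                                 nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, traj):                 # traj: (B, steps, 2)
        return F.normalize(self.net(traj), dim=-1)

def contrastive_loss(h_emb, f_emb, temperature: float = 0.1):
    # Matched history/future pairs sit on the diagonal of the similarity matrix.
    logits = h_emb @ f_emb.t() / temperature
    targets = torch.arange(h_emb.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# toy usage: 8 trajectories with 8 observed and 12 future steps
hist, fut = torch.randn(8, 8, 2), torch.randn(8, 12, 2)
h_enc, f_enc = TrajEncoder(steps=8), TrajEncoder(steps=12)
loss = contrastive_loss(h_enc(hist), f_enc(fut))
loss.backward()
```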
Robust Fine-tuning of Zero-shot Models via Variance Reduction
When fine-tuning zero-shot models like CLIP, our desideratum is for the fine-tuned model to excel on both in-distribution (ID) and out-of-distribution (OOD) data. Recently, ensemble-based models (ESMs) have been shown to offer significant robustness improvements while preserving high ID accuracy. However, our study finds that ESMs do not resolve the ID-OOD trade-off: they achieve peak ID and OOD accuracy at different mixing coefficients. When optimized for OOD accuracy, the ensemble exhibits a noticeable decline in ID accuracy, and vice versa. In contrast, we propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without this trade-off.
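To make "sample-wise ensembling" concrete, here is a minimal sketch in which each input receives its own mixing weight when combining the zero-shot and fine-tuned predictions; the entropy-based weighting rule below is an illustrative assumption, not the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def samplewise_ensemble(logits_zeroshot, logits_finetuned):
    """logits_*: (B, num_classes). Returns per-sample mixed class probabilities."""
    p_zs = F.softmax(logits_zeroshot, dim=-1)
    p_ft = F.softmax(logits_finetuned, dim=-1)
    # Illustrative rule: lean on the fine-tuned model where it is confident
    # (low predictive entropy) and on the zero-shot model otherwise.
    ent = -(p_ft * p_ft.clamp_min(1e-12).log()).sum(-1, keepdim=True)
    alpha = torch.exp(-ent)                  # per-sample weight in (0, 1]
    return alpha * p_ft + (1 - alpha) * p_zs

# toy usage: a batch of 4 samples over 10 classes
mixed = samplewise_ensemble(torch.randn(4, 10), torch.randn(4, 10))
```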
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Recent works (e.g., Li and Arora, 2020) suggest that the use of popular normalization schemes (including Batch Normalization) in today's deep learning can move it far from a traditional optimization viewpoint, e.g., the use of exponentially increasing learning rates. The current paper highlights other ways in which the behavior of normalized nets departs from traditional viewpoints, and then initiates a formal framework for studying their mathematics via a suitable adaptation of the conventional framework, namely, modeling the SGD-induced training trajectory via a stochastic differential equation (SDE) with a noise term that captures gradient noise. This yields: (a) a new "intrinsic learning rate" parameter that is the product of the normal learning rate η and the weight-decay factor λ. Analysis of the SDE shows how the effective speed of learning varies and equilibrates over time under the control of the intrinsic LR.
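For concreteness, the intrinsic learning rate and a schematic version of the SDE referred to above can be written as follows (time measured in SGD steps, with Σ the gradient-noise covariance; the exact scaling shown here is a common convention rather than a quote from the paper):

```latex
\[
\lambda_e \;:=\; \eta\,\lambda \quad \text{(intrinsic learning rate)},
\qquad
dX_t \;=\; -\eta\bigl(\nabla L(X_t) + \lambda X_t\bigr)\,dt
          \;+\; \eta\,\Sigma(X_t)^{1/2}\,dW_t .
\]
```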
Author Feedback
We thank the reviewers for their thorough reading. We will fix the typos and clarify the unclear points in the next version of our paper.
- Batch size has been an important component of past analyses. When the nets are without BN, e.g., with LN or GN, the magnitude […]. However, this analysis does not hold in the general case where BN is allowed, and thus we treat batch size as a fixed hyper-parameter.
- The fast equilibrium conjecture only partially explains the benefits of BN. Besides this conjecture, there are many other benefits, e.g., BN affects the […].
- If we make the second phase longer, one should expect the ratio to become closer to 10. Figure 10 gives a clearer […].
- However, this is not observed in any of our settings, so it is not clear to us whether the heavy-tail assumption holds for our setting.
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Diffusion models have demonstrated great success in the field of text-to-image generation. However, alleviating the misalignment between text prompts and images remains challenging. We break the problem down into two causes: concept ignorance and concept mismapping. To tackle these two challenges, we propose CoMat, an end-to-end diffusion-model fine-tuning strategy with an image-to-text concept-matching mechanism. First, we introduce a novel image-to-text concept activation module to guide the diffusion model in revisiting ignored concepts. Additionally, an attribute concentration module is proposed to correctly map the text conditions of each entity to its corresponding image area. Extensive experimental evaluations, conducted across three distinct text-to-image alignment benchmarks, demonstrate the superior efficacy of our proposed method, CoMat-SDXL, over the baseline model, SDXL [49]. We also show that our method enhances general condition-utilization capability and generalizes to long and complex prompts despite not being specifically trained on them. The code is available at https://github.com/CaraJ7/CoMat.
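As a toy illustration of the image-to-text concept-matching objective (explicitly not SDXL or CoMat's actual modules), the sketch below fine-tunes a stand-in generator so that a frozen stand-in captioner assigns high likelihood to the original prompt given the generated "image"; all module names and shapes are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, IMG_DIM = 100, 32

class ToyGenerator(nn.Module):                    # stands in for the text-to-image model
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(VOCAB, IMG_DIM)
    def forward(self, prompt_tokens):             # (B, T) token ids -> (B, IMG_DIM) "image"
        return self.embed(prompt_tokens)

class ToyCaptioner(nn.Module):                    # frozen image-to-text scorer
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(IMG_DIM, VOCAB)
    def log_prob(self, image, prompt_tokens):     # mean log p(token | image)
        logits = self.head(image).unsqueeze(1).expand(-1, prompt_tokens.size(1), -1)
        return -F.cross_entropy(logits.reshape(-1, VOCAB), prompt_tokens.reshape(-1))

gen, cap = ToyGenerator(), ToyCaptioner()
for p in cap.parameters():                        # the captioner stays frozen
    p.requires_grad_(False)

prompt = torch.randint(0, VOCAB, (4, 6))          # a batch of prompt token ids
loss = -cap.log_prob(gen(prompt), prompt)         # concept-matching objective
loss.backward()                                   # gradients reach only the generator
```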