Accelerating Transformers with Spectrum-Preserving Token Merging
Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVA), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top-k most similar tokens. However, these methods have significant drawbacks, such as sensitivity to the token-splitting strategy and damage to informative tokens in later layers.
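To make the BSM step concrete, the sketch below gives a minimal, hypothetical PyTorch version of the bipartite matching-and-merging idea; the alternating split, the cosine-similarity score, and the averaging rule are illustrative assumptions rather than any specific paper's implementation.

```python
import torch

def bipartite_soft_matching(tokens: torch.Tensor, r: int) -> torch.Tensor:
    """Sketch of Bipartite Soft Matching (BSM) token merging.

    tokens: (N, D) token embeddings from one Transformer layer.
    r:      number of tokens to merge away (output has N - r tokens).
    """
    # Split tokens into two disjoint sets by alternating indices.
    a, b = tokens[0::2], tokens[1::2]

    # Cosine similarity between every token in A and every token in B.
    a_n = a / a.norm(dim=-1, keepdim=True)
    b_n = b / b.norm(dim=-1, keepdim=True)
    scores = a_n @ b_n.t()                      # shape (|A|, |B|)

    # Each A token proposes its most similar partner in B.
    best_val, best_idx = scores.max(dim=-1)
    order = best_val.argsort(descending=True)
    merge_src, keep_src = order[:r], order[r:]  # r most similar pairs merge

    # Average each merged A token into its B partner (simple mean here).
    acc = b.clone()
    cnt = torch.ones(b.size(0), 1, device=tokens.device)
    acc.index_add_(0, best_idx[merge_src], a[merge_src])
    cnt.index_add_(0, best_idx[merge_src], torch.ones(r, 1, device=tokens.device))
    merged_b = acc / cnt

    # Surviving A tokens plus (possibly merged) B tokens form the new sequence.
    return torch.cat([a[keep_src], merged_b], dim=0)
```

For example, calling `bipartite_soft_matching(torch.randn(16, 64), r=4)` shortens a 16-token sequence to 12 tokens; the sensitivity to how tokens are split into the two sets is exactly the drawback mentioned above.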
A Gradient Sampling Method With Complexity Guarantees for Lipschitz Functions in High and Low Dimensions
Prior work introduced a novel modification of Goldstein's classical subgradient method. That work, however, relies on a nonstandard subgradient oracle model and requires the function to be directionally differentiable. Our first contribution in this paper is to show that both of these assumptions can be dropped by simply adding a small random perturbation in each step of the algorithm. The resulting method works on any Lipschitz function whose value and gradient can be evaluated at points of differentiability.
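The perturbation idea can be illustrated in isolation. The sketch below is a deliberately simplified NumPy step, not the full Goldstein-style descent procedure with its complexity guarantees; its only purpose is to show how a small random offset moves the gradient query to a point where a Lipschitz function is differentiable almost surely (by Rademacher's theorem). The step size, perturbation radius, and test function are illustrative assumptions.

```python
import numpy as np

def perturbed_gradient_step(f_grad, x, step, delta, rng):
    """One illustrative step of a perturbed (sub)gradient method.

    The point being illustrated: a small random perturbation lets us query
    the gradient at a location where a Lipschitz function is differentiable
    (true almost everywhere), removing the need for a special oracle.
    """
    # Sample a perturbation uniformly from a ball of radius delta.
    u = rng.standard_normal(x.shape)
    u *= delta * rng.uniform() ** (1 / x.size) / np.linalg.norm(u)

    # Query the gradient at the perturbed point, not at x itself.
    g = f_grad(x + u)

    # Plain descent step along that gradient.
    return x - step * g


# Example on the nonsmooth Lipschitz function f(x) = ||x||_1.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad_l1 = lambda x: np.sign(x)   # defined wherever no coordinate is zero
    x = np.array([1.0, -2.0])
    for _ in range(200):
        x = perturbed_gradient_step(grad_l1, x, step=0.02, delta=1e-6, rng=rng)
    print(x)  # approaches the minimizer at the origin (up to the step size)
```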
On the Parameter Identifiability of Partially Observed Linear Causal Models
Linear causal models are important tools for modeling causal dependencies, yet in practice only a subset of the variables can be observed. In this paper, we examine the parameter identifiability of these models by investigating whether the edge coefficients can be recovered given the causal structure and partially observed data. Our setting is more general than that of prior research: we allow all variables, including both observed and latent ones, to be flexibly related, and we consider the coefficients of all edges, whereas most existing works focus only on the edges between observed variables. Theoretically, we identify three types of indeterminacy for the parameters in partially observed linear causal models. We then provide graphical conditions that are sufficient for all parameters to be identifiable and show that some of them are provably necessary. Methodologically, we propose a novel likelihood-based parameter estimation method that addresses the variance indeterminacy in a specific way and can asymptotically recover the underlying parameters up to trivial indeterminacy. Empirical studies on both synthetic and real-world datasets validate our identifiability theory and the effectiveness of the proposed method in the finite-sample regime.
OctField: Hierarchical Implicit Functions for 3D Modeling (Supplemental Material)
In this supplemental material, we provide more details on the network architecture and more visualization results, including shape reconstruction and comparison, shape generation, and shape interpolation. Furthermore, we present results on scene reconstruction and a comparison with Local Implicit Grid [3] to demonstrate the advantage of our approach on large-scale data, owing to the hierarchical tree structure of the proposed OctField representation. The sections are organized as follows: Section 1 provides the details of the network architecture and training. Sections 2, 3, and 4 provide more visualization results on a number of 3D modeling tasks, including shape reconstruction, generation, and interpolation. Section 5 presents four ablation studies: with or without overlap between adjacent octants, the training strategy, the separation of latent codes, and the subdivision parameter.
A Multi-Implicit Neural Representation for Fonts
In our experiments, we train an auto-decoder-based network, an 8-layer MLP in which each hidden layer contains 384 neurons. We use the LeakyReLU activation function as the non-linearity. The latent embedding z is a 128-D vector. For better convergence, following the spirit of [4], a skip connection is built between the inputs and the third hidden layer, i.e., the inputs are concatenated to the output of the third hidden layer. Rather than following the training routine used for the reconstruction and interpolation tasks, the training strategy for the generation task is to freeze the learned latent embedding weights after 1000 epochs, so that training is more stable across glyphs of the same font family.
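A minimal PyTorch sketch of the decoder as described (8 linear layers, 384 hidden units, LeakyReLU, a 128-D latent code, and the skip connection into the third hidden layer) might look as follows; the 2-D query coordinate and the single-channel output are assumptions, not details given above.

```python
import torch
import torch.nn as nn

class FontAutoDecoder(nn.Module):
    """Sketch of the described auto-decoder MLP; the 2-D query coordinate
    and the output dimensionality are illustrative assumptions."""

    def __init__(self, latent_dim: int = 128, hidden: int = 384,
                 coord_dim: int = 2, out_dim: int = 1):
        super().__init__()
        in_dim = latent_dim + coord_dim
        self.act = nn.LeakyReLU()
        # First three hidden layers, before the skip connection.
        self.pre = nn.ModuleList([
            nn.Linear(in_dim, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, hidden),
        ])
        # Remaining hidden layers; the third layer's output is concatenated
        # with the original inputs before entering this stack.
        self.post = nn.ModuleList([
            nn.Linear(hidden + in_dim, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, hidden),
        ])
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, z: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        x = torch.cat([z, coords], dim=-1)
        h = x
        for layer in self.pre:
            h = self.act(layer(h))
        h = torch.cat([h, x], dim=-1)           # skip connection
        for layer in self.post:
            h = self.act(layer(h))
        return self.out(h)
```

Under this auto-decoder reading, the latent codes would live in an nn.Embedding table optimized jointly with the network, and the freezing strategy mentioned above would amount to setting requires_grad to False on that table after 1000 epochs.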
A Multi-Implicit Neural Representation for Fonts
Fonts are ubiquitous across documents and come in a variety of styles. They are either represented in a native vector format or rasterized to produce fixed-resolution images. In the first case, the non-standard representation prevents benefiting from the latest network architectures for neural representations; in the latter case, the rasterized representation, when encoded via networks, results in a loss of data fidelity, as font-specific discontinuities like edges and corners are difficult to represent using neural networks. Based on the observation that complex fonts can be represented by a superposition of a set of simpler occupancy functions, we introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features (e.g., edges and corners). However, while multi-implicits locally preserve font features, obtaining supervision in the form of ground-truth multi-channel signals is a problem in itself. Instead, we show how to train such a representation with only local supervision, while the proposed neural architecture directly finds globally consistent multi-implicits for font families. We extensively evaluate the proposed representation on various tasks, including reconstruction, interpolation, and synthesis, to demonstrate clear advantages over existing alternatives. Additionally, the representation naturally enables glyph completion, wherein a single characteristic font is used to synthesize a whole font family in the target style.
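As a toy illustration of why a set of simple occupancy functions combined by a symmetric (permutation-invariant) reduction can keep corners sharp, consider the sketch below: each channel alone is a smooth (here, linear) function that a network can represent easily, yet the combination recovers an exact 90-degree corner. The choice of channels and of the min reduction is purely illustrative and not taken from the paper.

```python
import torch

def corner_occupancy(points: torch.Tensor) -> torch.Tensor:
    """Toy example: a sharp corner as a combination of two simple channels.

    points: (..., 2) query coordinates. Each channel is a half-plane
    occupancy; the order-independent reduction over channels (min here)
    yields a glyph-like region with an exactly sharp corner at the origin.
    """
    x, y = points[..., 0], points[..., 1]
    channels = torch.stack([x, y], dim=-1)   # two smooth per-channel fields
    # Permutation-invariant combination: inside iff inside every channel.
    return channels.min(dim=-1).values


pts = torch.tensor([[0.5, 0.5], [0.5, -0.1], [-0.1, 0.5]])
print(corner_occupancy(pts))   # positive (inside) only for the first point
```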
Language Without Borders: A Dataset and Benchmark for Code-Switching Lip Reading Supplementary Material
This supplement to our main paper, "Language Without Borders: A Dataset and Benchmark for Code-Switching Lip Reading," includes detailed descriptions of the dataset collection methods, a comprehensive data card, and datasheets. Additionally, we provide licensing information for the dataset, along with an author statement affirming adherence to the license. Further discussions on the societal impact are included, covering cultural context and privacy considerations. Implementation details of the methods applied to the dataset are also provided. The recording application, illustrated in Figure 3, not only makes participation easier but also ensures the integrity and uniformity of the collected data. Before recording begins, participants are briefed on the entire data collection process and all necessary precautions. This includes detailed instructions for downloading and installing our application, as well as prerequisites for successful data collection, such as securing a quiet recording environment. Participants are also instructed to keep their face fully within the video frame and directly facing the camera, and to avoid having additional faces appear in the frame. During recording, participants are advised to hold the phone with one hand while maintaining an appropriate distance from the camera, so that the video is clear and properly framed. To avoid distractions or interruptions during the recording session, participants are asked to disable notifications from apps such as WeChat that could obstruct the recording interface's prompts.
Language Without Borders: A Dataset and Benchmark for Code-Switching Lip Reading
Lip reading aims to transform videos of continuous lip movement into textual content, and has achieved significant progress over the past decade. It serves as critical and practical assistance for speech-impaired individuals, and is more practical than speech recognition in noisy environments. With the increase in interpersonal communication on social media owing to globalization, the existing monolingual datasets for lip reading may not be sufficient to serve the rapidly growing population of bilingual and even multilingual users. However, to the best of our knowledge, code-switching has only been explored in speech recognition, while it remains largely neglected in lip reading. To bridge this gap, we have collected a bilingual code-switching lip reading benchmark composed of Chinese and English, dubbed CSLR.
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
A prime example is the recently debated "reversal curse", which surfaces when models, having been trained on the fact "A is B", struggle to generalize this knowledge to infer that "B is A". In this paper, we examine the manifestation of the reversal curse across various tasks and delve into both the generalization abilities and the problem-solving mechanisms of LLMs. This investigation leads to a series of significant insights: (1) LLMs are able to generalize to "B is A" when both A and B are presented in context, as in the case of a multiple-choice question.