A Limitations and societal impact

A.1 Limitations

BiVLC is a benchmark for Bidirectional Vision-Language Compositionality evaluation. Each instance consists of two images and two captions. Using each image and each caption in turn as the query, a model is asked to select the counterpart that correctly matches it over a hard negative distractor with minor compositional changes. Thus, we can measure both image-to-text and text-to-image retrieval with hard negative pairs. To obtain good results on the dataset, the model must perform well in both directions for the same instance. Each instance of the dataset consists of six fields:

image: COCO 2017 validation image.
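To make the bidirectional protocol concrete, the sketch below shows how a single instance could be scored with any image-text similarity function (e.g., CLIP cosine similarity). The function name `similarity` and the group-score convention are illustrative assumptions, not the benchmark's official evaluation harness.

```python
# Toy scoring of one instance: two images and two captions, aligned so
# that images[i] matches captions[i]; the other pairing is the hard negative.
# `similarity(image, caption) -> float` is a placeholder scorer supplied
# by the reader (e.g., CLIP cosine similarity).

def score_instance(similarity, images, captions):
    """Return (image-to-text correct, text-to-image correct, both)."""
    s = [[similarity(img, cap) for cap in captions] for img in images]

    # Image-to-text: each image must prefer its own caption over the
    # hard-negative caption.
    i2t = all(s[i][i] > s[i][1 - i] for i in range(2))

    # Text-to-image: each caption must prefer its own image over the
    # hard-negative image.
    t2i = all(s[i][i] > s[1 - i][i] for i in range(2))

    # An instance counts as solved only if both directions succeed.
    return i2t, t2i, i2t and t2i
```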
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu, Yuanxin Liu, Xuancheng Ren, Xiaodong He, Xu Sun
In vision-and-language grounding problems, fine-grained representations of the image are considered to be of paramount importance. Most current systems incorporate visual features and textual concepts as a sketch of an image. However, plainly inferred representations are usually undesirable in that they are composed of separate components whose relations are elusive. In this work, we aim to represent an image with a set of integrated visual regions and corresponding textual concepts, reflecting certain semantics. To this end, we build the Mutual Iterative Attention (MIA) module, which integrates correlated visual features and textual concepts, respectively, by aligning the two modalities. We evaluate the proposed approach on two representative vision-and-language grounding tasks, i.e., image captioning and visual question answering. In both tasks, the semantic-grounded image representations consistently boost the performance of the baseline models across all metrics. The results demonstrate that our approach is effective and generalizes well to a wide range of models for image-related applications.
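Although the paper's MIA implementation is not reproduced here, the rough idea of aligning visual regions and textual concepts by alternating cross-attention can be sketched as follows. The dimensions, number of refinement steps, and residual updates are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MutualAttentionSketch(nn.Module):
    """Toy mutual/iterative cross-attention between visual regions and
    textual concepts.  A hedged illustration of the alignment idea only,
    not the paper's MIA module."""

    def __init__(self, dim=512, heads=8, steps=2):
        super().__init__()
        self.steps = steps
        self.v_from_t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t_from_v = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, regions, concepts):
        # regions:  (B, Nr, dim) visual region features
        # concepts: (B, Nc, dim) textual concept embeddings
        for _ in range(self.steps):
            # Refine regions by attending to concepts, then refine
            # concepts by attending to the updated regions.
            regions = regions + self.v_from_t(regions, concepts, concepts)[0]
            concepts = concepts + self.t_from_v(concepts, regions, regions)[0]
        return regions, concepts

# Example usage with random features.
if __name__ == "__main__":
    mia = MutualAttentionSketch()
    r, c = mia(torch.randn(2, 36, 512), torch.randn(2, 10, 512))
    print(r.shape, c.shape)  # (2, 36, 512) and (2, 10, 512)
```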
SSDM: Scalable Speech Dysfluency Modeling
Speech dysfluency modeling is the core module for spoken language learning and speech therapy. However, there are three challenges. First, current state-of-the-art solutions [1, 2] suffer from poor scalability. Second, there is a lack of a large-scale dysfluency corpus. Third, there is no effective learning framework. In this paper, we propose SSDM: Scalable Speech Dysfluency Modeling, which (1) adopts articulatory gestures as scalable forced alignment; (2) introduces the connectionist subsequence aligner (CSA) to achieve dysfluency alignment; (3) introduces a large-scale simulated dysfluency corpus called Libri-Dys; and (4) develops an end-to-end system by leveraging the power of large language models (LLMs). We expect SSDM to serve as a standard in the area of dysfluency modeling.
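The connectionist subsequence aligner itself is not spelled out in this abstract, but the underlying notion of aligning a possibly dysfluent phone sequence against a reference can be illustrated with a small edit-distance-style alignment. The labels and example phones below are invented for illustration and are not the paper's CSA.

```python
def align_subsequence(reference, spoken):
    """Toy edit-style alignment of a reference phone sequence to a
    (possibly dysfluent) spoken phone sequence.  Extra spoken phones are
    tagged as insertions/repetitions, missing ones as deletions.  This is
    an illustration only, not the paper's connectionist subsequence aligner."""
    n, m = len(reference), len(spoken)
    # dp[i][j] = minimal cost of aligning reference[:i] with spoken[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0:
                dp[i][j] = j          # all spoken phones are insertions
            elif j == 0:
                dp[i][j] = i          # all reference phones are deletions
            else:
                diag = dp[i - 1][j - 1] + (reference[i - 1] != spoken[j - 1])
                dp[i][j] = min(diag, dp[i - 1][j] + 1, dp[i][j - 1] + 1)

    # Backtrace to label each spoken phone.
    i, j, labels = n, m, []
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dp[i][j] == dp[i - 1][j - 1] + (reference[i - 1] != spoken[j - 1])):
            tag = "match" if reference[i - 1] == spoken[j - 1] else "substitution"
            labels.append((spoken[j - 1], tag))
            i, j = i - 1, j - 1
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            labels.append((spoken[j - 1], "insertion/repetition"))
            j -= 1
        else:
            labels.append((reference[i - 1], "deletion"))
            i -= 1
    return labels[::-1]

# "p l iy z" spoken with stutter-like repetitions of "p" and "l".
print(align_subsequence("p l iy z".split(), "p p l l iy z".split()))
```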
Learning to Decouple the Lights for 3D Face Texture Modeling
Tianxin Huang, Ying Tai
Existing research has made impressive strides in reconstructing human facial shapes and textures from images with well-illuminated faces and minimal external occlusions. Nevertheless, it remains challenging to recover accurate facial textures in scenarios with complicated illumination affected by external occlusions, e.g., a face partially obscured by items such as a hat. Existing works based on the assumption of a single, uniform illumination cannot correctly process such data. In this work, we introduce a novel approach to model 3D facial textures under such unnatural illumination. Instead of assuming a single illumination, our framework learns to imitate the unnatural illumination as a composition of multiple separate light conditions combined with learned neural representations, an approach we name Light Decoupling. Through experiments on both single images and video sequences, we demonstrate the effectiveness of our approach in modeling facial textures under challenging illumination affected by occlusions.
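As a rough illustration of the decoupling idea (not the paper's formulation), one can fit an observed image as an albedo map modulated by a weighted composition of several separately learned shading components. The shapes, number of light components, and multiplicative shading model below are assumptions.

```python
import torch

# Toy "light decoupling": explain an observed image as albedo times a
# weighted sum of several separately learned shading maps.  Everything
# here (shapes, number of lights, loss) is an assumption for illustration.
H, W, K = 64, 64, 3                          # image size, number of light components
observed = torch.rand(3, H, W)               # stand-in for a real face crop
albedo = torch.rand(3, H, W, requires_grad=True)
shadings = torch.rand(K, 1, H, W, requires_grad=True)   # per-light grayscale shading
weights = torch.zeros(K, requires_grad=True)            # per-light contribution (softmaxed)

opt = torch.optim.Adam([albedo, shadings, weights], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    mix = torch.softmax(weights, dim=0).view(K, 1, 1, 1)
    illumination = (mix * shadings).sum(dim=0)           # composed illumination (1, H, W)
    rendered = albedo * illumination
    loss = torch.nn.functional.mse_loss(rendered, observed)
    loss.backward()
    opt.step()
print(float(loss))
```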
Re-randomized Densification for One Permutation Hashing and Bin-wise Consistent Weighted Sampling
Ping Li, Xiaoyun Li, Cun-Hui Zhang
Jaccard similarity is widely used as a distance measure in many machine learning and search applications. Typically, hashing methods are essential for the use of Jaccard similarity to be practical in large-scale settings. For hashing binary (0/1) data, the idea of one permutation hashing (OPH) with densification significantly accelerates traditional minwise hashing algorithms while providing unbiased and accurate estimates. In this paper, we propose a "re-randomization" strategy in the process of densification and we show that it achieves the smallest variance among existing densification schemes. The success of this idea inspires us to generalize one permutation hashing to weighted (non-binary) data, resulting in the so-called "bin-wise consistent weighted sampling (BCWS)" algorithm. We analyze the behavior of BCWS and compare it with a recent alternative. Experiments on a range of datasets and tasks confirm the effectiveness of the proposed methods. We expect that BCWS will be adopted in practice for training kernel machines and fast similarity search.
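For readers unfamiliar with OPH, the sketch below illustrates one permutation hashing on binary data with a densification step in which an empty bin borrows from a donor bin and re-ranks the donor's elements with fresh randomness, rather than copying the donor's stored minimum. The donor-selection walk and hashing details are simplified assumptions, not the paper's exact estimator or its BCWS generalization.

```python
import random

def oph_sketch(nonzero_indices, dim, k, seed=0):
    """One permutation hashing (OPH) on a binary vector (given by its
    non-zero indices), with a densification step that re-randomizes the
    borrowed bin.  A hedged sketch only; donor selection and the hashing
    scheme are simplified for readability."""
    rng = random.Random(seed)
    perm = list(range(dim))
    rng.shuffle(perm)                     # one permutation, shared by all vectors
    bin_size = dim // k                   # assume k divides dim for simplicity

    # Group the permuted non-zero locations by bin.
    by_bin = [[] for _ in range(k)]
    for idx in nonzero_indices:
        p = perm[idx]
        by_bin[p // bin_size].append(p)

    sketch = []
    for b in range(k):
        if by_bin[b]:                     # non-empty bin: usual OPH minimum
            sketch.append(min(by_bin[b]))
            continue
        # Empty bin: walk to a donor bin with randomness keyed only by
        # (seed, b, attempt), so different vectors resolve the same empty
        # bin in the same way.
        attempt = 0
        while True:
            donor = random.Random(hash((seed, b, attempt))).randrange(k)
            if by_bin[donor]:
                break
            attempt += 1
        # Re-randomization: rank the donor bin's elements with fresh
        # randomness tied to the borrowing bin, instead of reusing the
        # donor's stored minimum.
        sketch.append(min(by_bin[donor],
                          key=lambda p: random.Random(hash((seed, 1, b, p))).random()))
    return sketch

def jaccard_estimate(s1, s2):
    """Estimated Jaccard similarity = fraction of colliding bins."""
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)

# Example: two overlapping sets of non-zero coordinates.
x, y = set(range(0, 60)), set(range(30, 90))
sx, sy = oph_sketch(x, dim=1024, k=128), oph_sketch(y, dim=1024, k=128)
print(jaccard_estimate(sx, sy))          # roughly |x & y| / |x | y| = 30 / 90
```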
Transfer Q: Principled Decoding for LLM Alignment
Ming Yin
Aligning foundation models is essential for their safe and trustworthy deployment. However, traditional fine-tuning methods are computationally intensive and require updating billions of model parameters. A promising alternative, alignment via decoding, adjusts the response distribution directly without model updates to maximize a target reward r, thus providing a lightweight and adaptable framework for alignment.
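As a concrete (and deliberately generic) instance of decoding-time alignment, the sketch below reranks candidate responses from a frozen base model by a reward model, sampling in proportion to exp(reward / beta). It is a baseline illustration of the setting, not the paper's Transfer Q procedure; `generate_fn` and `reward_fn` are placeholder callables the reader must supply.

```python
import math
import random

def decode_with_reward(generate_fn, reward_fn, prompt, n=8, beta=1.0, seed=0):
    """Toy decoding-time alignment: draw n candidate responses from the
    frozen base model and pick one with probability proportional to
    exp(reward / beta).  A generic reward-guided decoding baseline, not
    the paper's method.  `generate_fn(prompt) -> str` and
    `reward_fn(prompt, response) -> float` are placeholders."""
    rng = random.Random(seed)
    candidates = [generate_fn(prompt) for _ in range(n)]
    rewards = [reward_fn(prompt, c) for c in candidates]
    # Softmax over rewards; as beta -> 0 this recovers plain best-of-n
    # (argmax reward), as beta -> inf it recovers the base distribution.
    m = max(rewards)
    weights = [math.exp((r - m) / beta) for r in rewards]
    return rng.choices(candidates, weights=weights, k=1)[0]
```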