AITopics | tang

Collaborating Authors

tang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

d71a4a6c796cacd9b8a298589943cdf3-Supplemental-Conference.pdf

Neural Information Processing Systems Feb-12-2026, 04:48:08 GMT

The codes related to dataset, model, loss, training pipeline and experiment are enclosed. Cross-Domain MAFLAFLWMAFLWR 300W Supervised learning TCDCN [13] XX 7.95 7.65 - 5.54 MTCNN [12] XX 5.39 6.90 - WingLoss [3] XX - - - 4.04 Generative modeling based DeformingAE [9] OX 5.45 - - ImGen. [4] After the initialization period, the intra pseudo-paired data xd1)d1, xd2)d2 and inter pseudo-paired data xd1)d2, xd2)d1 are generated with latent space exploration described at Section 3.2. At last, semantic matching loss LM is utilized to get intra semantic matching loss LM1 and inter semantic matching loss LM2. We provide more examples of pseudo-paired data on various combinations of original and pair domains in Fig. 3.

artificial intelligence, machine learning, proceedings, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

CogView: Mastering Text-to-Image Generation via Transformers

Neural Information Processing Systems Feb-10-2026, 11:13:28 GMT

Then in the stage 2, an auto-regressive model (such as PixelCNN [47]) learns to fit the prior of hidden variables.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Jiangsu Province > Changzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

8682cc30db9c025ecd3fee433f8ab54c-Paper.pdf

Neural Information Processing Systems Feb-9-2026, 06:13:59 GMT

dimension, graph, spectral, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

VTC-LFC: Vision Transformer Compression with Low-Frequency Components

Neural Information Processing Systems Feb-9-2026, 05:29:05 GMT

However, the compression only in the spatial domain suffers from a dramatic performance drop without finetuning and is not robust to noise, as the noise in the spatial domain can easily confuse the pruning criteria, leading to some parameters/channels being pruned incorrectly.

artificial intelligence, machine learning, pruning, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Bernoulli f n Z

Neural Information Processing Systems Feb-8-2026, 17:25:01 GMT

At time node of 2 have example, We simulate equally UASE, techniques omnib d = 7, while visualisation, above, 1. Cross-sectional: The 2. Longitudinal: The In this section stability described embedding P(1),. Independent UASE, on P tdt dT, but U thelinearvT, while d= ran P)isoftend.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Bristol (0.05)
Asia (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting

Neural Information Processing Systems Dec-24-2025, 02:41:20 GMT

Speculative decoding has demonstrated its effectiveness in accelerating the inference of large language models (LLMs) while maintaining an identical sampling distribution. However, the conventional approach of training a separate draft model to achieve a satisfactory token acceptance rate can be costly and impractical. In this paper, we propose a novel self-speculative decoding framework, Kangaroo, with a double early exiting strategy, which leverages the shallow sub-network and the LM Head of the well-trained target LLM to construct a self-drafting model. Then, the self-verification stage only requires computing the remaining layers over the early-exited hidden states in parallel. To bridge the representation gap between the sub-network and the full model, we train a lightweight and efficient adapter module on top of the sub-network.
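The draft-then-verify loop that speculative decoding relies on can be sketched in a few lines. This is a minimal toy under greedy decoding, not Kangaroo's implementation: it has no early exiting and no adapter, and the names `greedy_decode`, `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical. It only illustrates why verification keeps the output identical to what the target model would produce on its own.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a prefix to the next token


def greedy_decode(target: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one target call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(target(seq))
    return seq


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], n_new: int, k: int = 4) -> List[Token]:
    """Draft up to k tokens, keep the prefix the target agrees with."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # Drafting: the cheap model proposes tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # Verification: a real system scores all positions in one parallel
        # pass; this toy checks them one by one for clarity.
        for i, tok in enumerate(proposal):
            expected = target(seq + proposal[:i])
            if tok != expected:
                # Keep the agreed prefix, then the target's own token.
                seq.extend(proposal[:i])
                seq.append(expected)
                break
        else:
            seq.extend(proposal)  # every drafted token was accepted
    return seq


def toy_target(seq: List[Token]) -> Token:
    """Hypothetical stand-in for the full model."""
    return (sum(seq) * 31 + len(seq)) % 50


def toy_draft(seq: List[Token]) -> Token:
    """Hypothetical draft that disagrees with the target on some prefixes."""
    tok = toy_target(seq)
    return tok if len(seq) % 3 else (tok + 1) % 50
```

Every token appended to the sequence is either a drafted token the target agreed with or the target's own choice, so the result matches plain greedy decoding regardless of how good the draft is. The draft quality affects only how many target passes are saved.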

artificial intelligence, large language model, natural language, (11 more...)

Neural Information Processing Systems

Country: North America > United States (0.07)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)

Temporal Object-Aware Vision Transformer for Few-Shot Video Object Detection

Kumar, Yogesh, Mishra, Anand

arXiv.org Artificial Intelligence Nov-19-2025

Few-shot Video Object Detection (FSVOD) addresses the challenge of detecting novel objects in videos with limited labeled examples, overcoming the constraints of traditional detection methods that require extensive training data. This task presents key challenges, including maintaining temporal consistency across frames affected by occlusion and appearance variations, and achieving novel object generalization without relying on complex region proposals, which are often computationally expensive and require task-specific training. Our novel object-aware temporal modeling approach addresses these challenges by incorporating a filtering mechanism that selectively propagates high-confidence object features across frames. This enables efficient feature progression, reduces noise accumulation, and enhances detection accuracy in a few-shot setting. By utilizing few-shot trained detection and classification heads with focused feature propagation, we achieve robust temporal consistency without depending on explicit object tube proposals. Our approach achieves performance gains, with AP improvements of 3.7% (FSVOD-500), 5.3% (FSYTV-40), 4.3% (VidOR), and 4.5% (VidVRD) in the 5-shot setting. Further results demonstrate improvements in 1-shot, 3-shot, and 10-shot configurations. We make the code public at: https://github.com/yogesh-iitj/fs-video-vit
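The idea of propagating only high-confidence object features across frames can be illustrated with a simple confidence gate. This is a toy sketch, not the authors' model: detections are reduced to `(object id, confidence)` pairs, and the function name `propagate_confident` and the `threshold` parameter are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Detection = Tuple[str, float]  # (object id, confidence score)


def propagate_confident(frames: List[List[Detection]],
                        threshold: float = 0.5) -> List[Dict[str, float]]:
    """Carry forward only detections whose confidence passes the gate."""
    memory: Dict[str, float] = {}
    outputs: List[Dict[str, float]] = []
    for detections in frames:
        current = dict(memory)  # start from what earlier frames propagated
        for obj, score in detections:
            if score >= threshold:
                # Confident detection: store it, keeping the best score seen.
                current[obj] = max(score, current.get(obj, 0.0))
            # Low-confidence detections are dropped, so they cannot
            # accumulate as noise in later frames.
        outputs.append(current)
        memory = current
    return outputs
```

In this sketch an object that becomes occluded or blurry in a later frame is still represented by its earlier confident detection, while a weak detection never enters the memory.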

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.13784

Country: Asia > India (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Let Me Show You: Learning by Retrieving from Egocentric Video for Robotic Manipulation

Zhu, Yichen, Feng, Feifei

arXiv.org Artificial Intelligence Nov-10-2025

Robots operating in complex and uncertain environments face considerable challenges. Advanced robotic systems often rely on extensive datasets to learn manipulation tasks. In contrast, when humans are faced with unfamiliar tasks, such as assembling a chair, a common approach is to learn by watching video demonstrations. In this paper, we propose a novel method for learning robot policies by Retrieving-from-Video (RfV), using analogies from human demonstrations to address manipulation tasks. Our system constructs a video bank comprising recordings of humans performing diverse daily tasks. To enrich the knowledge from these videos, we extract mid-level information, such as object affordance masks and hand motion trajectories, which serve as additional inputs to enhance the robot model's learning and generalization capabilities. We further feature a dual-component system: a video retriever that taps into an external video bank to fetch task-relevant video based on task specification, and a policy generator that integrates this retrieved knowledge into the learning cycle. This approach enables robots to craft adaptive responses to various scenarios and generalize to tasks beyond those in the training data. Through rigorous testing in multiple simulated and real-world settings, our system demonstrates a marked improvement in performance over conventional robotic systems, showcasing a significant breakthrough in the field of robotics.
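The retriever half of such a dual-component system can be sketched as nearest-neighbour search over embeddings. This is a minimal illustration that assumes the task specification and each video in the bank have already been embedded as vectors. The abstract does not say how retrieval is scored, so cosine similarity is an assumption here, and the names `cosine`, `retrieve`, and the bank entries are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: Sequence[float],
             bank: Dict[str, Sequence[float]],
             top_k: int = 1) -> List[str]:
    """Return the names of the top_k videos most similar to the query."""
    ranked = sorted(bank, key=lambda name: cosine(query, bank[name]), reverse=True)
    return ranked[:top_k]


# Hypothetical video bank: names and embeddings are made up for illustration.
video_bank = {
    "assemble_chair": [1.0, 0.0, 0.0],
    "pour_water": [0.0, 1.0, 0.0],
    "open_drawer": [0.9, 0.1, 0.0],
}
```

The retrieved clips would then be handed to the policy generator as extra input. That second component is not sketched here.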

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.05199

Country: Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

DANIEL: A Distributed and Scalable Approach for Global Representation Learning with EHR Applications

Wang, Zebin, Gan, Ziming, Tang, Weijing, Xia, Zongqi, Cai, Tianrun, Cai, Tianxi, Lu, Junwei

arXiv.org Artificial Intelligence Nov-5-2025

Classical probabilistic graphical models face fundamental challenges in modern data environments, which are characterized by high dimensionality, source heterogeneity, and stringent data-sharing constraints. In this work, we revisit the Ising model, a well-established member of the Markov Random Field (MRF) family, and develop a distributed framework that enables scalable and privacy-preserving representation learning from large-scale binary data with inherent low-rank structure. Our approach optimizes a non-convex surrogate loss function via bi-factored gradient descent, offering substantial computational and communication advantages over conventional convex approaches. We evaluate our algorithm on multi-institutional electronic health record (EHR) datasets from 58,248 patients across the University of Pittsburgh Medical Center (UPMC) and Mass General Brigham (MGB), demonstrating superior performance in global representation learning and downstream clinical tasks, including relationship detection, patient phenotyping, and patient clustering. These results highlight a broader potential for statistical inference in federated, high-dimensional settings while addressing the practical challenges of data complexity and multi-institutional integration.
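For orientation, the two ingredients named in the abstract can be written in their textbook forms. This is a generic sketch, not the paper's exact formulation: the ±1 coding, the omission of node-wise terms, the loss symbol f, the step size η, and the rank r are assumptions, and the authors' surrogate loss and treatment of symmetry may differ.

```latex
% Ising model over binary variables (pairwise terms only, +/-1 coding assumed)
P_\Theta(x) \;\propto\; \exp\Big( \sum_{i<j} \Theta_{ij}\, x_i x_j \Big),
\qquad x \in \{-1,+1\}^p

% Low-rank parameterization of the interaction matrix
\Theta = U V^\top, \qquad U, V \in \mathbb{R}^{p \times r}, \quad r \ll p

% Generic bi-factored gradient step on a smooth loss f(\Theta)
U_{t+1} = U_t - \eta\, \nabla f\big(U_t V_t^\top\big)\, V_t,
\qquad
V_{t+1} = V_t - \eta\, \nabla f\big(U_t V_t^\top\big)^\top U_t
```

Updating the factors directly keeps storage and communication at order pr rather than p², which is the usual motivation for factored methods. The objective in U and V is non-convex even when f is convex in Θ.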

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2511.02754

Country: