AITopics | Liu, Xinyu

Collaborating Authors

Liu, Xinyu

Almost Sure Convergence Rates and Concentration of Stochastic Approximation and Reinforcement Learning with Markovian Noise

Qian, Xiaochi, Xie, Zixuan, Liu, Xinyu, Zhang, Shangtong

arXiv.org Machine Learning, Nov-20-2024

This paper establishes the first almost sure convergence rate and the first maximal concentration bound with exponential tails for general contractive stochastic approximation algorithms with Markovian noise. As a corollary, we also obtain convergence rates in $L^p$. Key to our success is a novel discretization of the mean ODE of stochastic approximation algorithms using intervals with diminishing (instead of constant) length. As applications, we provide the first almost sure convergence rate for $Q$-learning with Markovian samples without count-based learning rates. We also provide the first concentration bound for off-policy temporal difference learning with Markovian samples.
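To make the setting in the abstract above concrete, here is a minimal sketch of tabular $Q$-learning driven by a single unbroken trajectory (Markovian samples) with a learning rate that depends only on the global step `t`, not on per-pair visit counts. It illustrates the kind of algorithm the paper analyzes and is not the paper's own construction. The two-state MDP, the function name `q_learning_markovian`, the uniform behaviour policy, and the step-size exponent `0.6` are all assumptions chosen for illustration. For this toy MDP with discount 0.5, the optimal values are $Q^*(s, 1) = 2$ and $Q^*(s, 0) = 1$ for both states.

```python
import random


def q_learning_markovian(num_steps=20000, gamma=0.5, seed=0):
    """Tabular Q-learning along one trajectory of a hypothetical 2-state MDP."""
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for t in range(num_steps):
        # Uniform behaviour policy (an assumption; any exploring policy would do).
        action = rng.randrange(n_actions)
        # Toy dynamics (assumed): action a moves to state a; landing in state 1 pays 1.
        next_state = action
        reward = 1.0 if next_state == 1 else 0.0
        # Diminishing step size that depends on t only, not on visit counts.
        alpha = 1.0 / (t + 1) ** 0.6
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        # No reset between samples: consecutive samples are correlated (Markovian).
        state = next_state
    return q


if __name__ == "__main__":
    for row in q_learning_markovian():
        print([round(v, 3) for v in row])
```

Because the next sample always starts where the previous one ended, the noise in the update is Markovian rather than i.i.d., which is the regime the abstract refers to.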

artificial intelligence, machine learning, stochastic approximation and reinforcement learning, (2 more...)

arXiv.org Machine Learning

2411.13711

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition

Liu, Jingyu, Liu, Xinyu, Qu, Mingzhe, Lyu, Tianyi

arXiv.org Artificial Intelligence, Oct-13-2024

Integrating IoT technology into basketball action recognition enhances sports analytics, providing crucial insights into player performance and game strategy. However, existing methods often fall short in terms of accuracy and efficiency, particularly in complex, real-time environments where player movements are frequently occluded or involve intricate interactions. To overcome these challenges, we propose the EITNet model, a deep learning framework that combines EfficientDet for object detection, I3D for spatiotemporal feature extraction, and TimeSformer for temporal analysis, all integrated with IoT technology for seamless real-time data collection and processing. Our contributions include developing a robust architecture that improves recognition accuracy to 92%, surpassing the baseline EfficientDet model's 87%, and reducing loss to below 5.0 compared to EfficientDet's 9.0 over 50 epochs. Furthermore, the integration of IoT technology enhances real-time data processing, providing adaptive insights into player performance and strategy. The paper details the design and implementation of EITNet, experimental validation, and a comprehensive evaluation against existing models. The results demonstrate EITNet's potential to significantly advance automated sports analysis and optimize data utilization for player performance and strategy improvement.

artificial intelligence, machine learning, real time system, (21 more...)

arXiv.org Artificial Intelligence

2410.09954

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games (0.69)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models

Liu, Xinyu, Zhao, Runsong, Huang, Pengcheng, Xiao, Chunyang, Li, Bei, Wang, Jingang, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence, Oct-6-2024

Numerous recent works aim to extend the effective context length of language models, and various methods, tasks, and benchmarks exist to measure a model's effective memorization length. However, through thorough investigation, we find limitations in existing evaluations of a model's memorization capability. We provide an extensive survey of these limitations in this work and propose a new method, called the forgetting curve, to measure the memorization capability of long-context models. We show that the forgetting curve has the advantages of being robust to the tested corpus and the experimental settings, of not relying on prompts, and of being applicable to any model size. We apply our forgetting curve to a large variety of models involving both transformer and RNN/SSM based architectures. Our measurement provides empirical evidence for the effectiveness of transformer extension techniques while raising questions about the effective length of RNN/SSM based models. We also examine the difference between our measurement and existing benchmarks as well as popular metrics for various models. Our code and results can be found at https://github.com/1azybug/ForgettingCurve.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.04727

Country: Asia > China > Liaoning Province (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)

More Effective LLM Compressed Tokens with Uniformly Spread Position Identifiers and Compression Loss

Zhao, Runsong, Huang, Pengcheng, Liu, Xinyu, Xiao, Chunyang, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence, Sep-27-2024

Compressing Transformer inputs into compressed tokens allows running LLMs with improved speed and cost efficiency. Based on the compression method ICAE, we carefully examine the position identifier choices for compressed tokens and also propose a new compression loss. We demonstrate empirically that our proposed methods achieve significantly higher compression ratios (15x compared to 4x for ICAE), while being able to attain comparable reconstruction performance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2409.14364

Country: Asia > China (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

Liu, Xinyu, Zhang, Jing, Zhang, Kexin, Yang, Yuting, Jiao, Licheng, Yang, Shuyuan

arXiv.org Artificial Intelligence, Jun-5-2024

Video Object Segmentation (VOS) is a vital task in computer vision, focusing on distinguishing foreground objects from the background across video frames. Our work draws inspiration from the Cutie model, and we investigate the effects of object memory, the total number of memory frames, [...]

Motion Expression guided Video Segmentation (MeViS) Track is designed to advance the study of natural language-guided video understanding in complex environments, with the goal of fostering the development of a more comprehensive and robust pixel-level understanding of video scenes in such settings and realistic scenarios through the inclusion of new videos, sentences, and annotations [7].

artificial intelligence, object-oriented architecture, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2406.03668

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.36)

A Fourier Approach to the Parameter Estimation Problem for One-dimensional Gaussian Mixture Models

Liu, Xinyu, Zhang, Hai

arXiv.org Machine Learning, Apr-18-2024

The purpose of this paper is twofold. First, we propose a novel algorithm for estimating parameters in one-dimensional Gaussian mixture models (GMMs). The algorithm takes advantage of the Hankel structure inherent in the Fourier data obtained from independent and identically distributed (i.i.d.) samples of the mixture. For GMMs with a unified variance, a singular value ratio functional using the Fourier data is introduced and used to resolve the variance and component number simultaneously. The consistency of the estimator is derived. Compared to classic algorithms such as the method of moments and the maximum likelihood method, the proposed algorithm does not require prior knowledge of the number of Gaussian components or good initial guesses. Numerical experiments demonstrate its superior performance in estimation accuracy and computational cost. Second, we reveal that there exists a fundamental limit to the problem of estimating the number of Gaussian components, or model order, in the mixture model if the number of i.i.d. samples is finite. For the case of a single variance, we show that the model order can be successfully estimated only if the minimum separation distance between the component means exceeds a certain threshold value, and that estimation can fail if the separation falls below that threshold. We derive a lower bound for this threshold value, referred to as the computational resolution limit, in terms of the number of i.i.d. samples, the variance, and the number of Gaussian components. Numerical experiments confirm this phase transition phenomenon in estimating the model order. Moreover, we demonstrate that our algorithm achieves better scores in likelihood, AIC, and BIC when compared to the EM algorithm.
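As a rough illustration of what "Fourier data" means here, the sketch below computes the empirical characteristic function of samples from a one-dimensional GMM with a shared variance and compares it with the exact value $e^{-\sigma^2\omega^2/2}\sum_k \pi_k e^{i\mu_k\omega}$. Dividing out the Gaussian envelope leaves a sum of complex exponentials, the kind of signal whose uniformly spaced samples form a low-rank Hankel matrix. This is the standard identity only, not the paper's estimator; the function names and the example parameters are assumptions made for illustration.

```python
import cmath
import math
import random


def sample_mixture(weights, means, sigma, n, seed=0):
    """Draw n i.i.d. samples from a 1-D GMM whose components share sigma."""
    rng = random.Random(seed)
    picks = rng.choices(means, weights=weights, k=n)
    return [rng.gauss(mu, sigma) for mu in picks]


def empirical_char_fn(samples, omega):
    """Empirical characteristic function (the Fourier data) at frequency omega."""
    return sum(cmath.exp(1j * omega * x) for x in samples) / len(samples)


def mixture_char_fn(weights, means, sigma, omega):
    """Exact characteristic function of the same mixture."""
    envelope = math.exp(-0.5 * (sigma * omega) ** 2)
    return envelope * sum(
        w * cmath.exp(1j * m * omega) for w, m in zip(weights, means)
    )


if __name__ == "__main__":
    # Hypothetical example parameters.
    weights, means, sigma = [0.4, 0.6], [-1.0, 2.0], 0.5
    data = sample_mixture(weights, means, sigma, n=20000)
    for omega in (0.5, 1.0, 2.0):
        gap = abs(empirical_char_fn(data, omega)
                  - mixture_char_fn(weights, means, sigma, omega))
        print(omega, gap)
```

The gap between the two shrinks at roughly the rate $1/\sqrt{n}$, which is consistent with the abstract's point that a finite sample size limits how finely the component means can be resolved.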

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2404.12613

Country:

Asia > China > Hong Kong (0.14)
North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network

Cai, Zhiqiang, Ding, Tong, Liu, Min, Liu, Xinyu, Xia, Jianlin

arXiv.org Artificial Intelligence, Apr-7-2024

In this paper, we propose a structure-guided Gauss-Newton (SgGN) method for solving least squares problems using a shallow ReLU neural network. The method effectively takes advantage of both the least squares structure and the neural network structure of the objective function. By categorizing the weights and biases of the hidden and output layers of the network as nonlinear and linear parameters, respectively, the method iterates back and forth between the nonlinear and linear parameters. The nonlinear parameters are updated by a damped Gauss-Newton method and the linear ones are updated by a linear solver. Moreover, at the Gauss-Newton step, a special form of the Gauss-Newton matrix is derived for the shallow ReLU neural network and is used for efficient iterations. It is shown that the corresponding mass and Gauss-Newton matrices in the respective linear and nonlinear steps are symmetric and positive definite under reasonable assumptions. Thus, the SgGN method naturally produces an effective search direction without the need for additional techniques, such as the shifting used in the Levenberg-Marquardt method, to achieve invertibility of the Gauss-Newton matrix. The convergence and accuracy of the method are demonstrated numerically for several challenging function approximation problems, especially those with discontinuities or sharp transition layers that pose significant challenges for commonly used training algorithms in machine learning.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Artificial Intelligence

2404.05064

Country: North America > United States > Indiana > Tippecanoe County (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Breaking Data Silos: Cross-Domain Learning for Multi-Agent Perception from Independent Private Sources

Li, Jinlong, Li, Baolu, Liu, Xinyu, Xu, Runsheng, Ma, Jiaqi, Yu, Hongkai

arXiv.org Artificial Intelligence, Feb-6-2024

The diverse agents in multi-agent perception systems may come from different companies. Each company might use the same classic neural-network-based encoder architecture for feature extraction. However, the data used to train the various agents is independent and private to each company, leading to a Distribution Gap between the private data used to train distinct agents in a multi-agent perception system. The data silos created by this Distribution Gap could result in a significant performance decline in multi-agent perception. In this paper, we thoroughly examine the impact of the distribution gap on existing multi-agent perception systems. To break the data silos, we introduce the Feature Distribution-aware Aggregation (FDA) framework for cross-domain learning to mitigate this Distribution Gap in multi-agent perception. FDA comprises two key components, the Learnable Feature Compensation Module and the Distribution-aware Statistical Consistency Module, both aimed at enhancing intermediate features to minimize the distribution gap among multi-agent features. Extensive experiments on the public OPV2V and V2XSet datasets underscore FDA's effectiveness in point cloud-based 3D object detection, presenting it as an invaluable augmentation to existing multi-agent perception systems.

artificial intelligence, distribution gap, perception, (14 more...)

arXiv.org Artificial Intelligence

2402.04273

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Government (0.79)
Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations

Liu, Xinyu, Ding, Yan, An, Kaikai, Xiao, Chunyang, Madhyastha, Pranava, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence, Jul-20-2023

While state-of-the-art NLP models have demonstrated excellent performance for aspect-based sentiment analysis (ABSA), substantial evidence has been presented on their lack of robustness. This is especially manifested as significant degradation in performance when faced with out-of-distribution data. Recent solutions that rely on counterfactually augmented datasets show promising results, but they are inherently limited because of the lack of access to explicit causal structure. In this paper, we present an alternative approach that relies on non-counterfactual data augmentation. Our proposal instead uses noisy, cost-efficient data augmentations that preserve semantics associated with the target aspect. Our approach then models invariances between different versions of the data to improve robustness. A comprehensive suite of experiments shows that our proposal significantly improves upon strong pre-trained baselines on both standard and robustness-specific datasets. Our approach further establishes a new state-of-the-art on the ABSA robustness benchmark and transfers well across domains.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2306.13971

Country:

Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

Hori, Chiori, Peng, Puyuan, Harwath, David, Liu, Xinyu, Ota, Kei, Jain, Siddarth, Corcodel, Radu, Jha, Devesh, Romeres, Diego, Roux, Jonathan Le

arXiv.org Artificial Intelligence, Jun-27-2023

To realize human-robot collaboration, robots need to execute actions for new tasks according to human instructions given finite prior knowledge. Human experts can share their knowledge of how to perform a task with a robot through multi-modal instructions in their demonstrations, showing a sequence of short-horizon steps to achieve a long-horizon goal. This paper introduces a method for robot action sequence generation from instruction videos using (1) an audio-visual Transformer that converts audio-visual features and instruction speech to a sequence of robot actions called dynamic movement primitives (DMPs) and (2) style-transfer-based training that employs multi-task learning with video captioning and weakly-supervised learning with a semantic classifier to exploit unpaired video-action data. We built a system that accomplishes various cooking actions, where an arm robot executes a DMP sequence acquired from a cooking video using the audio-visual Transformer. Experiments with Epic-Kitchen-100, YouCookII, QuerYD, and in-house instruction video datasets show that the proposed method improves the quality of DMP sequences by 2.3 times the METEOR score obtained with a baseline video-to-action Transformer. The model achieved a task success rate of 32% with task knowledge of the object.

artificial intelligence, machine learning, video, (16 more...)

arXiv.org Artificial Intelligence

2306.15644

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Workflow (0.68)

Industry: Education > Educational Technology > Audio & Video (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.34)
