AITopics | Wu, Yan

Collaborating Authors

Wu, Yan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Training Objectives Improve Embedding-based Retrieval in Industrial Recommendation Systems

Kolodner, Matthew, Ju, Mingxuan, Fan, Zihao, Zhao, Tong, Ghazizadeh, Elham, Wu, Yan, Shah, Neil, Liu, Yozen

arXiv.org Artificial IntelligenceSep-22-2024

Improving recommendation systems (RS) can greatly enhance the user experience across many domains, such as social media. Many RS utilize embedding-based retrieval (EBR) approaches to retrieve candidates for recommendation. In an EBR system, the embedding quality is key. According to recent literature, self-supervised multitask learning (SSMTL) has showed strong performance on academic benchmarks in embedding learning and resulted in an overall improvement in multiple downstream tasks, demonstrating a larger resilience to the adverse conditions between each downstream task and thereby increased robustness and task generalization ability through the training objective. However, whether or not the success of SSMTL in academia as a robust training objectives translates to large-scale (i.e., over hundreds of million users and interactions in-between) industrial RS still requires verification. Simply adopting academic setups in industrial RS might entail two issues. Firstly, many self-supervised objectives require data augmentations (e.g., embedding masking/corruption) over a large portion of users and items, which is prohibitively expensive in industrial RS. Furthermore, some self-supervised objectives might not align with the recommendation task, which might lead to redundant computational overheads or negative transfer. In light of these two challenges, we evaluate using a robust training objective, specifically SSMTL, through a large-scale friend recommendation system on a social media platform in the tech sector, identifying whether this increase in robustness can work at scale in enhancing retrieval in the production setting. Through online A/B testing with SSMTL-based EBR, we observe statistically significant increases in key metrics in the friend recommendations, with up to 5.45% improvements in new friends made and 1.91% improvements in new friends made with cold-start users.

artificial intelligence, machine learning, recommendation, (14 more...)

arXiv.org Artificial Intelligence

2409.14682

Country:

Europe > Italy (0.17)
North America > United States (0.16)

Genre:

Instructional Material (1.00)
Research Report > Experimental Study (0.49)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Analysis of Impulsive Interference in Digital Audio Broadcasting Systems in Electric Vehicles

Chen, Chin-Hung, Huang, Wen-Hung, Karanov, Boris, Young, Alex, Wu, Yan, van Houtum, Wim

arXiv.org Artificial IntelligenceMay-17-2024

Recently, new types of interference in electric vehicles (EVs), such as converters switching and/or battery chargers, have been found to degrade the performance of wireless digital transmission systems. Measurements show that such an interference is characterized by impulsive behavior and is widely varying in time. This paper uses recorded data from our EV testbed to analyze the impulsive interference in the digital audio broadcasting band. Moreover, we use our analysis to obtain a corresponding interference model. In particular, we studied the temporal characteristics of the interference and confirmed that its amplitude indeed exhibits an impulsive behavior. Our results show that impulsive events span successive received signal samples and thus indicate a bursty nature. To this end, we performed a data-driven modification of a well-established model for bursty impulsive interference, the Markov-Middleton model, to produce synthetic noise realization. We investigate the optimal symbol detector design based on the proposed model and show significant performance gains compared to the conventional detector based on the additive white Gaussian noise assumption.

artificial intelligence, interference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.10828

Country: Europe > Netherlands (0.33)

Genre: Research Report > New Finding (0.69)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Data-Driven Symbol Detection for Intersymbol Interference Channels with Bursty Impulsive Noise

Karanov, Boris, Chen, Chin-Hung, Wu, Yan, Young, Alex, van Houtum, Wim

arXiv.org Artificial IntelligenceMay-17-2024

We developed machine learning approaches for data-driven trellis-based soft symbol detection in coded transmission over intersymbol interference (ISI) channels in presence of bursty impulsive noise (IN), for example encountered in wireless digital broadcasting systems and vehicular communications. This enabled us to obtain optimized detectors based on the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm while circumventing the use of full channel state information (CSI) for computing likelihoods and trellis state transition probabilities. First, we extended the application of the neural network (NN)-aided BCJR, recently proposed for ISI channels with additive white Gaussian noise (AWGN). Although suitable for estimating likelihoods via labeling of transmission sequences, the BCJR-NN method does not provide a framework for learning the trellis state transitions. In addition to detection over the joint ISI and IN states we also focused on another scenario where trellis transitions are not trivial: detection for the ISI channel with AWGN with inaccurate knowledge of the channel memory at the receiver. Without access to the accurate state transition matrix, the BCJR- NN performance significantly degrades in both settings. To this end, we devised an alternative approach for data-driven BCJR detection based on the unsupervised learning of a hidden Markov model (HMM). The BCJR-HMM allowed us to optimize both the likelihood function and the state transition matrix without labeling. Moreover, we demonstrated the viability of a hybrid NN and HMM BCJR detection where NN is used for learning the likelihoods, while the state transitions are optimized via HMM. While reducing the required prior channel knowledge, the examined data-driven detectors with learned trellis state transitions achieve bit error rates close to the optimal full CSI-based BCJR, significantly outperforming detection with inaccurate CSI.

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2405.10814

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.48)
Media > Television (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations

Zhang, Yuwei, Wu, Yan, Liu, Yanming, Peng, Xinyue

arXiv.org Artificial IntelligenceMar-22-2024

Object detection methods under known single degradations have been extensively investigated. However, existing approaches require prior knowledge of the degradation type and train a separate model for each, limiting their practical applications in unpredictable environments. To address this challenge, we propose a chain-of-thought (CoT) prompted adaptive enhancer, CPA-Enhancer, for object detection under unknown degradations. Specifically, CPA-Enhancer progressively adapts its enhancement strategy under the step-by-step guidance of CoT prompts, that encode degradation-related information. To the best of our knowledge, it's the first work that exploits CoT prompting for object detection tasks. Overall, CPA-Enhancer is a plug-and-play enhancement model that can be integrated into any generic detectors to achieve substantial gains on degraded images, without knowing the degradation type priorly. Experimental results demonstrate that CPA-Enhancer not only sets the new state of the art for object detection but also boosts the performance of other downstream vision tasks under unknown degradations.

artificial intelligence, machine learning, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2403.1122

Country: Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multi-perspective Feedback-attention Coupling Model for Continuous-time Dynamic Graphs

Zhu, Xiaobo, Wu, Yan, Li, Zhipeng, Su, Hailong, Che, Jin, Chen, Zhanheng, Wang, Liying

arXiv.org Artificial IntelligenceDec-13-2023

Recently, representation learning over graph networks has gained popularity, with various models showing promising results. Despite this, several challenges persist: 1) most methods are designed for static or discrete-time dynamic graphs; 2) existing continuous-time dynamic graph algorithms focus on a single evolving perspective; and 3) many continuous-time dynamic graph approaches necessitate numerous temporal neighbors to capture long-term dependencies. In response, this paper introduces the Multi-Perspective Feedback-Attention Coupling (MPFA) model. MPFA incorporates information from both evolving and raw perspectives, efficiently learning the interleaved dynamics of observed processes. The evolving perspective employs temporal self-attention to distinguish continuously evolving temporal neighbors for information aggregation. Through dynamic updates, this perspective can capture long-term dependencies using a small number of temporal neighbors. Meanwhile, the raw perspective utilizes a feedback attention module with growth characteristic coefficients to aggregate raw neighborhood information. Experimental results on a self-organizing dataset and seven public datasets validate the efficacy and competitiveness of our proposed model.

data mining, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2312.07983

Country:

North America > United States (0.46)
Asia > China > Ningxia Hui Autonomous Region (0.14)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.31)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Visual-Policy Learning through Multi-Camera View to Single-Camera View Knowledge Distillation for Robot Manipulation Tasks

Acar, Cihan, Binici, Kuluhan, Tekirdağ, Alp, Wu, Yan

arXiv.org Artificial IntelligenceDec-2-2023

The use of multi-camera views simultaneously has been shown to improve the generalization capabilities and performance of visual policies. However, the hardware cost and design constraints in real-world scenarios can potentially make it challenging to use multiple cameras. In this study, we present a novel approach to enhance the generalization performance of vision-based Reinforcement Learning (RL) algorithms for robotic manipulation tasks. Our proposed method involves utilizing a technique known as knowledge distillation, in which a pre-trained ``teacher'' policy trained with multiple camera viewpoints guides a ``student'' policy in learning from a single camera viewpoint. To enhance the student policy's robustness against camera location perturbations, it is trained using data augmentation and extreme viewpoint changes. As a result, the student policy learns robust visual features that allow it to locate the object of interest accurately and consistently, regardless of the camera viewpoint. The efficacy and efficiency of the proposed method were evaluated both in simulation and real-world environments. The results demonstrate that the single-view visual student policy can successfully learn to grasp and lift a challenging object, which was not possible with a single-view policy alone. Furthermore, the student policy demonstrates zero-shot transfer capability, where it can successfully grasp and lift objects in real-world scenarios for unseen visual configurations.

machine learning, reinforcement learning, student policy, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2023.3336245

2303.07026

Country: Asia > Singapore (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Dynamic Link Prediction for New Nodes in Temporal Graph Networks

Zhu, Xiaobo, Wu, Yan, Zhang, Qinhu, Chen, Zhanheng, He, Ying

arXiv.org Artificial IntelligenceOct-15-2023

Modelling temporal networks for dynamic link prediction of new nodes has many real-world applications, such as providing relevant item recommendations to new customers in recommender systems and suggesting appropriate posts to new users on social platforms. Unlike old nodes, new nodes have few historical links, which poses a challenge for the dynamic link prediction task. Most existing dynamic models treat all nodes equally and are not specialized for new nodes, resulting in suboptimal performances. In this paper, we consider dynamic link prediction of new nodes as a few-shot problem and propose a novel model based on the meta-learning principle to effectively mitigate this problem. Specifically, we develop a temporal encoder with a node-level span memory to obtain a new node embedding, and then we use a predictor to determine whether the new node generates a link. To overcome the few-shot challenge, we incorporate the encoder-predictor into the meta-learning paradigm, which can learn two types of implicit information during the formation of the temporal network through span adaptation and node adaptation. The acquired implicit information can serve as model initialisation and facilitate rapid adaptation to new nodes through a fine-tuning process on just a few links. Experiments on three publicly available datasets demonstrate the superior performance of our model compared to existing state-of-the-art methods.

data mining, machine learning, node, (21 more...)

arXiv.org Artificial Intelligence

2310.09787

Country: Asia > China (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.68)

Industry: Media > News (0.49)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

On a continuous time model of gradient descent dynamics and instability in deep learning

Rosca, Mihaela, Wu, Yan, Qin, Chongli, Dherin, Benoit

arXiv.org Machine LearningSep-13-2023

The recipe behind the success of deep learning has been the combination of neural networks and gradient-based optimization. Understanding the behavior of gradient descent however, and particularly its instability, has lagged behind its empirical success. To add to the theoretical tools available to study gradient descent we propose the principal flow (PF), a continuous time flow that approximates gradient descent dynamics. To our knowledge, the PF is the only continuous flow that captures the divergent and oscillatory behaviors of gradient descent, including escaping local minima and saddle points. Through its dependence on the eigendecomposition of the Hessian the PF sheds light on the recently observed edge of stability phenomena in deep learning. Using our new understanding of instability we propose a learning rate adaptation method which enables us to control the trade-off between training stability and test set evaluation performance.

artificial intelligence, gradient descent, machine learning, (17 more...)

arXiv.org Machine Learning

2302.01952

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AbDiffuser: Full-Atom Generation of In-Vitro Functioning Antibodies

Martinkus, Karolis, Ludwiczak, Jan, Cho, Kyunghyun, Liang, Wei-Ching, Lafrance-Vanasse, Julien, Hotzel, Isidro, Rajpal, Arvind, Wu, Yan, Bonneau, Richard, Gligorijevic, Vladimir, Loukas, Andreas

arXiv.org Artificial IntelligenceJul-28-2023

We introduce AbDiffuser, an equivariant and physics-informed diffusion model for the joint generation of antibody 3D structures and sequences. AbDiffuser is built on top of a new representation of protein structure, relies on a novel architecture for aligned proteins, and utilizes strong diffusion priors to improve the denoising process. Our approach improves protein diffusion by taking advantage of domain knowledge and physics-based constraints; handles sequence-length changes; and reduces memory complexity by an order of magnitude enabling backbone and side chain generation. Numerical experiments showcase the ability of AbDiffuser to generate antibodies that closely track the sequence and structural properties of a reference set. Laboratory experiments confirm that all 16 HER2 antibodies discovered were expressed at high levels and that 57.1% of selected designs were tight binders. We focus on the generation of immunoglobulin proteins, also known as antibodies, that help the immune ...

abdiffuser, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2308.05027

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

Protein Discovery with Discrete Walk-Jump Sampling

Frey, Nathan C., Berenberg, Daniel, Zadorozhny, Karina, Kleinhenz, Joseph, Lafrance-Vanasse, Julien, Hotzel, Isidro, Wu, Yan, Ra, Stephen, Bonneau, Richard, Cho, Kyunghyun, Loukas, Andreas, Gligorijevic, Vladimir, Saremi, Saeed

arXiv.org Artificial IntelligenceJun-8-2023

We resolve difficulties in training and sampling from a discrete generative model by learning a smoothed energy function, sampling from the smoothed data manifold with Langevin Markov chain Monte Carlo (MCMC), and projecting back to the true data manifold with one-step denoising. Our Discrete Walk-Jump Sampling formalism combines the maximum likelihood training of an energy-based model and improved sample quality of a score-based model, while simplifying training and sampling by requiring only a single noise level. We evaluate the robustness of our approach on generative modeling of antibody proteins and introduce the distributional conformity score to benchmark protein generative models. By optimizing and sampling from our models for the proposed distributional conformity score, 97-100% of generated samples are successfully expressed and purified and 35% of functional designs show equal or improved binding affinity compared to known functional antibodies on the first attempt in a single round of laboratory experiments. We also report the first demonstration of long-run fast-mixing MCMC chains where diverse antibody protein classes are visited in a single MCMC chain.

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Artificial Intelligence

2306.1236

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(3 more...)

Add feedback