AITopics | Yang, Weikai

Collaborating Authors

Yang, Weikai

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Structural-Entropy-Based Sample Selection for Efficient and Effective Learning

Xie, Tianchi, Zhu, Jiangning, Ma, Guozu, Lin, Minzhi, Chen, Wei, Yang, Weikai, Liu, Shixia

arXiv.org Artificial IntelligenceOct-5-2024

Sample selection improves the efficiency and effectiveness of machine learning models by providing informative and representative samples. Typically, samples can be modeled as a sample graph, where nodes are samples and edges represent their similarities. Most existing methods are based on local information, such as the training difficulty of samples, thereby overlooking global information, such as connectivity patterns. This oversight can result in suboptimal selection because global information is crucial for ensuring that the selected samples well represent the structural properties of the graph. To address this issue, we employ structural entropy to quantify global information and losslessly decompose it from the whole graph to individual nodes using the Shapley value. Based on the decomposition, we present $\textbf{S}$tructural-$\textbf{E}$ntropy-based sample $\textbf{S}$election ($\textbf{SES}$), a method that integrates both global and local information to select informative and representative samples. SES begins by constructing a $k$NN-graph among samples based on their similarities. It then measures sample importance by combining structural entropy (global metric) with training difficulty (local metric). Finally, SES applies importance-biased blue noise sampling to select a set of diverse and representative samples. Comprehensive experiments on three learning scenarios -- supervised learning, active learning, and continual learning -- clearly demonstrate the effectiveness of our method.

artificial intelligence, machine learning, proceedings, (17 more...)

arXiv.org Artificial Intelligence

2410.02268

Country:

Asia > China (0.28)
North America > United States > Wisconsin (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Foundation Models Meet Visualizations: Challenges and Opportunities

Yang, Weikai, Liu, Mengchen, Wang, Zheng, Liu, Shixia

arXiv.org Artificial IntelligenceOct-9-2023

Recent studies have indicated that foundation models, such as BERT and GPT, excel in adapting to a variety of downstream tasks. This adaptability has established them as the dominant force in building artificial intelligence (AI) systems. As visualization techniques intersect with these models, a new research paradigm emerges. This paper divides these intersections into two main areas: visualizations for foundation models (VIS4FM) and foundation models for visualizations (FM4VIS). In VIS4FM, we explore the primary role of visualizations in understanding, refining, and evaluating these intricate models. This addresses the pressing need for transparency, explainability, fairness, and robustness. Conversely, within FM4VIS, we highlight how foundation models can be utilized to advance the visualization field itself. The confluence of foundation models and visualizations holds great promise, but it also comes with its own set of challenges. By highlighting these challenges and the growing opportunities, this paper seeks to provide a starting point for continued exploration in this promising avenue.

data mining, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2310.05771

Country: Asia > China (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (0.68)
Information Technology (0.46)
Education > Educational Setting (0.46)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
(6 more...)

Add feedback

Interactive Steering of Hierarchical Clustering

Yang, Weikai, Wang, Xiting, Lu, Jie, Dou, Wenwen, Liu, Shixia

arXiv.org Machine LearningSep-21-2020

Hierarchical clustering is an important technique to organize big data for exploratory data analysis. However, existing one-size-fits-all hierarchical clustering methods often fail to meet the diverse needs of different users. To address this challenge, we present an interactive steering method to visually supervise constrained hierarchical clustering by utilizing both public knowledge (e.g., Wikipedia) and private knowledge from users. The novelty of our approach includes 1) automatically constructing constraints for hierarchical clustering using knowledge (knowledge-driven) and intrinsic data distribution (data-driven), and 2) enabling the interactive steering of clustering through a visual interface (user-driven). Our method first maps each data item to the most relevant items in a knowledge base. An initial constraint tree is then extracted using the ant colony optimization algorithm. The algorithm balances the tree width and depth and covers the data items with high confidence. Given the constraint tree, the data items are hierarchically clustered using evolutionary Bayesian rose tree. To clearly convey the hierarchical clustering results, an uncertainty-aware tree visualization has been developed to enable users to quickly locate the most uncertain sub-hierarchies and interactively improve them. The quantitative evaluation and case study demonstrate that the proposed approach facilitates the building of customized clustering trees in an efficient and effective manner.

artificial intelligence, health & medicine, hierarchy, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TVCG.2020.2995100

2009.09618

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting > Higher Education (0.67)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Diagnosing Concept Drift with Visual Analytics

Yang, Weikai, Li, Zhen, Liu, Mengchen, Lu, Yafeng, Cao, Kelei, Maciejewski, Ross, Liu, Shixia

arXiv.org Machine LearningSep-15-2020

Concept drift is a phenomenon in which the distribution of a data stream changes over time in unforeseen ways, causing prediction models built on historical data to become inaccurate. While a variety of automated methods have been developed to identify when concept drift occurs, there is limited support for analysts who need to understand and correct their models when drift is detected. In this paper, we present a visual analytics method, DriftVis, to support model builders and analysts in the identification and correction of concept drift in streaming data. DriftVis combines a distribution-based drift detection method with a streaming scatterplot to support the analysis of drift caused by the distribution changes of data streams and to explore the impact of these changes on the model's accuracy. A quantitative experiment and two case studies on weather prediction and text classification have been conducted to demonstrate our proposed tool and illustrate how visual analytics can be used to support the detection, examination, and correction of concept drift.

comp, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

2007.14372

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback