AITopics | Gou, Liang

Collaborating Authors

Gou, Liang


MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification

Sengupta, Saptarshi, Vashistha, Harsh, Curtis, Kristal, Mallipeddi, Akshay, Mathur, Abhinav, Ross, Joseph, Gou, Liang

arXiv.org Artificial Intelligence | Jan-10-2025

Extending the capabilities of Large Language Models (LLMs) with functions or tools for environment interaction has led to the emergence of the agent paradigm. In industry, training an LLM is not always feasible because of the scarcity of domain data, legal holds on proprietary customer data, rapidly changing business requirements, and the need to prototype new assistants. Agents provide an elegant solution to the above by relying on the zero-shot reasoning abilities of the underlying LLM and utilizing tools to explore and reason over customer data and respond to user requests. However, there are two concerns here: (I) acquiring large scale customer queries for agent testing is time-consuming, and (II) high reliance on the tool call sequence (or trajectory) followed by the agent to respond to user queries may lead to unexpected or incorrect behavior. To address this, we propose MAG-V, a multi-agent framework to first generate a dataset of questions that mimic customer queries; and second, reverse-engineer alternate questions from the responses for trajectory verification. Initial results indicate that our synthetic data can improve agent performance on actual customer queries. Furthermore, our trajectory verification methodology, inspired by distant supervision and using traditional machine learning (ML) models, outperforms a GPT-4o judge baseline by 11% accuracy and matches the performance of a GPT-4 judge on our constructed dataset. Overall, our approach is a step towards unifying diverse task agents into a cohesive framework for achieving an aligned objective.

large language model, machine learning, trajectory

2412.04494

Country:

Asia > Thailand (0.15)
North America > United States (0.14)
Oceania > Australia (0.14)
North America > Canada (0.14)

Genre: Research Report (0.64)

Industry: Law (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates

Huang, Jinbin, He, Wenbin, Gou, Liang, Ren, Liu, Bryan, Chris

arXiv.org Artificial Intelligence | Nov-6-2023

Deep learning models are widely used in critical applications, highlighting the need for pre-deployment model understanding and improvement. Visual concept-based methods, while increasingly used for this purpose, face challenges: (1) most concepts lack interpretability, and (2) existing methods require model knowledge, which is often unavailable at run time. Additionally, (3) there is no no-code method for post-understanding model improvement. To address these challenges, we present InterVLS. The system facilitates model understanding by discovering text-aligned concepts and measuring their influence with model-agnostic linear surrogates. Employing visual analytics, InterVLS offers concept-based explanations and performance insights. It enables users to adjust concept influences to update a model, facilitating no-code model improvement. We evaluate InterVLS in a user study, illustrating its functionality with two scenarios. Results indicate that InterVLS is effective in helping users identify concepts that are influential to a model, gain insights, and adjust concept influence to improve the model. We conclude with a discussion based on our study results.

intervls, machine learning, natural language

2311.03547

Country:

North America > United States (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)


GradOrth: A Simple yet Efficient Out-of-Distribution Detection with Orthogonal Projection of Gradients

Behpour, Sima, Doan, Thang, Li, Xin, He, Wenbin, Gou, Liang, Ren, Liu

arXiv.org Artificial Intelligence | Aug-1-2023

Detecting out-of-distribution (OOD) data is crucial for ensuring the safe deployment of machine learning models in real-world applications. However, existing OOD detection approaches primarily rely on the feature maps or the full gradient space information to derive OOD scores, neglecting the role of the most important parameters of the pre-trained network over in-distribution (ID) data. In this study, we propose a novel approach called GradOrth to facilitate OOD detection, based on one intriguing observation: the important features for identifying OOD data lie in the lower-rank subspace of in-distribution (ID) data. In particular, we identify OOD data by computing the norm of the gradient projection on the subspaces considered important for the in-distribution data. A large orthogonal projection value (i.e., a small projection value) marks the sample as OOD, as it captures a weak correlation with the ID data. This simple yet effective method exhibits outstanding performance, showcasing a notable reduction in the average false positive rate at a 95% true positive rate (FPR95) of up to 8% when compared to the current state-of-the-art methods.
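The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the ID-important subspace is already available as a list of orthonormal basis vectors, and the names `projection_norm`, `is_ood`, and `threshold` are hypothetical. How GradOrth actually obtains the subspace and the gradient is not described in the abstract.

```python
import math


def projection_norm(gradient, basis):
    """Norm of the projection of `gradient` onto span(basis).

    Assumption: `basis` is a list of orthonormal vectors (not checked here).
    """
    # Coefficient of the gradient along each basis vector (dot products).
    coeffs = [sum(g * b for g, b in zip(gradient, vec)) for vec in basis]
    # For an orthonormal basis, the projection norm is the norm of the coefficients.
    return math.sqrt(sum(c * c for c in coeffs))


def is_ood(gradient, basis, threshold):
    """Flag a sample as OOD when its gradient projects weakly onto the ID subspace.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    return projection_norm(gradient, basis) < threshold


# Toy 3-D example: the ID subspace is the x-y plane.
id_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
in_plane = [3.0, 4.0, 0.0]      # lies fully inside the subspace
orthogonal = [0.0, 0.0, 5.0]    # lies fully outside the subspace
```

In this toy setting the in-plane gradient keeps its full norm after projection, while the orthogonal one projects to zero and would be flagged for any positive threshold.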

artificial intelligence, dataset, machine learning

2308.0031

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)


Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection

Doan, Thang, Li, Xin, Behpour, Sima, He, Wenbin, Gou, Liang, Ren, Liu

arXiv.org Artificial Intelligence | Jun-25-2023

Open World Object Detection (OWOD) is a challenging and realistic task that extends beyond the scope of the standard object detection task. It involves detecting both known and unknown objects while integrating learned knowledge for future tasks. However, the level of 'unknownness' varies significantly depending on the context. For example, a tree is typically considered part of the background in a self-driving scene, but it may be significant in a household context. We argue that this external or contextual information should already be embedded within the known classes. In other words, there should be a semantic or latent structural relationship between the known and unknown items to be discovered. Motivated by this observation, we propose Hyp-OW, a method that learns and models a hierarchical representation of known items through a SuperClass Regularizer. Leveraging this learned representation allows us to effectively detect unknown objects using a Similarity Distance-based Relabeling module. Extensive experiments on benchmark datasets demonstrate the effectiveness of Hyp-OW, which improves both known and unknown detection (by up to 6 points). These findings are particularly pronounced in our newly designed benchmark, where a strong hierarchical structure exists between known and unknown objects.
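The title refers to a hyperbolic distance, but the abstract does not say which model of hyperbolic space is used. As an illustration only, the sketch below computes the geodesic distance in the Poincaré ball model, a common choice for embedding hierarchies. The function name `poincare_distance` is hypothetical and this is not presented as the paper's formulation.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumption: both points have Euclidean norm strictly less than 1.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


origin = [0.0, 0.0]
near_center = [0.5, 0.0]
near_boundary = [0.9, 0.0]
```

Distances grow rapidly toward the boundary of the ball, which is why points near the center can act as general (parent) concepts and points near the boundary as specific (child) concepts.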

category, machine learning, natural language

2306.14291

Country:

North America > United States (0.28)
Asia > Middle East > Israel (0.14)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)


CLIP-S$^4$: Language-Guided Self-Supervised Semantic Segmentation

He, Wenbin, Jamonnak, Suphanut, Gou, Liang, Ren, Liu

arXiv.org Artificial Intelligence | May-1-2023

Existing semantic segmentation approaches are often limited by costly pixel-wise annotations and predefined classes. In this work, we present CLIP-S$^4$ that leverages self-supervised pixel representation learning and vision-language models to enable various semantic segmentation tasks (e.g., unsupervised, transfer learning, language-driven segmentation) without any human annotations and unknown class information. We first learn pixel embeddings with pixel-segment contrastive learning from different augmented views of images. To further improve the pixel embeddings and enable language-driven semantic segmentation, we design two types of consistency guided by vision-language models: 1) embedding consistency, aligning our pixel embeddings to the joint feature space of a pre-trained vision-language model, CLIP; and 2) semantic consistency, forcing our model to make the same predictions as CLIP over a set of carefully designed target classes with both known and unknown prototypes. Thus, CLIP-S$^4$ enables a new task of class-free semantic segmentation where no unknown class information is needed during training. As a result, our approach shows consistent and substantial performance improvement over four popular benchmarks compared with the state-of-the-art unsupervised and language-driven semantic segmentation methods. More importantly, our method outperforms these methods on unknown class recognition by a large margin.

machine learning, natural language, segmentation

2305.0104

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)


Self-supervised Semantic Segmentation Grounded in Visual Concepts

He, Wenbin, Surmeier, William, Shekar, Arvind Kumar, Gou, Liang, Ren, Liu

arXiv.org Artificial Intelligence | Jul-26-2022

Unsupervised semantic segmentation requires assigning a label to every pixel without any human annotations. Despite recent advances in self-supervised representation learning for individual images, unsupervised semantic segmentation with pixel-level representations is still a challenging task and remains underexplored. In this work, we propose a self-supervised pixel representation learning method for semantic segmentation by using visual concepts (i.e., groups of pixels with semantic meanings, such as parts, objects, and scenes) extracted from images. To guide self-supervised learning, we leverage three types of relationships between pixels and concepts, including the relationships between pixels and local concepts, local and global concepts, as well as the co-occurrence of concepts. We evaluate the learned pixel embeddings and visual concepts on three datasets, including PASCAL VOC 2012, COCO 2017, and DAVIS 2017. Our results show that the proposed method gains consistent and substantial improvements over recent unsupervised semantic segmentation approaches, and also demonstrate that visual concepts can reveal insights into image datasets.

artificial intelligence, machine learning, natural language

2203.13868

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)


Towards a Flexible Embedding Learning Framework

Yeh, Chin-Chia Michael, Gelda, Dhruv, Zhuang, Zhongfang, Zheng, Yan, Gou, Liang, Zhang, Wei

arXiv.org Artificial Intelligence | Sep-23-2020

Representation learning is a fundamental building block for analyzing entities in a database. While the existing embedding learning methods are effective in various data mining problems, their applicability is often limited because these methods have pre-determined assumptions on the type of semantics captured by the learned embeddings, and the assumptions may not well align with specific downstream tasks. In this work, we propose an embedding learning framework that 1) uses an input format that is agnostic to input data type, 2) is flexible in terms of the relationships that can be embedded into the learned representations, and 3) provides an intuitive pathway to incorporate domain knowledge into the embedding learning process. Our proposed framework utilizes a set of entity-relation-matrices as the input, which quantifies the affinities among different entities in the database. Moreover, a sampling mechanism is carefully designed to establish a direct connection between the input and the information captured by the output embeddings. To complete the representation learning toolbox, we also outline a simple yet effective post-processing technique to properly visualize the learned embeddings. Our empirical results demonstrate that the proposed framework, in conjunction with a set of relevant entity-relation-matrices, outperforms the existing state-of-the-art approaches in various data mining tasks.

matrix, survey article, text processing

2009.10989

Country:

Asia (0.93)
North America > United States (0.46)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Industry:

Consumer Products & Services > Restaurants (0.70)
Banking & Finance (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Dynamic Graph Representation Learning via Self-Attention Networks

Sankar, Aravind, Wu, Yanhong, Gou, Liang, Zhang, Wei, Yang, Hao

arXiv.org Machine Learning | Dec-21-2018

Learning latent representations of nodes in graphs is an important and ubiquitous task with widespread applications such as link prediction, node classification, and graph visualization. Previous methods for graph representation learning mainly focus on static graphs; however, many real-world graphs are dynamic and evolve over time. In this paper, we present Dynamic Self-Attention Network (DySAT), a novel neural architecture that operates on dynamic graphs and learns node representations that capture both structural properties and temporal evolutionary patterns. Specifically, DySAT computes node representations by jointly employing self-attention layers along two dimensions: structural neighborhood and temporal dynamics. We conduct link prediction experiments on two classes of graphs: communication networks and bipartite rating networks. Our experimental results show that DySAT has a significant performance gain over several different state-of-the-art graph embedding baselines.
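The building block named in this abstract, self-attention, can be sketched in a few lines. This is generic scaled dot-product self-attention, not the DySAT architecture: it assumes identity query, key, and value projections, a single head, and no masking, and the function names are hypothetical. The input could be read either as the embeddings of a node's neighbors (structural) or as one node's embeddings across graph snapshots (temporal).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Each output vector is a weighted average of all input vectors, with
    weights given by the softmax of scaled dot-product similarities.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs


# Hypothetical embeddings of one node at three time steps.
snapshots = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(snapshots)
```

A full model would add learned projection matrices, multiple heads, and, for the temporal dimension, a mask so that a time step attends only to earlier ones; none of those details are specified in the abstract.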

deep learning, neural network, representation

1812.0943

Country:

North America > Canada (0.68)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)


Social Lens: Personalization Around User Defined Collections for Filtering Enterprise Message Streams

Daly, Elizabeth M. (IBM Research, Cambridge) | Muller, Michael (IBM Research, Cambridge) | Gou, Liang (The Pennsylvania State University) | Millen, David R. (IBM Research, Cambridge)

AAAI Conferences | Jul-12-2011

Social media has led to a data explosion and has begun to play an ever-increasing role as a valuable source of information and a mechanism for information discovery. The wealth of data highlights the need for methods to filter and sort information so that users can discover what is useful. Most traditional solutions focus on the user: either the user's social network, or a form of personalization based on collaborative filtering or predictive user modeling. This paper presents a novel algorithm to view information through a lens based on a user-defined collection while excluding the attributes of the user from the analysis. As a result, the lens is transparent, tunable, and sharable amongst users; it also reduces information overload while surfacing new related content.

artificial intelligence, information, social media

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Pennsylvania (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
