AITopics | Fang, Haoyang

Fang, Haoyang

Effectively Steer LLM To Follow Preference via Building Confident Directions

Song, Bingqing, Han, Boran, Zhang, Shuai, Wang, Hao, Fang, Haoyang, Min, Bonan, Wang, Yuyang, Hong, Mingyi

arXiv.org Artificial Intelligence, Mar-4-2025

Having an LLM that aligns with human preferences is essential for accommodating individual needs, such as maintaining writing style or generating specific topics of interest. The majority of current alignment methods rely on fine-tuning or prompting, which can be either costly or difficult to control. Model steering algorithms, which modify the model output by constructing specific steering directions, are typically easy to implement and optimization-free. However, their capabilities are typically limited to steering the model in one of two directions (i.e., bidirectional steering), and there has been no theoretical understanding to guarantee their performance. In this work, we propose a theoretical framework to understand and quantify model steering methods. Inspired by the framework, we propose a confident direction steering method (CONFST) that steers LLMs by modifying their activations at inference time. More specifically, CONFST builds a confident direction that is closely aligned with users' preferences, and this direction is then added to the activations of the LLMs to effectively steer the model output. Our approach offers three key advantages over popular bidirectional model steering methods: 1) It is more powerful, since multiple (i.e., more than two) users' preferences can be aligned simultaneously; 2) It is simple to implement, since there is no need to determine which layer to add the steering vector to; 3) No explicit user instruction is required. We validate our method on GPT-2 XL (1.5B), Mistral (7B) and Gemma-it (9B) models for tasks that require shifting the output of LLMs across various topics and styles, achieving superior performance over competing methods.
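The idea of adding a direction to activations at inference time can be illustrated with a small NumPy sketch. This is a generic mean-difference steering example on toy data, not CONFST's confident-direction construction, which the abstract does not specify. The function names `steering_direction` and `steer`, the strength `alpha`, and the random "activations" are all assumptions made for illustration.

```python
import numpy as np

def steering_direction(target_acts, other_acts):
    """Unit vector pointing from the mean of other_acts to the mean of target_acts."""
    diff = target_acts.mean(axis=0) - other_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def steer(hidden, direction, alpha):
    """Add alpha * direction to every row of hidden (one row per token)."""
    return hidden + alpha * direction

rng = np.random.default_rng(0)
dim = 8

# Toy stand-ins for activations collected on two kinds of prompts (hypothetical data).
target_acts = rng.normal(loc=1.0, size=(32, dim))
other_acts = rng.normal(loc=-1.0, size=(32, dim))

direction = steering_direction(target_acts, other_acts)

# Toy hidden states for 4 tokens; a real model would supply these at inference time.
hidden = rng.normal(size=(4, dim))
steered = steer(hidden, direction, alpha=3.0)

# Because direction has unit norm, each row moves by alpha along it.
shift = (steered - hidden) @ direction
```

In a real model the addition would typically happen inside a forward pass (for example through a hook on a chosen layer); the sketch only shows the arithmetic of the shift.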

large language model, machine learning, natural language, (16 more...)

arXiv:2503.02989

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models

Tang, Zhiqiang, Fang, Haoyang, Zhou, Su, Yang, Taojiannan, Zhong, Zihan, Hu, Tony, Kirchhoff, Katrin, Karypis, George

arXiv.org Artificial Intelligence, Apr-30-2024

AutoGluon-Multimodal (AutoMM) is introduced as an open-source AutoML library designed specifically for multimodal learning. Distinguished by its exceptional ease of use, AutoMM enables fine-tuning of foundation models with just three lines of code. Supporting various modalities including image, text, and tabular data, both independently and in combination, the library offers a comprehensive suite of functionalities spanning classification, regression, object detection, semantic matching, and image segmentation. Experiments across diverse datasets and tasks showcase AutoMM's superior performance in basic classification and regression tasks compared to existing AutoML tools, while also demonstrating competitive results in advanced tasks, on par with specialized toolboxes designed for such purposes.

machine learning, natural language, text classification, (20 more...)

arXiv:2404.16233

Country:

North America > United States (0.46)
North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Information Technology > Services (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
(6 more...)

Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model

Zhong, Zihan, Tang, Zhiqiang, He, Tong, Fang, Haoyang, Yuan, Chun

arXiv.org Artificial Intelligence, Jan-31-2024

The Segment Anything Model (SAM) stands as a foundational framework for image segmentation. While it exhibits remarkable zero-shot generalization in typical scenarios, its advantage diminishes when applied to specialized domains like medical imagery and remote sensing. To address this limitation, this paper introduces Conv-LoRA, a simple yet effective parameter-efficient fine-tuning approach. By integrating ultra-lightweight convolutional parameters into Low-Rank Adaptation (LoRA), Conv-LoRA can inject image-related inductive biases into the plain ViT encoder, further reinforcing SAM's local prior assumption. Notably, Conv-LoRA not only preserves SAM's extensive segmentation knowledge but also revives its capacity for learning high-level image semantics, which is constrained by SAM's foreground-background segmentation pretraining. Comprehensive experimentation across diverse benchmarks spanning multiple domains underscores Conv-LoRA's superiority in adapting SAM to real-world semantic segmentation tasks.
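For readers unfamiliar with Low-Rank Adaptation, the following NumPy sketch shows the plain LoRA update that Conv-LoRA builds on: a frozen weight plus a trainable low-rank product. It does not include the convolutional parameters the abstract mentions, since their placement is not described here. The dimensions, the `scale` parameter, and the name `lora_forward` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4

# Frozen pretrained weight (toy values); it is never updated during fine-tuning.
W = rng.normal(size=(d_in, d_out))

# Trainable low-rank factors. B starts at zero so the adapted layer
# initially reproduces the frozen layer exactly.
A = rng.normal(scale=0.01, size=(d_in, rank))
B = np.zeros((rank, d_out))

def lora_forward(x, W, A, B, scale=1.0):
    """Frozen projection plus a scaled low-rank correction."""
    return x @ W + scale * (x @ A) @ B

x = rng.normal(size=(2, d_in))
y = lora_forward(x, W, A, B)

# Parameter counts: full fine-tuning versus the low-rank factors only.
full_params = W.size
lora_params = A.size + B.size
```

With these toy sizes the low-rank factors hold 512 values against 4096 in the full weight, which is the sense in which the approach is parameter-efficient.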

large language model, machine learning, segmentation, (18 more...)

arXiv:2401.17868

Country:

Europe (1.00)
Asia > China (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

A Transformer-Based Substitute Recommendation Model Incorporating Weakly Supervised Customer Behavior Data

Ye, Wenting, Yang, Hongfei, Zhao, Shuai, Fang, Haoyang, Shi, Xingjian, Neppalli, Naveen

arXiv.org Artificial Intelligence, Apr-8-2023

Substitute-based recommendation is widely used in e-commerce to offer customers better alternatives. However, existing research typically uses customer behavior signals such as co-view and view-but-purchase-another to capture the substitute relationship. Although this approach is intuitively sound, we find that it can ignore the functionality and characteristics of products. In this paper, we recast substitute recommendation as a language matching problem, taking the product title description as model input so that product functionality is considered. We design a new transformation method to de-noise the signals derived from production data. In addition, we consider multilingual support from an engineering point of view. Our proposed end-to-end transformer-based model succeeds in both offline and online experiments. The model has been deployed on a large-scale e-commerce website for 11 marketplaces in 6 languages, and an online A/B experiment shows that it increases revenue by 19%.
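Treating substitute recommendation as text matching over product titles can be illustrated with a deliberately simple stand-in. The sketch below ranks candidate titles by bag-of-words cosine similarity; it swaps in word counts where the paper uses a transformer, and the product titles, `bow`, `cosine`, and `rank_substitutes` are all hypothetical.

```python
import math
from collections import Counter

def bow(title):
    """Bag-of-words representation of a product title (lowercased word counts)."""
    return Counter(title.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_substitutes(query_title, candidate_titles):
    """Order candidate titles from most to least similar to the query title."""
    q = bow(query_title)
    return sorted(candidate_titles, key=lambda t: cosine(q, bow(t)), reverse=True)

# Hypothetical catalog entries, for illustration only.
query = "wireless optical mouse black"
candidates = [
    "usb phone charger cable",
    "wireless optical mouse white",
    "steel water bottle",
]
ranked = rank_substitutes(query, candidates)
best = ranked[0]
```

A learned text encoder would replace `bow` and `cosine` in practice, which is what lets a model match titles that describe the same function with different words.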

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:2211.02533

Country: North America > United States (0.15)

Genre: Research Report (0.64)

Industry: Information Technology > Services > e-Commerce Services (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
