Goto

Collaborating Authors

 Hu, Qian


Ticktack : Long Span Temporal Alignment of Large Language Models Leveraging Sexagenary Cycle Time Expression

arXiv.org Artificial Intelligence

Large language models (LLMs) suffer from temporal misalignment issues, especially across long spans of time. The issue arises because LLMs are trained on large amounts of data in which temporal information is sparse over long periods, such as thousands of years, resulting in insufficient learning or catastrophic forgetting. This paper proposes a methodology named "Ticktack" for addressing LLMs' long-time-span misalignment in a yearly setting. Specifically, we first propose to utilize the sexagenary year expression instead of the Gregorian year expression employed by LLMs, achieving a more uniform distribution at yearly granularity. Then, we employ polar coordinates to model the sexagenary cycle of 60 terms and the year order within each term, with additional temporal encoding to ensure LLMs understand them. Finally, we present a temporal representational alignment approach for post-training LLMs that effectively distinguishes time points with relevant knowledge, improving performance on time-related tasks, particularly over long periods. We also create a long-time-span benchmark for evaluation. Experimental results demonstrate the effectiveness of our proposal.
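
As a rough illustration of the first two steps, the sketch below (not the paper's code) maps a Gregorian year to a sexagenary index under the common convention that 4 CE opens a cycle, and then places it on a simple polar encoding; the paper's actual conventions and temporal encoding may differ.

```python
import math

def sexagenary_index(year: int) -> int:
    """Position (0-59) of a Gregorian year in the sexagenary cycle.

    Uses the common convention that 4 CE was the first (jiazi) year of a
    cycle; the paper's exact convention may differ.
    """
    return (year - 4) % 60

def polar_year_encoding(year: int) -> tuple[float, float, float]:
    """Encode a year as (cos, sin, cycles_elapsed).

    The angle places the year on the 60-term cycle; the cycle count keeps
    years with the same cyclic term distinguishable. This is only an
    illustration of the idea, not the paper's exact encoding.
    """
    idx = sexagenary_index(year)
    angle = 2 * math.pi * idx / 60
    cycles_elapsed = (year - 4) // 60
    return math.cos(angle), math.sin(angle), float(cycles_elapsed)

if __name__ == "__main__":
    for y in (1984, 2024, 964):
        print(y, sexagenary_index(y), polar_year_encoding(y))
```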


Quantitative Certification of Bias in Large Language Models

arXiv.org Artificial Intelligence

Warning: This paper contains model outputs which are offensive in nature. Large Language Models (LLMs) can produce responses that exhibit social biases and support stereotypes. However, conventional benchmarking is insufficient to thoroughly evaluate LLM bias, as it cannot scale to large sets of prompts and provides no guarantees. Therefore, we propose a novel certification framework, QuaCer-B (Quantitative Certification of Bias), that provides formal guarantees on obtaining unbiased responses from target LLMs under large sets of prompts. A certificate consists of high-confidence bounds on the probability of obtaining biased responses from the LLM for any set of prompts containing sensitive attributes, sampled from a distribution. We illustrate bias certification in LLMs for prompts with various prefixes drawn from given distributions. We consider distributions of random token sequences, mixtures of manual jailbreaks, and jailbreaks in the LLM's embedding space to certify its bias. We certify popular LLMs with QuaCer-B and present novel insights into their biases.
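
The certificate described here rests on a standard statistical idea: sample prompts from the distribution, judge the responses, and bound the bias probability with high confidence. The sketch below illustrates that idea with a one-sided Clopper-Pearson bound; QuaCer-B's actual certification procedure and bias judgments are not reproduced here, and the usage numbers are made up.

```python
from scipy.stats import beta

def clopper_pearson_upper(num_biased: int, num_samples: int, alpha: float = 0.05) -> float:
    """One-sided Clopper-Pearson upper bound on the bias probability.

    With confidence 1 - alpha, the true probability of a biased response
    under the sampled prompt distribution is at most this value.
    """
    if num_biased == num_samples:
        return 1.0
    return float(beta.ppf(1 - alpha, num_biased + 1, num_samples - num_biased))

# Hypothetical usage: 1000 prompts sampled, 7 responses judged biased.
print(clopper_pearson_upper(7, 1000))  # high-confidence bound on P(biased)
```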


Toward Informal Language Processing: Knowledge of Slang in Large Language Models

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have offered strong potential for natural language systems to process informal language. A representative form of informal language is slang, used commonly in daily conversations and online social media. To date, slang has not been comprehensively evaluated in LLMs, due partly to the absence of a carefully designed and publicly accessible benchmark. Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. For both evaluation and finetuning, we show the effectiveness of our dataset on two core applications: 1) slang detection, and 2) identification of regional and historical sources of slang from natural sentences. We also show how our dataset can be used to probe the output distributions of LLMs for interpretive insights. We find that while LLMs such as GPT-4 achieve good performance in a zero-shot setting, smaller BERT-like models finetuned on our dataset achieve comparable performance. Furthermore, we show that our dataset enables finetuning of LLMs such as GPT-3.5 that achieve substantially better performance than strong zero-shot baselines. Our work offers a comprehensive evaluation and a high-quality benchmark on English slang based on the OpenSubtitles corpus, serving both as a publicly accessible resource and a platform for applying tools for informal language processing.
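
For the slang-detection application, a finetuned BERT-like classifier can be used roughly as follows; the checkpoint name is hypothetical and the snippet only shows the shape of the task, not the paper's models or training setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "my-org/slang-detector" is a hypothetical finetuned checkpoint, not one
# released by the paper; any binary sequence classifier fits this pattern.
MODEL_NAME = "my-org/slang-detector"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def contains_slang(sentence: str) -> bool:
    """Return True if the classifier predicts that the sentence contains slang."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return bool(logits.argmax(dim=-1).item())

print(contains_slang("That movie was an absolute banger."))
```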


Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought

arXiv.org Artificial Intelligence

We introduce a novel framework, LM-Guided CoT, that leverages a lightweight (i.e., <1B) language model (LM) for guiding a black-box large (i.e., >10B) LM in reasoning tasks. Specifically, the lightweight LM first generates a rationale for each input instance. The frozen large LM is then prompted to predict a task output based on the rationale generated by the lightweight LM. Our approach is resource-efficient in the sense that it only requires training the lightweight LM. We optimize the model through 1) knowledge distillation and 2) reinforcement learning from rationale-oriented and task-oriented reward signals. We assess our method on the multi-hop extractive question answering (QA) benchmarks HotpotQA and 2WikiMultiHopQA. Experimental results show that our approach outperforms all baselines regarding answer prediction accuracy. We also find that reinforcement learning helps the model produce higher-quality rationales with improved QA performance.
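
A minimal sketch of the two-stage inference described above, with the two models abstracted as callables; the prompt templates are illustrative assumptions, not the paper's.

```python
from typing import Callable

def lm_guided_cot(
    question: str,
    context: str,
    small_lm: Callable[[str], str],   # lightweight (<1B) rationale generator
    large_lm: Callable[[str], str],   # frozen black-box (>10B) answer predictor
) -> str:
    """Two-stage inference in the spirit of LM-Guided CoT.

    The small LM drafts a rationale; the frozen large LM is then prompted
    with the question, the context, and that rationale to produce the final
    answer.
    """
    rationale_prompt = (
        f"Context: {context}\nQuestion: {question}\n"
        "Explain step by step how to answer:"
    )
    rationale = small_lm(rationale_prompt)

    answer_prompt = (
        f"Context: {context}\nQuestion: {question}\n"
        f"Rationale: {rationale}\nAnswer:"
    )
    return large_lm(answer_prompt)
```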


Faithful Model Evaluation for Model-Based Metrics

arXiv.org Artificial Intelligence

Statistical significance testing is used in natural language processing (NLP) to determine whether the results of a study or experiment are likely to be due to chance or whether they reflect a genuine relationship. A key step in significance testing is the estimation of the confidence interval, which is a function of the sample variance. Sample variance calculation is straightforward when evaluating against ground truth. However, in many cases, a metric model is used for evaluation instead. For example, to compare the toxicity of two large language models, a toxicity classifier is used for evaluation. Existing works usually do not consider the variance change due to metric model errors, which can lead to wrong conclusions. In this work, we establish the mathematical foundation of significance testing for model-based metrics. With experiments on public benchmark datasets and a production system, we show that accounting for metric model errors when calculating sample variances for model-based metrics changes the conclusions in certain experiments.
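
The effect described here, that errors in the metric model change the spread of the measured quantity, can be seen in a small simulation. The sketch below is only an illustration of the phenomenon, not the paper's derivation; the error rates, sample sizes, and true rate are made-up parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=2000, true_rate=0.10, fpr=0.05, fnr=0.10, trials=5000):
    """Monte Carlo illustration: spread of a measured toxicity rate when
    labels come from ground truth versus from an imperfect metric model
    with the given false-positive and false-negative rates."""
    gt_estimates, metric_estimates = [], []
    for _ in range(trials):
        y = rng.random(n) < true_rate                      # ground-truth labels
        flip = np.where(y, rng.random(n) < fnr, rng.random(n) < fpr)
        y_hat = np.where(flip, ~y, y)                      # metric-model labels
        gt_estimates.append(y.mean())
        metric_estimates.append(y_hat.mean())
    return np.std(gt_estimates), np.std(metric_estimates)

print(simulate())  # std of the estimate with and without metric-model noise
```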


Evaluating Large Language Models on Controlled Generation Tasks

arXiv.org Artificial Intelligence

While recent studies have looked into the abilities of large language models on various benchmark tasks, including question generation, reading comprehension, and multilingual tasks, there have been few studies looking into the controllability of large language models on generation tasks. We present an extensive analysis of various benchmarks, including a sentence planning benchmark with different granularities. After comparing large language models against state-of-the-art finetuned smaller models, we present a spectrum showing where large language models fall behind, are comparable to, or exceed the ability of smaller models. We conclude that **large language models struggle at meeting fine-grained hard constraints**.
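
As a concrete, assumed example (not taken from the paper's benchmarks) of what a fine-grained hard constraint can look like, a lexical "must contain these keywords" check is easy to state and to verify programmatically:

```python
def satisfies_keyword_constraints(generated: str, required_keywords: list[str]) -> bool:
    """Check a hard constraint of the 'must include these words' kind; the
    benchmarks' actual constraints vary (e.g., planning or ordering
    constraints), so this is only a representative check."""
    text = generated.lower()
    return all(kw.lower() in text for kw in required_keywords)

print(satisfies_keyword_constraints(
    "The chef quietly plated the dessert.", ["chef", "dessert", "plated"]))
```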


FLIRT: Feedback Loop In-context Red Teaming

arXiv.org Artificial Intelligence

Warning: this paper contains content that may be inappropriate or offensive. As generative models become available for public use in various applications, testing and analyzing vulnerabilities of these models has become a priority. Here we propose an automatic red teaming framework that evaluates a given model and exposes its vulnerabilities against unsafe and inappropriate content generation. Our framework uses in-context learning in a feedback loop to red team models and trigger them into unsafe content generation. We propose different in-context attack strategies to automatically learn effective and diverse adversarial prompts for text-to-image models. Our experiments demonstrate that, compared to baseline approaches, our proposed strategy is significantly more effective at exposing vulnerabilities in the Stable Diffusion (SD) model, even when the latter is enhanced with safety features. Furthermore, we demonstrate that the proposed framework is effective for red teaming text-to-text models, resulting in a significantly higher toxic response generation rate than previously reported.
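
A minimal sketch of such a feedback loop, with the attacker, target, and safety scorer abstracted as callables; the greedy exemplar update here is an assumption for illustration, whereas FLIRT proposes several specific in-context attack strategies.

```python
from typing import Callable, List

def in_context_red_team_loop(
    attacker_lm: Callable[[str], str],      # generates candidate adversarial prompts
    target_model: Callable[[str], str],     # model under test
    unsafe_score: Callable[[str], float],   # safety classifier on the target's output
    seed_prompts: List[str],
    iterations: int = 20,
    max_exemplars: int = 5,
) -> List[str]:
    """Feedback-loop in-context red teaming, greatly simplified.

    Keeps a small set of exemplar attack prompts in context, asks the
    attacker LM for a new candidate, tests it against the target, and
    greedily retains the candidates that best trigger unsafe generations.
    """
    exemplars = [(p, 0.0) for p in seed_prompts]
    successful: List[str] = []
    for _ in range(iterations):
        context = "\n".join(p for p, _ in exemplars[:max_exemplars])
        candidate = attacker_lm(
            "Here are example prompts:\n" + context + "\nWrite another prompt:"
        )
        score = unsafe_score(target_model(candidate))
        if score > 0.5:
            successful.append(candidate)
        # feedback step: keep the highest-scoring prompts as the next in-context examples
        exemplars.append((candidate, score))
        exemplars.sort(key=lambda pair: pair[1], reverse=True)
        exemplars = exemplars[:max_exemplars]
    return successful
```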


Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

arXiv.org Artificial Intelligence

Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benchmark dataset covering different types of ambiguities that occur in these systems. We then propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with human intention in the presence of ambiguities.
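
A minimal sketch of a clarify-then-rewrite flow of this kind, with the LLM and the user interaction abstracted as callables; the prompts and the single-question loop are illustrative assumptions, not the paper's framework.

```python
from typing import Callable

def disambiguate_prompt(
    prompt: str,
    llm: Callable[[str], str],          # any instruction-following LLM
    ask_user: Callable[[str], str],     # routes a clarifying question to the user
) -> str:
    """Ask one clarifying question if the image prompt looks ambiguous,
    then rewrite the prompt using the user's answer."""
    question = llm(
        "If the following image prompt is ambiguous, ask one short clarifying "
        f"question; otherwise reply NONE.\nPrompt: {prompt}"
    )
    if question.strip().upper() == "NONE":
        return prompt
    answer = ask_user(question)
    return llm(
        f"Rewrite the image prompt '{prompt}' so it is unambiguous, "
        f"given this clarification: {answer}"
    )
```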


OPP-Miner: Order-preserving sequential pattern mining

arXiv.org Artificial Intelligence

A time series is a collection of measurements in chronological order. Discovering patterns from time series is useful in many domains, such as stock analysis, disease detection, and weather forecasting. To discover patterns, existing methods often convert time series data into another form, such as a nominal/symbolic format, to reduce dimensionality, which inevitably distorts the data values. Moreover, existing methods largely neglect the order relationships between time series values. To tackle these issues, inspired by order-preserving matching, this paper proposes an Order-Preserving sequential Pattern (OPP) mining method, which represents patterns based on the order relationships of the time series data. An inherent advantage of such a representation is that the trend of a time series can be captured by the relative order of its values. To obtain frequent trends in time series, we propose the OPP-Miner algorithm to mine patterns with the same trend (sub-sequences with the same relative order). OPP-Miner employs filtration and verification strategies to calculate the support and uses a pattern fusion strategy to generate candidate patterns. To compress the result set, we also study finding the maximal OPPs. Experiments validate that OPP-Miner is not only efficient and scalable but can also discover similar sub-sequences in time series. In addition, case studies show that our algorithms have high utility in analyzing the COVID-19 epidemic, identifying critical trends and improving clustering performance.
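
The core representation, replacing a sub-sequence by the relative order (ranks) of its values, can be sketched as follows; the naive support count shown here is only the baseline idea that OPP-Miner's filtration/verification and pattern fusion strategies accelerate.

```python
from collections import Counter
from typing import List, Tuple

def relative_order(window: List[float]) -> Tuple[int, ...]:
    """Rank pattern of a window: each value is replaced by its rank
    (1 = smallest), so only the relative order, i.e. the trend shape, is kept."""
    ranks = sorted(range(len(window)), key=lambda i: window[i])
    pattern = [0] * len(window)
    for rank, idx in enumerate(ranks, start=1):
        pattern[idx] = rank
    return tuple(pattern)

def pattern_support(series: List[float], length: int) -> Counter:
    """Count how often each order-preserving pattern of the given length
    occurs among the sub-sequences of the series (naive support count)."""
    return Counter(
        relative_order(series[i:i + length])
        for i in range(len(series) - length + 1)
    )

series = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0]
print(pattern_support(series, 3))  # e.g. the "up-down" trend (1, 3, 2) appears twice
```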


Attribute reduction and rule acquisition of formal decision context based on two new kinds of decision rules

arXiv.org Artificial Intelligence

This paper mainly studies rule acquisition and attribute reduction for formal decision contexts based on two new kinds of decision rules, namely I-decision rules and II-decision rules. The premises of these rules are object-oriented concepts, and the conclusions are formal concepts and property-oriented concepts, respectively. Rule acquisition algorithms for I-decision rules and II-decision rules are presented. A comparative analysis of these algorithms against existing algorithms shows that the algorithms presented in this study behave well. Attribute reduction approaches that preserve I-decision rules and II-decision rules are presented using discernibility matrices.
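
For readers unfamiliar with formal contexts, the sketch below shows the classical derivation operators on a toy context; note that the object-oriented and property-oriented concepts behind the paper's I- and II-decision rules are defined with different (modal-style) operators, which are not reproduced here.

```python
from typing import Dict, FrozenSet, Set

# A toy formal context: objects mapped to the attributes they possess.
CONTEXT: Dict[str, Set[str]] = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}

def common_attributes(objects: Set[str]) -> FrozenSet[str]:
    """Attributes shared by every object in the set (the ' operator on objects)."""
    if not objects:
        return frozenset(set.union(*CONTEXT.values()))
    return frozenset(set.intersection(*(CONTEXT[o] for o in objects)))

def common_objects(attributes: Set[str]) -> FrozenSet[str]:
    """Objects possessing every attribute in the set (the ' operator on attributes)."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)

A = {"o1", "o3"}
B = common_attributes(A)          # {'a', 'b'}
print(B, common_objects(B))       # (A, B) is a formal concept iff this returns A
```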