Wu, Han
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yao, Yuxuan, Wu, Han, Liu, Mingyang, Luo, Sichun, Han, Xiongwei, Liu, Jie, Guo, Zhijiang, Song, Linqi
Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks, prompting recent studies to explore the benefits of ensembling models to leverage their complementary advantages. However, existing LLM ensembling methods often overlook model compatibility and struggle with the inefficient alignment of probabilities across the entire vocabulary. In this study, we empirically investigate the factors influencing ensemble performance, identifying model performance, vocabulary size, and response style as key determinants, and revealing that compatibility among models is essential for effective ensembling. This analysis leads to the development of a simple yet effective model selection strategy that identifies compatible models. Building on this, we introduce Union Top-k Ensembling (UniTE), a novel approach that efficiently combines models by focusing on the union of the top-k tokens from each model, thereby avoiding the need for full vocabulary alignment and reducing computational overhead. UniTE significantly enhances performance compared to existing methods, offering a more efficient framework for LLM ensembling.

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks and have shown promising results in real-world applications (OpenAI, 2023; Yang et al., 2024; Dubey et al., 2024). Given the diversity in data sources, model architectures, and training methods, LLMs exhibit varying strengths and weaknesses depending on the task at hand. Consequently, rather than relying solely on training an LLM from scratch, an alternative approach is to create an ensemble of LLMs. This method allows for leveraging the complementary advantages of different LLMs (Jiang et al., 2023b; Lu et al., 2024; Yu et al., 2024b). Existing model ensembling methods can be broadly categorized into three types: output-level, probability-level, and training-level approaches.
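As an illustration only (not the paper's implementation), one decoding step of the top-k-union idea can be sketched as follows. The per-model token distributions, the plain averaging rule, and k=2 are assumptions for the example; the point is that probabilities are combined only over the union of each model's top-k tokens, so no full-vocabulary alignment is needed.

```python
def top_k(dist, k):
    """Return the k highest-probability tokens of a distribution."""
    return sorted(dist, key=dist.get, reverse=True)[:k]

def union_topk_step(dists, k=2):
    """Sketch of union-of-top-k ensembling for one decoding step.

    `dists` maps each model name to a token -> probability dict
    (vocabularies may differ across models). Instead of aligning
    full vocabularies, we average probability mass only over the
    union of each model's top-k tokens, then renormalize.
    """
    union = set()
    for dist in dists.values():
        union.update(top_k(dist, k))
    # Average each model's probability on the union tokens only;
    # tokens outside a model's vocabulary contribute 0.
    scores = {t: sum(d.get(t, 0.0) for d in dists.values()) / len(dists)
              for t in union}
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}

# Toy distributions for two hypothetical models.
model_a = {"Paris": 0.6, "London": 0.2, "Rome": 0.1, "Berlin": 0.1}
model_b = {"Paris": 0.5, "Rome": 0.3, "Madrid": 0.15, "London": 0.05}
ensemble = union_topk_step({"a": model_a, "b": model_b}, k=2)
best = max(ensemble, key=ensemble.get)
```

With k=2, only {Paris, London, Rome} enter the ensemble; low-ranked tokens like "Madrid" never need to be aligned or scored.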
Automatic deductive coding in discourse analysis: an application of large language models in learning analytics
Zhang, Lishan, Wu, Han, Huang, Xiaoshan, Duan, Tengfei, Du, Hanxiang
Deductive coding is a common discourse analysis method widely used by learning science and learning analytics researchers for understanding teaching and learning interactions. It often requires researchers to manually label all discourses to be analyzed according to a theoretically guided coding scheme, which is time-consuming and labor-intensive. The emergence of large language models such as GPT has opened a new avenue for automatic deductive coding that overcomes the limitations of traditional deductive coding. To evaluate the usefulness of large language models in automatic deductive coding, we employed three different classification methods driven by different artificial intelligence technologies: the traditional text classification method with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. By analyzing and comparing the accuracy and Kappa values of these three classification methods, we found that GPT with prompt engineering outperformed the other two methods on both datasets with a limited number of training samples. By providing detailed prompt structures, the reported work demonstrates how large language models can be used in the implementation of automatic deductive coding.
Beyond Statistical Estimation: Differentially Private Individual Computation via Shuffling
Wang, Shaowei, Dong, Changyu, Song, Xiangfu, Li, Jin, Zhou, Zhili, Wang, Di, Wu, Han
In data-driven applications, preserving user privacy while enabling valuable computations remains a critical challenge. Technologies like Differential Privacy (DP) have been pivotal in addressing these concerns. The shuffle model of DP requires no trusted curators and can achieve high utility by leveraging the privacy amplification effect yielded from shuffling. These benefits have led to significant interest in the shuffle model. However, the computation tasks in the shuffle model are limited to statistical estimation, making the shuffle model inapplicable to real-world scenarios in which each user requires a personalized output. This paper introduces a novel paradigm termed Private Individual Computation (PIC), expanding the shuffle model to support a broader range of permutation-equivariant computations. PIC enables personalized outputs while preserving privacy, and enjoys privacy amplification through shuffling. We propose a concrete protocol that realizes PIC. By using one-time public keys, our protocol enables users to receive their outputs without compromising anonymity, which is essential for privacy amplification. Additionally, we present an optimal randomizer, the Minkowski Response, designed for the PIC model to enhance utility. We formally prove the security and privacy properties of the PIC protocol. Theoretical analysis and empirical evaluations demonstrate PIC's capability in handling non-statistical computation tasks, and the efficacy of PIC and the Minkowski randomizer in achieving superior utility compared to existing solutions.
MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
Wang, Shun, Zhang, Ge, Wu, Han, Loakman, Tyler, Huang, Wenhao, Lin, Chenghua
Machine Translation (MT) has developed rapidly since the release of Large Language Models, and current MT evaluation is performed through comparison with reference human translations or by predicting quality scores from human-labeled data. However, these mainstream evaluation methods mainly focus on fluency and factual reliability, whilst paying little attention to figurative quality. In this paper, we investigate the figurative quality of MT and propose a set of human evaluation metrics focused on the translation of figurative language. We additionally present a multilingual parallel metaphor corpus generated by post-editing. Our evaluation protocol is designed to estimate four aspects of MT: Metaphorical Equivalence, Emotion, Authenticity, and Quality. In doing so, we observe that translations of figurative expressions display different traits from literal ones.
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner
Yao, Yuxuan, Wu, Han, Guo, Zhijiang, Zhou, Biyan, Gao, Jiahui, Luo, Sichun, Hou, Hanxu, Fu, Xiaojin, Song, Linqi
Large language models (LLMs) have demonstrated outstanding performance across various tasks, yet they still exhibit limitations such as hallucination, unfaithful reasoning, and toxic content. One potential approach to mitigate these issues is learning from human or external feedback (e.g., tools). In this paper, we introduce an intrinsic self-correct reasoning framework for LLMs that eliminates the need for human feedback, external tools, and handcrafted prompts. The proposed framework, based on a multi-step reasoning paradigm \textbf{Le}arning from \textbf{Co}rrectness (\textsc{LeCo}), improves reasoning performance without needing to learn from errors. This paradigm prioritizes learning from correct reasoning steps and introduces a unique method to measure the confidence of each reasoning step based on generation logits. Experimental results across various multi-step reasoning tasks demonstrate the effectiveness of the framework in improving reasoning performance with reduced token consumption.
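A minimal sketch of the logit-based step-confidence idea, under stated assumptions: the paper's exact confidence measure is not reproduced here; instead, as a simple proxy, a step's confidence is the average softmax probability of the chosen (argmax) token over that step's generated tokens, and the least-confident step marks where re-generation would start.

```python
import math

def step_confidence(token_logits):
    """Hypothetical confidence score for one reasoning step.

    `token_logits` is a list of per-token logit vectors (lists of
    floats) for the tokens generated within the step. We average,
    over tokens, the softmax probability of the argmax token --
    a simple proxy for logit-based step confidence.
    """
    probs = []
    for logits in token_logits:
        m = max(logits)  # subtract max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        probs.append(max(exps) / sum(exps))
    return sum(probs) / len(probs)

def least_confident_step(steps):
    """Index of the lowest-confidence step: the earliest point to
    re-generate from, rather than learning from explicit errors."""
    scores = [step_confidence(s) for s in steps]
    return scores.index(min(scores))

# Two toy steps of one token each: a sharp distribution (confident)
# and a nearly flat one (uncertain).
steps = [[[5.0, 0.1]], [[0.2, 0.1]]]
weakest = least_confident_step(steps)
```

Here the flat-logit step is correctly flagged as the weakest link in the reasoning chain.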
Adversarial Detection: Attacking Object Detection in Real Time
Wu, Han, Yunas, Syed, Rowlands, Sareh, Ruan, Wenjie, Wahlstrom, Johan
Intelligent robots rely on object detection models to perceive the environment. Following advances in deep learning security, it has been revealed that object detection models are vulnerable to adversarial attacks. However, prior research has primarily focused on attacking static images or offline videos. Therefore, it is still unclear whether such attacks could jeopardize real-world robotic applications in dynamic environments. This paper bridges this gap by presenting the first real-time online attack against object detection models. We devise three attacks that fabricate bounding boxes for nonexistent objects at desired locations. The attacks achieve a success rate of about 90% within about 20 iterations. The demo video is available at https://youtu.be/zJZ1aNlXsMU.
Fine-grained Conversational Decoding via Isotropic and Proximal Search
Yao, Yuxuan, Wu, Han, Xu, Qiling, Song, Linqi
General-purpose text decoding approaches are usually adopted for dialogue response generation. Although the quality of the generated responses can be improved with dialogue-specific encoding methods, conversational decoding methods remain under-explored. Inspired by \citet{wu2023learning}, who show that a good dialogue feature space should follow the rules of locality and isotropy, we present a fine-grained conversational decoding method, termed \textit{isotropic and proximal search (IPS)}. Our method is designed to generate semantically concentrated responses while still maintaining informativeness and discrimination against the context. Experiments show that our approach outperforms existing decoding strategies in the dialogue field across both automatic and human evaluation metrics. More in-depth analyses further confirm the effectiveness of our approach.
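To make the decoding trade-off concrete, here is a hedged, contrastive-search-style sketch (not the paper's exact scoring rule): each candidate token is rewarded for its model probability (proximity) and penalized for representation similarity to already-generated tokens (isotropy). The `alpha` weighting, the candidate format, and the use of mean cosine similarity are all assumptions for illustration.

```python
def ips_score(candidates, history, alpha=0.6):
    """Pick the candidate balancing likelihood against redundancy.

    `candidates`: list of (probability, embedding) pairs for the
    top candidate tokens at this step.
    `history`: embeddings of tokens generated so far.
    Returns the index of the best-scoring candidate.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = sum(a * a for a in u) ** 0.5
        nv = sum(b * b for b in v) ** 0.5
        return dot / (nu * nv)

    def score(prob, emb):
        if not history:
            return prob
        # Mean similarity to past tokens: high means redundant,
        # anisotropic continuations, so it is subtracted.
        sim = sum(cos(emb, h) for h in history) / len(history)
        return (1 - alpha) * prob - alpha * sim

    return max(range(len(candidates)),
               key=lambda i: score(*candidates[i]))

# The likelier candidate points in the same direction as the
# history, so the dissimilar runner-up wins.
cands = [(0.5, [1.0, 0.0]), (0.4, [0.0, 1.0])]
chosen = ips_score(cands, history=[[1.0, 0.0]])
```

With an empty history the rule degenerates to greedy selection, which matches the intuition that isotropy pressure only applies once context exists.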
Reconstruct Before Summarize: An Efficient Two-Step Framework for Condensing and Summarizing Meeting Transcripts
Tan, Haochen, Wu, Han, Shao, Wei, Zhang, Xinyun, Zhan, Mingjie, Hou, Zhaohui, Liang, Ding, Song, Linqi
Although numerous achievements have been made in well-structured text abstractive summarization (Zhang et al., 2020a; Liu* et al., 2018; Lewis et al., 2020), research on meeting summarization is still limited. There are some outstanding challenges in this field, including 1) much noise brought by automated speech recognition models; and 2) lengthy meeting transcripts consisting of casual conversations, content redundancy, and scattered information. Based on this understanding, we propose a two-step meeting summarization framework, Reconstruct before Summarize (RbS), to address the challenge of scattered information in meetings. RbS adopts a reconstructor to reconstruct the responses in the meeting; it also synchronically traces out which texts in the meeting drove the responses and marks them as essential content. Therefore, salient information is captured and annotated as anchor tokens in RbS.
Towards Versatile and Efficient Visual Knowledge Integration into Pre-trained Language Models with Cross-Modal Adapters
Zhang, Xinyun, Tan, Haochen, Wu, Han, Zhan, Mingjie, Liang, Ding, Yu, Bei
Humans learn language via multi-modal knowledge. However, due to the text-only pre-training scheme, most existing pre-trained language models (PLMs) cannot benefit from multi-modal information. To inject visual knowledge into PLMs, existing methods incorporate either the text or image encoder of vision-language models (VLMs) to encode the visual information and update all the original parameters of PLMs for knowledge fusion. In this paper, we propose a new plug-and-play module, X-adapter, to flexibly leverage the aligned visual and textual knowledge learned in pre-trained VLMs and efficiently inject it into PLMs. Specifically, we insert X-adapters into PLMs, and only the added parameters are updated during adaptation. To fully exploit the potential of VLMs, X-adapters consist of two sub-modules, V-expert and T-expert, to fuse VLMs' image and text representations, respectively. Different sub-modules can be activated depending on the downstream task. Experimental results show that our method can significantly improve performance on object-color reasoning and natural language understanding (NLU) tasks compared with PLM baselines.
Distributed Black-box Attack against Image Classification Cloud Services
Wu, Han, Rowlands, Sareh, Wahlstrom, Johan
Black-box adversarial attacks can fool image classifiers into misclassifying images without requiring access to the model's structure or weights. Recent studies have reported attack success rates of over 95% with fewer than 1,000 queries. The question then arises of whether black-box attacks have become a real threat against IoT devices that rely on cloud APIs for image classification. To shed light on this, we note that prior research has primarily focused on increasing the success rate and reducing the number of queries. However, another crucial factor for black-box attacks against cloud APIs is the time required to perform the attack. This paper applies black-box attacks directly to cloud APIs rather than to local models, thereby avoiding mistakes made in prior research that applied the perturbation before image encoding and pre-processing. Further, we exploit load balancing to enable distributed black-box attacks that can reduce the attack time by a factor of about five for both local search and gradient estimation methods.
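The load-balancing idea can be sketched as below, with the caveat that `query_fn`, the endpoint list, and round-robin assignment are hypothetical stand-ins (real providers hide replica selection behind one URL): a batch of candidate queries for one attack iteration is fanned out in parallel, so wall-clock time per iteration approaches the latency of a single round-trip rather than the sum of all of them.

```python
from concurrent.futures import ThreadPoolExecutor

def batched_queries(candidates, endpoints, query_fn):
    """Fan one iteration's candidate images out across several
    load-balanced cloud API replicas in parallel.

    `query_fn(endpoint, image)` is a hypothetical stand-in for one
    round-trip to a cloud classification API. Results are returned
    in the same order as `candidates`, which sequential local-search
    and gradient-estimation loops both rely on.
    """
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        futures = [
            # Round-robin assignment of candidates to replicas.
            pool.submit(query_fn, endpoints[i % len(endpoints)], img)
            for i, img in enumerate(candidates)
        ]
        return [f.result() for f in futures]

# Toy stand-in API: returns which replica served the call and a
# fake "score" (here just the pixel sum).
def fake_api(endpoint, image):
    return (endpoint, sum(image))

results = batched_queries([[1, 2], [3], [4, 5]],
                          ["replica-0", "replica-1"], fake_api)
```

Because the futures are collected in submission order, the caller sees the same candidate-to-result mapping as a purely sequential attack, only faster.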