AITopics | Jia, Robin

Collaborating Authors

Jia, Robin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge

Cui, Xinyue, Wei, Johnny Tian-Zheng, Swayamdipta, Swabha, Jia, Robin

arXiv.org Artificial IntelligenceMar-11-2025

Data watermarking in language models injects traceable signals, such as specific token sequences or stylistic patterns, into copyrighted text, allowing copyright holders to track and verify training data ownership. Previous data watermarking techniques primarily focus on effective memorization after pretraining, while overlooking challenges that arise in other stages of the LLM pipeline, such as the risk of watermark filtering during data preprocessing, or potential forgetting through post-training, or verification difficulties due to API-only access. We propose a novel data watermarking approach that injects coherent and plausible yet fictitious knowledge into training data using generated passages describing a fictitious entity and its associated attributes. Our watermarks are designed to be memorized by the LLM through seamlessly integrating in its training data, making them harder to detect lexically during preprocessing. We demonstrate that our watermarks can be effectively memorized by LLMs, and that increasing our watermarks' density, length, and diversity of attributes strengthens their memorization. We further show that our watermarks remain robust throughout LLM development, maintaining their effectiveness after continual pretraining and supervised finetuning. Finally, we show that our data watermarks can be evaluated even under API-only access via question answering.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.04036

Country:

Asia (0.46)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries

Yan, Tianyi Lorena, Jia, Robin

arXiv.org Artificial IntelligenceMar-5-2025

To answer one-to-many factual queries (e.g., listing cities of a country), a language model (LM) must simultaneously recall knowledge and avoid repeating previous answers. How are these two subtasks implemented and integrated internally? Across multiple datasets and models, we identify a promote-then-suppress mechanism: the model first recalls all answers, and then suppresses previously generated ones. Specifically, LMs use both the subject and previous answer tokens to perform knowledge recall, with attention propagating subject information and MLPs promoting the answers. Then, attention attends to and suppresses previous Figure 1: To answer one-to-many factual queries, we answer tokens, while MLPs amplify the found that LMs first use attention to propagate subject suppression signal. Our mechanism is corroborated information to the last token, which is used by MLPs by extensive experimental evidence: in to promote all possible answers. Attention then attends addition to using early decoding and causal to and suppresses the subject and previous answer tokens, tracing, we analyze how components use different while MLPs amplify the suppression and further tokens by introducing both Token Lens, promote new answers.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.20475

Country:

Asia > China (0.46)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Interrogating LLM design under a fair learning doctrine

Wei, Johnny Tian-Zheng, Wang, Maggie, Godbole, Ameya, Choi, Jonathan H., Jia, Robin

arXiv.org Artificial IntelligenceFeb-22-2025

The current discourse on large language models (LLMs) and copyright largely takes a "behavioral" perspective, focusing on model outputs and evaluating whether they are substantially similar to training data. However, substantial similarity is difficult to define algorithmically and a narrow focus on model outputs is insufficient to address all copyright risks. In this interdisciplinary work, we take a complementary "structural" perspective and shift our focus to how LLMs are trained. We operationalize a notion of "fair learning" by measuring whether any training decision substantially affected the model's memorization. As a case study, we deconstruct Pythia, an open-source LLM, and demonstrate the use of causal and correlational analyses to make factual determinations about Pythia's training decisions. By proposing a legal standard for fair learning and connecting memorization analyses to this standard, we identify how judges may advance the goals of copyright law through adjudication. Finally, we discuss how a fair learning standard might evolve to enhance its clarity by becoming more rule-like and incorporating external technical guidelines.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2502.1629

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology (1.00)
Law > Litigation (0.93)
Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

FoNE: Precise Single-Token Number Embeddings via Fourier Features

Zhou, Tianyi, Fu, Deqing, Soltanolkotabi, Mahdi, Jia, Robin, Sharan, Vatsal

arXiv.org Artificial IntelligenceFeb-13-2025

Large Language Models (LLMs) typically represent numbers using multiple tokens, which requires the model to aggregate these tokens to interpret numerical values. This fragmentation makes both training and inference less efficient and adversely affects the model's performance on number-related tasks. Inspired by the observation that pre-trained LLMs internally learn Fourier-like features for number tokens, we propose Fourier Number Embedding (FoNE), a novel method that directly maps numbers into the embedding space with their Fourier features. FoNE encodes each number as a single token with only two embedding dimensions per digit, effectively capturing numerical values without fragmentation. This compact representation accelerates both training and inference. Compared to traditional subword and digit-wise embeddings, FoNE not only reduces computational overhead but also achieves higher accuracy across various numerical tasks including addition, subtraction and multiplication. On 6-digit decimal addition, FoNE requires 64$\times$ less data to achieve 99% accuracy than subword and digit-wise embeddings while using 3$\times$ and 6$\times$ fewer tokens per number, respectively. Furthermore, FoNE is the only method that yields 100% accuracy on over 100,000 test examples for addition, subtraction, and multiplication. The codes and visualization are available at https://fouriernumber.github.io/.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.09741

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Mechanistic Interpretability of Emotion Inference in Large Language Models

Tak, Ala N., Banayeeanzade, Amin, Bolourani, Anahita, Kian, Mina, Jia, Robin, Gratch, Jonathan

arXiv.org Artificial IntelligenceFeb-8-2025

Large language models (LLMs) show promising capabilities in predicting human emotions from text. However, the mechanisms through which these models process emotional stimuli remain largely unexplored. Our study addresses this gap by investigating how autoregressive LLMs infer emotions, showing that emotion representations are functionally localized to specific regions in the model. Our evaluation includes diverse model families and sizes and is supported by robustness checks. We then show that the identified representations are psychologically plausible by drawing on cognitive appraisal theory, a well-established psychological framework positing that emotions emerge from evaluations (appraisals) of environmental stimuli. By causally intervening on construed appraisal concepts, we steer the generation and show that the outputs align with theoretical and intuitive expectations. This work highlights a novel way to causally intervene and precisely shape emotional text generation, potentially benefiting safety and alignment in sensitive affective domains.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.05489

Country:

Europe (1.00)
Asia (0.92)
North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics

Godbole, Ameya, Jia, Robin

arXiv.org Artificial IntelligenceJan-30-2025

Improvements in large language models have led to increasing optimism that they can serve as reliable evaluators of natural language generation outputs. In this paper, we challenge this optimism by thoroughly re-evaluating five state-of-the-art factuality metrics on a collection of 11 datasets for summarization, retrieval-augmented generation, and question answering. We find that these evaluators are inconsistent with each other and often misestimate system-level performance, both of which can lead to a variety of pitfalls. We further show that these metrics exhibit biases against highly paraphrased outputs and outputs that draw upon faraway parts of the source documents. We urge users of these factuality metrics to proceed with caution and manually validate the reliability of these metrics in their domain of interest before proceeding.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.14883

Country:

Europe (0.92)
North America > Mexico > Mexico City (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

The statistical advantage of automatic NLG metrics at the system level

Wei, Johnny Tian-Zheng, Jia, Robin

arXiv.org Artificial IntelligenceDec-13-2024

Estimating the expected output quality of generation systems is central to NLG. This paper qualifies the notion that automatic metrics are not as good as humans in estimating system-level quality. Statistically, humans are unbiased, high variance estimators, while metrics are biased, low variance estimators. We compare these estimators by their error in pairwise prediction (which generation system is better?) using the bootstrap. Measuring this error is complicated: predictions are evaluated against noisy, human predicted labels instead of the ground truth, and metric predictions fluctuate based on the test sets they were calculated on. By applying a bias-variance-noise decomposition, we adjust this error to a noise-free, infinite test set setting. Our analysis compares the adjusted error of metrics to humans and a derived, perfect segment-level annotator, both of which are unbiased estimators dependent on the number of judgments collected. In MT, we identify two settings where metrics outperform humans due to a statistical advantage in variance: when the number of human judgments used is small, and when the quality difference between compared systems is small. The data and code to reproduce our analyses are available at https://github.com/johntzwei/metric-statistical-advantage .

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2105.12437

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.29)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Fu, Deqing, Xiao, Tong, Wang, Rui, Zhu, Wang, Zhang, Pengchuan, Pang, Guan, Jia, Robin, Chen, Lawrence

arXiv.org Artificial IntelligenceOct-7-2024

Although reward models have been successful in improving multimodal large language models, the reward models themselves remain brutal and contain minimal information. Notably, existing reward models only mimic human annotations by assigning only one binary feedback to any text, no matter how long the text is. In the realm of multimodal language models, where models are required to process both images and texts, a naive reward model may learn implicit biases toward texts and become less grounded in images. In this paper, we propose a $\textbf{T}$oken-$\textbf{L}$evel $\textbf{D}$etective $\textbf{R}$eward Model ($\textbf{TLDR}$) to provide fine-grained annotations to each text token. We first introduce a perturbation-based method to generate synthetic hard negatives and their token-level labels to train TLDR models. Then we show the rich usefulness of TLDR models both in assisting off-the-shelf models to self-correct their generations, and in serving as a hallucination evaluation tool. Finally, we show that TLDR models can significantly speed up human annotation by 3 times to acquire a broader range of high-quality vision language data.

caption, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.04734

Country:

Asia (0.67)
Europe (0.46)
North America > United States > California (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

Chang, Ting-Yun, Thomason, Jesse, Jia, Robin

arXiv.org Artificial IntelligenceJun-24-2024

This paper studies in-context learning (ICL) by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that always predict the same label. We find that component accuracies are well-correlated across different demonstration sets and perturbations of prompt templates, even when the full-model accuracy varies greatly. Based on our findings, we propose component reweighting, which learns to linearly re-scale the component activations from a few labeled examples. Given 24 labeled examples, our method improves by an average of 6.0% accuracy points over 24-shot ICL across 8 tasks on Llama-2-7B. Overall, this paper both enriches our understanding of ICL and provides a practical method for improvement Figure 1: Each dot represents a component (attention by examining model internals.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2406.13131

Country:

Asia (0.68)
Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(2 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Proving membership in LLM pretraining data via data watermarks

Wei, Johnny Tian-Zheng, Wang, Ryan Yixiang, Jia, Robin

arXiv.org Artificial IntelligenceJun-10-2024

Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem. This work proposes using data watermarks to enable principled detection with only black-box model access, provided that the rightholder contributed multiple training documents and watermarked them before public release. By applying a randomly sampled data watermark, detection can be framed as hypothesis testing, which provides guarantees on the false detection rate. We study two watermarks: one that inserts random sequences, and another that randomly substitutes characters with Unicode lookalikes. We first show how three aspects of watermark design -- watermark length, number of duplications, and interference -- affect the power of the hypothesis test. Next, we study how a watermark's detection strength changes under model and dataset scaling: while increasing the dataset size decreases the strength of the watermark, watermarks remain strong if the model size also increases. Finally, we view SHA hashes as natural watermarks and show that we can robustly detect hashes from BLOOM-176B's training data, as long as they occurred at least 90 times. Together, our results point towards a promising future for data watermarks in real world use.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2402.10892

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback