AITopics | Li, Jinfeng

Li, Jinfeng

fairBERTs: Erasing Sensitive Information Through Semantic and Fairness-aware Perturbations

Li, Jinfeng, Chen, Yuefeng, Liu, Xiangyu, Huang, Longtao, Zhang, Rong, Xue, Hui

arXiv.org Artificial Intelligence, Jul-11-2024

Pre-trained language models (PLMs) have revolutionized both natural language processing research and applications. However, stereotypical biases (e.g., gender and racial discrimination) encoded in PLMs raise ethical concerns that critically limit their broader application. To address these unfairness issues, we present fairBERTs, a general framework for learning fair fine-tuned BERT-series models by erasing protected sensitive information via semantic and fairness-aware perturbations generated by a generative adversarial network. Through extensive qualitative and quantitative experiments on two real-world tasks, we demonstrate the superiority of fairBERTs in mitigating unfairness while maintaining model utility. We also verify the feasibility of transferring the adversarial components in fairBERTs to other conventionally trained BERT-like models to yield fairness improvements. Our findings may shed light on further research on building fairer fine-tuned PLMs.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2407.08189

Country: North America > United States (0.30)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

Yuan, Xiaohan, Li, Jinfeng, Wang, Dongxia, Chen, Yuefeng, Mao, Xiaofeng, Huang, Longtao, Xue, Hui, Wang, Wenhai, Ren, Kui, Wang, Jingyi

arXiv.org Artificial Intelligence, May-28-2024

Large Language Models (LLMs) have gained considerable attention for their revolutionary capabilities. However, there is also growing concern about their safety implications, making a comprehensive safety evaluation of LLMs urgently needed before model deployment. In this work, we propose S-Eval, a new comprehensive, multi-dimensional and open-ended safety evaluation benchmark. At the core of S-Eval is a novel LLM-based automatic test prompt generation and selection framework, which trains an expert testing LLM $M_t$ combined with a range of test selection strategies to automatically construct a high-quality test suite for the safety evaluation. The key to automating this process is a novel expert safety-critique LLM $M_c$ that quantifies the riskiness score of an LLM's response and additionally produces risk tags and explanations. The generation process is also guided by a carefully designed risk taxonomy with four levels, covering comprehensive and multi-dimensional safety risks of concern. Based on these, we systematically construct a new, large-scale safety evaluation benchmark for LLMs consisting of 220,000 evaluation prompts, including 20,000 base risk prompts (10,000 in Chinese and 10,000 in English) and 200,000 corresponding attack prompts derived from 10 popular adversarial instruction attacks against LLMs. Moreover, considering the rapid evolution of LLMs and the accompanying safety threats, S-Eval can be flexibly configured and adapted to include new risks, attacks and models. S-Eval is extensively evaluated on 20 popular and representative LLMs. The results confirm that S-Eval can better reflect and inform the safety risks of LLMs compared to existing benchmarks. We also explore the impacts of parameter scales, language environments, and decoding parameters on the evaluation, providing a systematic methodology for evaluating the safety of LLMs.
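
The benchmark size quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that each base risk prompt yields exactly one attack prompt per attack method, which the abstract implies but does not state; the variable names are illustrative only.

```python
# Counts taken from the abstract.
base_prompts_zh = 10_000
base_prompts_en = 10_000
num_attacks = 10

base_prompts = base_prompts_zh + base_prompts_en

# Assumption: one attack prompt per (base prompt, attack method) pair.
attack_prompts = base_prompts * num_attacks
total_prompts = base_prompts + attack_prompts

print(base_prompts, attack_prompts, total_prompts)
```

Under that assumption the three figures in the abstract (20,000 base prompts, 200,000 attack prompts, 220,000 in total) are mutually consistent.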

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2405.14191

Country:

North America > United States (0.30)
Asia (0.29)

Genre: Research Report > New Finding (0.88)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Law > Criminal Law (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FairRec: Fairness Testing for Deep Recommender Systems

Guo, Huizhong, Li, Jinfeng, Wang, Jingyi, Liu, Xiangyu, Wang, Dongxia, Hu, Zehong, Zhang, Rong, Xue, Hui

arXiv.org Artificial Intelligence, Apr-14-2023

Deep learning-based recommender systems (DRSs) are increasingly and widely deployed in industry, bringing significant convenience to people's daily lives in different ways. However, recommender systems have also been shown to suffer from multiple issues, e.g., the echo chamber and the Matthew effect, in which the notion of "fairness" plays a core role. While many fairness notions and corresponding fairness testing approaches have been developed for traditional deep classification models, they are hardly applicable to DRSs. One major difficulty is that there is still no systematic understanding of, or mapping between, the existing fairness notions and the diverse testing requirements for deep recommender systems, not to mention further testing or debugging activities. To address this gap, we propose FairRec, a unified framework that supports fairness testing of DRSs from multiple customized perspectives, e.g., model utility, item diversity, item popularity, etc. We also propose a novel, efficient search-based testing approach to tackle the new challenge, i.e., a double-ended discrete particle swarm optimization (DPSO) algorithm, to effectively search for hidden fairness issues in the form of certain disadvantaged groups from a vast number of candidate groups. Given the testing report, we show that adopting a simple re-ranking mitigation strategy on the identified disadvantaged groups can significantly improve the fairness of DRSs. We conducted extensive experiments on multiple industry-level DRSs adopted by leading companies. The results confirm that FairRec is effective and efficient in identifying deeply hidden fairness issues, e.g., achieving 95% testing accuracy in half to 1/8 of the time.

evolutionary algorithm, fairrec, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2304.0703

Country:

North America > United States (0.29)
Asia > China (0.29)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

CAINNFlow: Convolutional block Attention modules and Invertible Neural Networks Flow for anomaly detection and localization tasks

Yan, Ruiqing, Zhang, Fan, Huang, Mengyuan, Liu, Wu, Hu, Dongyu, Li, Jinfeng, Liu, Qiang, Jiang, Jinrong, Guo, Qianjin, Zheng, Linghan

arXiv.org Artificial Intelligence, Dec-15-2022

Detecting object anomalies is crucial in industrial processes, and unsupervised anomaly detection and localization is particularly important because large numbers of defective samples are difficult to obtain and the types of anomalies encountered in practice are unpredictable. Among existing unsupervised anomaly detection and localization methods, normalizing flow (NF)-based schemes have achieved better results. However, the two subnets (complex functions) $s_{i}(u_{i})$ and $t_{i}(u_{i})$ in NF are usually multilayer perceptrons, which require flattening the input visual features from 2D to 1D, destroying the spatial location relationships in the feature map and losing spatial structure information. To retain and effectively extract spatial structure information in the normalizing flow model, we design a complex function model with alternating CBAMs embedded in stacked $3\times3$ full convolutions. Extensive experimental results on the MVTec AD dataset show that CAINNFlow achieves advanced levels of accuracy and inference efficiency with CNN and Transformer backbone networks as feature extractors, reaching a pixel-level AUC of $98.64\%$ for anomaly detection on MVTec AD.
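
To make the role of the two subnets concrete, here is a minimal sketch of a generic affine coupling step of the kind commonly used in normalizing flows. It is not CAINNFlow's architecture: the abstract does not give the exact coupling form, and the functions `s` and `t` below are hypothetical elementwise stand-ins for the learned subnets (the paper replaces MLP subnets with convolutions and CBAM). The point is only that the step stays invertible whatever `s` and `t` compute.

```python
import math

def s(u):
    # Hypothetical scale subnet: a bounded elementwise function
    # standing in for a learned network.
    return [0.5 * math.tanh(a) for a in u]

def t(u):
    # Hypothetical translation subnet, also a stand-in.
    return [0.1 * a for a in u]

def coupling_forward(u1, u2):
    """Transform u2 conditioned on u1; u1 passes through unchanged."""
    scale, shift = s(u1), t(u1)
    v2 = [b * math.exp(sc) + sh for b, sc, sh in zip(u2, scale, shift)]
    # Log-determinant of the Jacobian is the sum of the log-scales.
    log_det = sum(scale)
    return u1, v2, log_det

def coupling_inverse(v1, v2):
    """Undo coupling_forward without inverting s or t themselves."""
    scale, shift = s(v1), t(v1)
    u2 = [(b - sh) * math.exp(-sc) for b, sc, sh in zip(v2, scale, shift)]
    return v1, u2

u1, u2 = [0.3, -1.2], [2.0, 0.5]
v1, v2, log_det = coupling_forward(u1, u2)
r1, r2 = coupling_inverse(v1, v2)
print(v2, log_det, r2)
```

Because the subnets only ever see the untouched half of the input, they never need to be inverted, which is what leaves room to swap in spatially aware subnets such as convolutions.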

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2206.01992

Country: Asia > China (0.48)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.46)
Information Technology (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

RoChBert: Towards Robust BERT Fine-tuning for Chinese

Zhang, Zihan, Li, Jinfeng, Shi, Ning, Yuan, Bo, Liu, Xiangyu, Zhang, Rong, Xue, Hui, Sun, Donghong, Zhang, Chao

arXiv.org Artificial Intelligence, Oct-28-2022

Despite their superb performance on a wide range of tasks, pre-trained language models (e.g., BERT) have been shown to be vulnerable to adversarial texts. In this paper, we present RoChBERT, a framework for building more robust BERT-based models by utilizing a more comprehensive adversarial graph to fuse Chinese phonetic and glyph features into pre-trained representations during fine-tuning. Inspired by curriculum learning, we further propose to augment the training dataset with adversarial texts in combination with intermediate samples. Extensive experiments demonstrate that RoChBERT outperforms previous methods in significant ways: (i) robust -- RoChBERT greatly improves model robustness without sacrificing accuracy on benign texts. Specifically, the defense lowers the success rates of unlimited and limited attacks by 59.43% and 39.33% respectively, while maintaining an accuracy of 93.30%; (ii) flexible -- RoChBERT can easily be extended to various language models to solve different downstream tasks with excellent performance; and (iii) efficient -- RoChBERT can be applied directly at the fine-tuning stage without pre-training a language model from scratch, and the proposed data augmentation method is also low-cost.

adversarial text, machine learning, rochbert, (17 more...)

arXiv.org Artificial Intelligence

2210.15944

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Voyageur: An Experiential Travel Search Engine

Evensen, Sara, Feng, Aaron, Halevy, Alon, Li, Jinfeng, Li, Vivian, Li, Yuliang, Liu, Huining, Mihaila, George, Morales, John, Nuno, Natalie, Pavlovic, Ekaterina, Tan, Wang-Chiew, Wang, Xiaolan

arXiv.org Artificial Intelligence, Mar-4-2019

We describe Voyageur, which is an application of experiential search to the domain of travel. Unlike traditional search engines for online services, experiential search focuses on the experiential aspects of the service under consideration. In particular, Voyageur needs to handle queries for subjective aspects of the service (e.g., quiet hotel, friendly staff) and combine these with objective attributes, such as price and location. Voyageur also highlights interesting facts and tips about the services the user is considering to provide them with further insights into their choices.

artificial intelligence, information management, voyageur, (18 more...)

arXiv.org Artificial Intelligence

1903.01498

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Genre: Research Report (0.64)

Industry: Consumer Products & Services > Travel (0.83)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)

Norm-Ranging LSH for Maximum Inner Product Search

Yan, Xiao, Li, Jinfeng, Dai, Xinyan, Chen, Hongzhi, Cheng, James

Neural Information Processing Systems, Dec-31-2018

Neyshabur and Srebro proposed SIMPLE-LSH, which is the state-of-the-art hashing-based algorithm for maximum inner product search (MIPS). We found that the performance of SIMPLE-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose NORM-RANGING LSH, which addresses the excessive normalization problem caused by long tails by partitioning a dataset into sub-datasets and building a hash index for each sub-dataset independently. We prove that NORM-RANGING LSH achieves lower query time complexity than SIMPLE-LSH under mild conditions. We also show that the idea of dataset partitioning can improve another hashing-based MIPS algorithm. Experiments show that NORM-RANGING LSH probes far fewer items than SIMPLE-LSH at the same recall, thus significantly benefiting MIPS-based applications.
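
The partition-then-index idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it uses the SIMPLE-LSH transform (scale by the maximum 2-norm, then append a coordinate that brings the vector to unit norm) with sign random projections as the hash, and it splits the data into equal-size ranges by norm rank, which is one plausible partitioning rule. All function names are hypothetical, and query processing across sub-indexes is omitted.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def simple_lsh_transform(x, max_norm):
    """Scale x by max_norm, then append sqrt(1 - ||x||^2) to reach unit norm."""
    scaled = [a / max_norm for a in x]
    residual = max(0.0, 1.0 - sum(a * a for a in scaled))
    return scaled + [math.sqrt(residual)]

def partition_by_norm(items, num_parts):
    """Assumed rule: sort by 2-norm and cut into ranges of near-equal size."""
    ranked = sorted(items, key=norm)
    size = math.ceil(len(ranked) / num_parts)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def sign_hash(v, planes):
    """Sign random projection: one bit per hyperplane."""
    return tuple(1 if sum(a * b for a, b in zip(v, p)) >= 0 else 0
                 for p in planes)

def build_index(items, num_parts, num_bits, seed=0):
    rng = random.Random(seed)
    dim = len(items[0]) + 1  # one extra coordinate from the transform
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(num_bits)]
    index = []
    for part in partition_by_norm(items, num_parts):
        # Normalize by the local maximum norm, not the global one.
        local_max = max(norm(x) for x in part) or 1.0
        table = {}
        for x in part:
            code = sign_hash(simple_lsh_transform(x, local_max), planes)
            table.setdefault(code, []).append(x)
        index.append((local_max, table))
    return planes, index

# Toy data with a long-tailed norm distribution.
items = [[0.1, 0.0], [0.0, 0.2], [3.0, 4.0], [6.0, 8.0]]
planes, index = build_index(items, num_parts=2, num_bits=8)

global_view = simple_lsh_transform([0.1, 0.0], 10.0)
local_view = simple_lsh_transform([0.1, 0.0], 0.2)
print(global_view, local_view)
```

With a single global maximum norm, the small vector is squashed until almost all of its mass sits in the appended coordinate, so its hash carries little information about its direction. Normalizing within its own norm range keeps the original coordinates at a useful scale, which is the excessive-normalization problem the abstract describes.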

artificial intelligence, dataset, information management, (20 more...)

Neural Information Processing Systems

Country:

Asia (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Information Management > Search (0.71)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Data Science (0.68)
(2 more...)

Norm-Ranging LSH for Maximum Inner Product Search

Yan, Xiao, Li, Jinfeng, Dai, Xinyan, Chen, Hongzhi, Cheng, James

arXiv.org Machine Learning, Oct-22-2018

Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art hashing method for maximum inner product search (MIPS) with a performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing-based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, we formulate a novel similarity metric. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH at the same recall, thus significantly benefiting applications that involve MIPS.

artificial intelligence, information management, lsh, (19 more...)

arXiv.org Machine Learning

1809.08782

Country:

Asia (0.14)
North America > Canada (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Information Management (0.85)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
