AITopics | Zhang, Ge

Collaborating Authors

Zhang, Ge

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RSC-VAE: Recoding Semantic Consistency Based VAE for One-Class Novelty Detection

Zhang, Ge, Du, Wangzhe

arXiv.org Artificial IntelligenceMay-7-2023

In recent years, there is an increasing interests in reconstruction based generative models for image One-Class Novelty Detection, most of which only focus on image-level information. While in this paper, we further exploit the latent space of Variational Auto-encoder (VAE), a typical reconstruction based model, and we innovatively divide it into three regions: Normal/Anomalous/Unknown-semantic-region. Based on this hypothesis, we propose a new VAE architecture, Recoding Semantic Consistency Based VAE (RSC-VAE), combining VAE with recoding mechanism and constraining the semantic consistency of two encodings. We come up with three training modes of RSC-VAE: 1. One-Class Training Mode, alleviating False Positive problem of normal samples; 2. Distributionally-Shifted Training Mode, alleviating False Negative problem of anomalous samples; 3. Extremely-Imbalanced Training Mode, introducing a small number of anomalous samples for training to enhance the second mode. The experimental results on multiple datasets demonstrate that our mechanism achieves state-of-the-art performance in various baselines including VAE.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.04275

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Chinese Open Instruction Generalist: A Preliminary Release

Zhang, Ge, Shi, Yemin, Liu, Ruibo, Yuan, Ruibin, Li, Yizhi, Dong, Siwei, Shu, Yu, Li, Zhaoqun, Wang, Zekun, Lin, Chenghua, Huang, Wenhao, Fu, Jie

arXiv.org Artificial IntelligenceApr-24-2023

Pre-trained large-scale language models (LLMs) have shown revolutionary performance in many downstream tasks (Guo et al., 2023; Wei et al., 2021). One crucial ability of LLMs is called instruction following. That is, models can complete the tasks described by instructions given as input. This ability is based on a specialized training stage called instruction tuning. Compared to unlabeled data used for pre-training, the data for instruction tuning is typically more goal-oriented, and it should explicitly demonstrate how a response follows its corresponding instruction with a given input. There are many instruction tuning datasets in English. For example, the FLAN collection (Longpre et al., 2023) contains 15M examples covering 1836 tasks, and OPT-IML (Iyer et al., 2022b) claims to have 18M examples for more than 2000 tasks (although it is still not publicly available). In contrast, existing data resources for Chinese instruction tuning are either small in scale or have questionable quality. For example, Ziang Leng and Li (2023) directly translate English instruction tuning data into Chinese, but do not consider mitigating translation errors or potential cultural gaps, e.g.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2304.07987

Country: Asia > China (0.47)

Genre: Research Report (0.82)

Industry: Education > Assessment & Standards (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits

Liu, Ruibo, Jia, Chenyan, Zhang, Ge, Zhuang, Ziyu, Liu, Tony X, Vosoughi, Soroush

arXiv.org Artificial IntelligenceJan-4-2023

We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned text, with LM fine-tuning and additional refinement through reinforcement learning, Second Thought not only achieves superior performance in three value alignment benchmark datasets but also shows strong human-value transfer learning ability in few-shot scenarios. The generated editing steps also offer better interpretability and ease for interactive error correction. Extensive human evaluations further confirm its effectiveness.

artificial intelligence, machine learning, second thought, (3 more...)

arXiv.org Artificial Intelligence

2301.00355

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.87)

Add feedback

CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation

Zhang, Ge, Li, Yizhi, Wu, Yaoyao, Zhang, Linyuan, Lin, Chenghua, Geng, Jiayi, Wang, Shi, Fu, Jie

arXiv.org Artificial IntelligenceJan-1-2023

As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation CORGI-PM, which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which requires the models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To our best knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.

artificial intelligence, chinese corpus, natural language, (3 more...)

arXiv.org Artificial Intelligence

2301.00395

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning

Li, Yizhi, Yuan, Ruibin, Zhang, Ge, Ma, Yinghao, Lin, Chenghua, Chen, Xingran, Ragni, Anton, Yin, Hanzhi, Hu, Zhijie, He, Haoyu, Benetos, Emmanouil, Gyenge, Norbert, Liu, Ruibo, Fu, Jie

arXiv.org Artificial IntelligenceDec-5-2022

The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL). However, it still remains unexplored how to build a framework for learning useful representations of raw music waveforms in a self-supervised manner. In this work, we design Music2Vec, a framework exploring different SSL algorithmic components and tricks for music audio recordings. Our model achieves comparable results to the state-of-the-art (SOTA) music SSL model Jukebox, despite being significantly smaller with less than 2% of parameters of the latter. The model will be released on Huggingface(Please refer to: https://huggingface.co/m-a-p/music2vec-v1)

artificial intelligence, machine learning, representation, (10 more...)

arXiv.org Artificial Intelligence

2212.02508

Country:

Asia (0.48)
Europe (0.47)
North America > United States (0.30)

Genre: Research Report (0.50)

Industry:

Media > Music (0.71)
Leisure & Entertainment (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

Add feedback

HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models

Li, Yizhi, Zhang, Ge, Yang, Bohao, Lin, Chenghua, Wang, Shi, Ragni, Anton, Fu, Jie

arXiv.org Artificial IntelligenceNov-5-2022

Fairness has become a trending topic in natural language processing (NLP), which addresses biases targeting certain social groups such as genders and religions. However, regional bias in language models (LMs), a long-standing global discrimination problem, still remains unexplored. This paper bridges the gap by analysing the regional bias learned by the pre-trained language models that are broadly used in NLP tasks. In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups. We accordingly propose a HiErarchical Regional Bias evaluation method (HERB) utilising the information from the sub-region clusters to quantify the bias in pre-trained LMs. Experiments show that our hierarchical metric can effectively evaluate the regional bias with respect to comprehensive topics and measure the potential regional bias that can be propagated to downstream tasks. Our codes are available at https://github.com/Bernard-Yang/HERB.

artificial intelligence, natural language, regional bias, (16 more...)

arXiv.org Artificial Intelligence

2211.02882

Country:

Europe (1.00)
Africa (0.93)
North America > Mexico (0.69)
(2 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Tilting the playing field: Dynamical loss functions for machine learning

Ruiz-Garcia, Miguel, Zhang, Ge, Schoenholz, Samuel S., Liu, Andrea J.

arXiv.org Machine LearningFeb-13-2021

We show that learning can be improved by using loss functions that evolve cyclically during training to emphasize one class at a time. In underparameterized networks, such dynamical loss functions can lead to successful training for networks that fail to find a deep minima of the standard cross-entropy loss. In overparameterized networks, dynamical loss functions can lead to better generalization. Improvement arises from the interplay of the changing loss landscape with the dynamics of the system as it evolves to minimize the loss. In particular, as the loss function oscillates, instabilities develop in the form of bifurcation cascades, which we study using the Hessian and Neural Tangent Kernel. Valleys in the landscape widen and deepen, and then narrow and rise as the loss landscape changes during a cycle. As the landscape narrows, the learning rate becomes too large and the network becomes unstable and bounces around the valley. This process ultimately pushes the system into deeper and wider regions of the loss landscape and is characterized by decreasing eigenvalues of the Hessian. This results in better regularized models with improved generalization performance.

artificial intelligence, loss function, neural network, (16 more...)

arXiv.org Machine Learning

2102.03793

Country:

Europe (1.00)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis

Zhang, Ge, Merrill, Mike A., Liu, Yang, Heer, Jeffrey, Althoff, Tim

arXiv.org Machine LearningAug-28-2020

Large scale analysis of source code, and in particular scientific source code, holds the promise of better understanding the data science process, identifying analytical best practices, and providing insights to the builders of scientific toolkits. However, large corpora have remained unanalyzed in depth, as descriptive labels are absent and require expert domain knowledge to generate. We propose a novel weakly supervised transformer-based architecture for computing joint representations of code from both abstract syntax trees and surrounding natural language comments. We then evaluate the model on a new classification task for labeling computational notebook cells as stages in the data analysis process from data import to wrangling, exploration, modeling, and evaluation. We show that our model, leveraging only easily-available weak supervision, achieves a 38% increase in accuracy over expert-supplied heuristics and outperforms a suite of baselines. Our model enables us to examine a set of 118,000 Jupyter Notebooks to uncover common data analysis patterns. Focusing on notebooks with relationships to academic articles, we conduct the largest ever study of scientific code and find that notebook composition correlates with the citation count of corresponding papers.

artificial intelligence, neural network, notebook, (19 more...)

arXiv.org Machine Learning

2008.12828

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > China (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback