AITopics | Zeng, Xiangrong

Zeng, Xiangrong

Baichuan-M1: Pushing the Medical Capability of Large Language Models

Wang, Bingning, Zhao, Haizhou, Zhou, Huozhi, Song, Liang, Xu, Mingyu, Cheng, Wei, Zeng, Xiangrong, Zhang, Yupeng, Huo, Yuqi, Wang, Zecheng, Zhao, Zhengyun, Pan, Da, Yang, Fan, Kou, Fei, Li, Fei, Chen, Fuzhong, Dong, Guosheng, Liu, Han, Zhang, Hongda, He, Jin, Yang, Jinjie, Wu, Kangxi, Wu, Kegeng, Su, Lei, Niu, Linlin, Sun, Linzhuang, Wang, Mang, Fan, Pengcheng, Shen, Qianli, Xin, Rihui, Dang, Shunya, Zhou, Songchi, Chen, Weipeng, Luo, Wenjing, Chen, Xin, Men, Xin, Lin, Xionghai, Dong, Xuezhen, Zhang, Yan, Duan, Yifei, Zhou, Yuyan, Ma, Zhi, Wu, Zhiying

arXiv.org Artificial Intelligence, Feb-18-2025

The current generation of large language models (LLMs) is typically designed for broad, general-purpose applications, while domain-specific LLMs, especially in vertical fields like medicine, remain relatively scarce. In particular, the development of highly efficient and practical LLMs for the medical domain is challenging due to the complexity of medical knowledge and the limited availability of high-quality data. To bridge this gap, we introduce Baichuan-M1, a series of large language models specifically optimized for medical applications. Unlike traditional approaches that simply continue pretraining on existing models or apply post-training to a general base model, Baichuan-M1 is trained from scratch with a dedicated focus on enhancing medical capabilities. Our model is trained on 20 trillion tokens and incorporates a range of effective training methods that strike a balance between general capabilities and medical expertise. As a result, Baichuan-M1 not only performs strongly across general domains such as mathematics and coding but also excels in specialized medical fields. We have open-sourced Baichuan-M1-14B, a mini version of our model, which can be accessed through the following links.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv:2502.12671

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Baichuan 2: Open Large-scale Language Models

Yang, Aiyuan, Xiao, Bin, Wang, Bingning, Zhang, Borong, Bian, Ce, Yin, Chao, Lv, Chenxu, Pan, Da, Wang, Dian, Yan, Dong, Yang, Fan, Deng, Fei, Wang, Feng, Liu, Feng, Ai, Guangwei, Dong, Guosheng, Zhao, Haizhou, Xu, Hang, Sun, Haoze, Zhang, Hongda, Liu, Hui, Ji, Jiaming, Xie, Jian, Dai, JunTao, Fang, Kun, Su, Lei, Song, Liang, Liu, Lifeng, Ru, Liyun, Ma, Luyao, Wang, Mang, Liu, Mickel, Lin, MingAn, Nie, Nuolan, Guo, Peidong, Sun, Ruiyang, Zhang, Tao, Li, Tianpeng, Li, Tianyu, Cheng, Wei, Chen, Weipeng, Zeng, Xiangrong, Wang, Xiaochuan, Chen, Xiaoxi, Men, Xin, Yu, Xin, Pan, Xuehai, Shen, Yanjun, Wang, Yiding, Li, Yiyu, Jiang, Youxin, Gao, Yuchen, Zhang, Yupeng, Zhou, Zenan, Wu, Zhiying

arXiv.org Artificial Intelligence, Sep-20-2023

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks such as MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to help the research community better understand the training dynamics of Baichuan 2.

large language model, natural language, open large-scale language model, (2 more...)

arXiv:2309.10305

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)

Path-based knowledge reasoning with textual semantic information for medical knowledge graph completion

Lan, Yinyu, He, Shizhu, Zeng, Xiangrong, Liu, Shengping, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence, May-27-2021

Background: Knowledge graphs (KGs), especially medical knowledge graphs, are often significantly incomplete, which creates a demand for medical knowledge graph completion (MedKGC). MedKGC can find new facts based on the existing knowledge in the KGs. The path-based knowledge reasoning algorithm is one of the most important approaches to this task. This type of method has received great attention in recent years because of its high performance and interpretability. Traditional methods such as the path ranking algorithm (PRA) take the paths between an entity pair as atomic features. However, medical KGs are very sparse, which makes it difficult to learn effective semantic representations for extremely sparse path features. The sparsity in medical KGs is mainly reflected in the long-tailed distribution of entities and paths. Previous methods consider only the context structure of the paths in the knowledge graph and ignore the textual semantics of the symbols in the path. As a result, entity sparseness and path sparseness limit how far their performance can be improved. To address these issues, this paper proposes two novel path-based reasoning methods, one for entity sparsity and one for path sparsity, both of which use the textual semantic information of entities and paths for MedKGC. By using the pre-trained model BERT to combine the textual semantic representations of the entities and the relationships, we model symbolic reasoning in the medical KG as numerical computation over textual semantic representations.
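
To make the idea of "paths between an entity pair as atomic features" concrete, here is a minimal Python sketch that enumerates relation paths in a toy graph. It is not the authors' method: the entity and relation names, the helper names `relation_paths` and `path_to_text`, and the `max_len` cutoff are all made up for illustration. The last helper only shows how a symbolic path could be turned into plain text that a text encoder such as BERT might consume; no encoder is called here.

```python
from collections import defaultdict


def relation_paths(triples, source, target, max_len=3):
    """Enumerate relation paths (tuples of relation names) from source to target."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))

    paths = []

    def walk(node, path, visited):
        # Record the path as soon as the target is reached.
        if node == target and path:
            paths.append(tuple(path))
            return
        # Stop extending once the path has max_len relations.
        if len(path) == max_len:
            return
        for relation, nxt in graph[node]:
            if nxt not in visited:  # avoid cycles
                walk(nxt, path + [relation], visited | {nxt})

    walk(source, [], {source})
    return sorted(paths)


def path_to_text(path):
    """Turn a symbolic relation path into a plain-text string."""
    return " ".join(relation.replace("_", " ") for relation in path)


# Hypothetical toy graph; the names carry no medical meaning.
toy_triples = [
    ("drug_A", "treats", "disease_C"),
    ("drug_A", "binds_to", "protein_B"),
    ("protein_B", "associated_with", "disease_C"),
    ("disease_C", "has_symptom", "symptom_D"),
]

paths = relation_paths(toy_triples, "drug_A", "disease_C")
for p in paths:
    print(p, "->", path_to_text(p))
```

In a sparse graph most entity pairs share very few such paths, which is the sparsity problem the abstract describes; the text form of a path gives a model something to generalize from even when the exact path is rare.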

neurology, representation, text processing, (18 more...)

arXiv:2105.13074

Country:

Asia > China (0.15)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Large Scaled Relation Extraction With Reinforcement Learning

Zeng, Xiangrong (Institute of Automation, Chinese Academy of Sciences) | He, Shizhu (Institute of Automation, Chinese Academy of Sciences) | Liu, Kang (Institute of Automation, Chinese Academy of Sciences) | Zhao, Jun (Institute of Automation, Chinese Academy of Sciences)

AAAI Conferences, Feb-8-2018

Sentence relation extraction aims to extract relational facts from sentences and is an important task in natural language processing. Previous models rely on manually labeled supervised datasets. However, human annotation is costly and limits both the number of relations and the data size, which makes it difficult to scale to large domains. To conduct large-scale relation extraction, we heuristically align an existing knowledge base with texts, which does not rely on human annotation and is easy to scale. However, using distantly supervised data for relation extraction poses a new challenge: sentences in the distantly supervised dataset are not directly labeled, and not all sentences that mention an entity pair express the relation between them. To solve this problem, we propose a novel model with reinforcement learning. The relation of the entity pair is used as distant supervision and guides the training of the relation extractor through reinforcement learning. We conduct two types of experiments on a publicly released dataset. Experimental results demonstrate the effectiveness of the proposed method compared with baseline models, achieving a 13.36% improvement.
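
The heuristic alignment step, and the label noise it introduces, can be illustrated with a short Python sketch. This is a simplified stand-in and not the paper's pipeline: the function name `distant_label`, the substring matching, and the toy sentences are assumptions made for illustration only.

```python
def distant_label(sentences, kb_triples):
    """Heuristically align knowledge-base triples with sentences.

    Every sentence that mentions both entities of a triple receives that
    triple's relation as a label, whether or not the sentence actually
    expresses the relation.
    """
    labeled = []
    for sentence in sentences:
        for head, relation, tail in kb_triples:
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, tail, relation))
    return labeled


# Toy knowledge base and corpus, for illustration only.
kb = [("Barack Obama", "born_in", "Hawaii")]
sentences = [
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last week.",
    "Hawaii is a state in the Pacific.",
]

labeled = distant_label(sentences, kb)
for row in labeled:
    print(row)
```

Both of the first two sentences receive the label `born_in`, but only the first one expresses it. That mismatch is the noise the abstract refers to when it says that not every sentence mentioning an entity pair represents the relation between them.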

deep learning, neural network, relation, (20 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.15)
North America > United States (0.14)
Asia > India (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A novel sparsity and clustering regularization

Zeng, Xiangrong, Figueiredo, Mário A. T.

arXiv.org Machine Learning, Feb-20-2014

We propose a novel SPARsity and Clustering (SPARC) regularizer, a modified version of the earlier octagonal shrinkage and clustering algorithm for regression (OSCAR). The proposed regularizer consists of a $K$-sparse constraint and a pair-wise $\ell_{\infty}$ norm restricted to the $K$ largest components in magnitude. It is able to separably enforce $K$-sparsity and encourage the non-zeros to be equal in magnitude. Moreover, it can accurately group the features without shrinking their magnitude. SPARC is closely related to OSCAR, so the proximity operator of the former can be efficiently computed based on that of the latter, which allows proximal splitting algorithms to be used to solve problems with SPARC regularization. Experiments on synthetic data and on benchmark breast cancer data show that SPARC is a competitive group-sparsity-inducing regularizer for regression and classification.
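
As a rough illustration, the sketch below evaluates a penalty built from the two ingredients the abstract names: a $K$-sparse constraint and a pair-wise $\ell_{\infty}$ term over the $K$ largest magnitudes. This is one plausible reading of the abstract, not the paper's exact definition; the function name `sparc_penalty`, the weight `lam`, and the choice to return infinity when the constraint is violated are assumptions.

```python
import math
from itertools import combinations


def sparc_penalty(x, k, lam):
    """Assumed SPARC-style penalty.

    Returns infinity if x has more than k non-zeros (the K-sparse
    constraint is violated); otherwise returns lam times the sum of
    max(|x_i|, |x_j|) over all pairs among the k largest magnitudes.
    """
    mags = sorted((abs(v) for v in x), reverse=True)
    nonzeros = sum(1 for m in mags if m > 0)
    if nonzeros > k:
        return math.inf
    top_k = mags[:k]
    return lam * sum(max(a, b) for a, b in combinations(top_k, 2))


print(sparc_penalty([3.0, 0.0, -1.0, 0.0], k=2, lam=0.5))  # feasible: two non-zeros
print(sparc_penalty([3.0, 2.0, 1.0], k=2, lam=0.5))        # infeasible: three non-zeros
```

In this sketch, the sparsity constraint and the pair-wise term are handled by separate pieces of code, which mirrors the abstract's point that sparsity and grouping are enforced separably.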

algorithm, health & medicine, oncology, (18 more...)

arXiv:1310.4945

Country: Europe > Portugal (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Group-sparse Matrix Recovery

Zeng, Xiangrong, Figueiredo, Mário A. T.

arXiv.org Machine Learning, Feb-20-2014

We apply OSCAR (octagonal selection and clustering algorithm for regression) to recover group-sparse matrices (two-dimensional, or 2D, arrays) from compressive measurements. We propose a 2D version of OSCAR (2OSCAR) consisting of the $\ell_1$ norm and the pair-wise $\ell_{\infty}$ norm, which is convex but non-differentiable. We show that the proximity operator of 2OSCAR can be computed based on that of OSCAR. The 2OSCAR problem can thus be efficiently solved by state-of-the-art proximal splitting algorithms. Experiments on group-sparse 2D array recovery show that 2OSCAR regularization solved by the SpaRSA algorithm is the fastest choice, while the PADMM algorithm (with debiasing) yields the most accurate results.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv:1402.5077

Country: Europe > Portugal (0.14)

Genre: Research Report (0.40)

Industry: Media (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Data Science (0.88)

Solving OSCAR regularization problems by proximal splitting algorithms

Zeng, Xiangrong, Figueiredo, Mário A. T.

arXiv.org Machine Learning, Sep-27-2013

The OSCAR (octagonal selection and clustering algorithm for regression) regularizer consists of an $\ell_1$ norm plus a pair-wise $\ell_{\infty}$ norm (responsible for its grouping behavior) and was proposed to encourage group sparsity in scenarios where the groups are a priori unknown. The OSCAR regularizer has a non-trivial proximity operator, which limits its applicability. We reformulate this regularizer as a weighted sorted $\ell_1$ norm, and propose its grouping proximity operator (GPO) and approximate proximity operator (APO), thus making state-of-the-art proximal splitting algorithms (PSAs) available to solve inverse problems with OSCAR regularization. The GPO is in fact the APO followed by additional grouping and averaging operations, which are costly in time and storage; this explains why algorithms with APO are much faster than those with GPO. Convergence of PSAs with GPO is guaranteed since GPO is an exact proximity operator. Although convergence of PSAs with APO is not guaranteed, we have experimentally found that APO behaves similarly to GPO when the regularization parameter of the pair-wise $\ell_{\infty}$ norm is set to an appropriately small value. Experiments on recovery of group-sparse signals (with unknown groups) show that PSAs with APO are very fast and accurate.
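
The reformulation as a weighted sorted norm can be checked numerically. The sketch below computes the OSCAR penalty in two ways: directly from its pair-wise definition, and as a weighted sum of the magnitudes sorted in decreasing order. The function names and the parameters `lam1` and `lam2` are chosen here for illustration, and the code covers only the penalty value, not the GPO or APO proximity operators.

```python
from itertools import combinations


def oscar_pairwise(x, lam1, lam2):
    """OSCAR penalty in pair-wise form:
    lam1 * sum_i |x_i| + lam2 * sum_{i<j} max(|x_i|, |x_j|)."""
    mags = [abs(v) for v in x]
    l1_term = sum(mags)
    pair_term = sum(max(a, b) for a, b in combinations(mags, 2))
    return lam1 * l1_term + lam2 * pair_term


def oscar_sorted_l1(x, lam1, lam2):
    """The same penalty written as a weighted sorted l1 norm.

    With magnitudes sorted in decreasing order, the entry at 0-based
    position i is the maximum in exactly n - 1 - i pairs, so its weight
    is lam1 + lam2 * (n - 1 - i).
    """
    mags = sorted((abs(v) for v in x), reverse=True)
    n = len(mags)
    return sum((lam1 + lam2 * (n - 1 - i)) * m for i, m in enumerate(mags))


x = [3.0, -1.0, 2.0, 0.0]
print(oscar_pairwise(x, 1.0, 0.5), oscar_sorted_l1(x, 1.0, 0.5))
```

The two functions agree because each pair contributes its larger magnitude once, so the largest entry is counted in n - 1 pairs, the next largest in n - 2, and so on. The sorted form needs one sort instead of a loop over all pairs.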

algorithm, artificial intelligence, optimization problem, (17 more...)

arXiv:1309.6301

Country: Europe > Portugal (0.14)

Genre: Research Report (0.64)

Industry:

Media > Film (0.55)
Leisure & Entertainment (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
