
Collaborating Authors

 Chen, Xiaoxi


Baichuan-Omni-1.5 Technical Report

arXiv.org Artificial Intelligence

We introduce Baichuan-Omni-1.5, an omni-modal model that not only offers omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent, high-quality interaction across modalities without compromising the capabilities of any single modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pipeline for multimodal data, obtaining about 500B high-quality samples spanning text, audio, and vision. Second, we design an audio tokenizer (Baichuan-Audio-Tokenizer) that captures both semantic and acoustic information from audio, enabling seamless integration and enhanced compatibility with the MLLM. Lastly, we design a multi-stage training strategy that progressively integrates multimodal alignment and multitask fine-tuning, ensuring effective synergy across all modalities. Baichuan-Omni-1.5 leads contemporary models (including GPT-4o-mini and MiniCPM-o 2.6) in comprehensive omni-modal capabilities. Notably, it achieves results comparable to leading models such as Qwen2-VL-72B on various multimodal medical benchmarks.
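The report excerpted above does not give implementation details, but the idea of an audio tokenizer that captures both semantic and acoustic information can be sketched in a few lines of PyTorch. The two-branch encoder, the fusion layer, and the single nearest-neighbour codebook below are assumptions made for illustration only, not the released Baichuan-Audio-Tokenizer design.

import torch
import torch.nn as nn

class DualStreamAudioTokenizer(nn.Module):
    """Toy tokenizer fusing a semantic and an acoustic branch (assumed design)."""

    def __init__(self, n_mels: int = 80, dim: int = 256, codebook_size: int = 1024):
        super().__init__()
        # Semantic branch: models content (what is being said).
        self.semantic_enc = nn.GRU(n_mels, dim, batch_first=True)
        # Acoustic branch: models timbre and prosody (how it is said).
        self.acoustic_enc = nn.Conv1d(n_mels, dim, kernel_size=5, padding=2)
        self.fuse = nn.Linear(2 * dim, dim)
        # Discrete codebook so audio can be passed to the MLLM as token ids.
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, time, n_mels) log-mel spectrogram frames
        sem, _ = self.semantic_enc(mel)                               # (B, T, dim)
        aco = self.acoustic_enc(mel.transpose(1, 2)).transpose(1, 2)  # (B, T, dim)
        z = self.fuse(torch.cat([sem, aco], dim=-1))                  # (B, T, dim)
        # Nearest-neighbour quantization to discrete audio tokens.
        dists = torch.cdist(z.reshape(-1, z.size(-1)), self.codebook.weight)
        return dists.argmin(dim=-1).view(mel.size(0), mel.size(1))    # (B, T)

tokens = DualStreamAudioTokenizer()(torch.randn(2, 100, 80))
print(tokens.shape)  # torch.Size([2, 100])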


Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training?

arXiv.org Artificial Intelligence

Interactive segmentation, an integration of AI algorithms and human expertise, promises to improve the accuracy and efficiency of curating large-scale, detailed-annotated datasets in healthcare. Human experts revise the annotations predicted by the AI, and in turn, the AI improves its predictions by learning from these revised annotations. This interactive process continues to enhance annotation quality until no major revision is needed from experts. The key challenge is how to leverage AI-predicted and expert-revised annotations to iteratively improve the AI. Two problems arise: (1) the risk of catastrophic forgetting: the AI tends to forget previously learned classes if it is retrained only on the expert-revised classes; and (2) computational inefficiency when retraining the AI on both AI-predicted and expert-revised annotations; moreover, given that AI-predicted annotations dominate the dataset, the contribution of the newly revised annotations, which often account for only a very small fraction, remains marginal to training. This paper proposes Continual Tuning to address these problems from two perspectives: network design and data reuse. First, we design a shared network for all classes followed by class-specific networks dedicated to individual classes. To mitigate forgetting, we freeze the shared network for previously learned classes and only update the class-specific networks for the revised classes. Second, we reuse a small fraction of data with previous annotations to avoid excessive computation. The selection of such data relies on an importance estimate for each data point, computed by combining the uncertainty and consistency of the AI's predictions. Our experiments demonstrate that Continual Tuning is 16x faster than repeatedly training the AI from scratch without compromising performance.
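The two ideas in this abstract, a shared network with class-specific heads that are selectively frozen, and an importance score combining uncertainty and consistency for data reuse, can be sketched in PyTorch as follows. The toy 3D backbone, module sizes, and the exact rule for combining the two scores are assumptions for illustration, not the authors' released implementation.

import torch
import torch.nn as nn

class ContinualSegmenter(nn.Module):
    """Shared encoder followed by one lightweight head per class."""

    def __init__(self, num_classes: int, feat_dim: int = 32):
        super().__init__()
        # Shared network: generic features reused by every class.
        self.shared = nn.Sequential(
            nn.Conv3d(1, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv3d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        # Class-specific networks: one binary segmentation head per class.
        self.heads = nn.ModuleList(nn.Conv3d(feat_dim, 1, 1) for _ in range(num_classes))

    def forward(self, x):
        feats = self.shared(x)
        return torch.cat([head(feats) for head in self.heads], dim=1)

def freeze_for_revised_classes(model: ContinualSegmenter, revised: set):
    """Freeze the shared network and all heads except those of revised classes,
    so previously learned classes are not overwritten (mitigates forgetting)."""
    for p in model.shared.parameters():
        p.requires_grad = False
    for cls, head in enumerate(model.heads):
        for p in head.parameters():
            p.requires_grad = cls in revised

def importance_score(probs: torch.Tensor, prev_probs: torch.Tensor) -> torch.Tensor:
    """Rank previously annotated cases for reuse by combining prediction
    uncertainty (entropy) and consistency between training rounds."""
    eps = 1e-6
    uncertainty = -(probs * (probs + eps).log()).sum(dim=1).mean(dim=(-3, -2, -1))
    consistency = 1.0 - (probs - prev_probs).abs().mean(dim=(1, -3, -2, -1))
    return uncertainty * (1.0 - consistency)  # assumed combination rule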


MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

arXiv.org Artificial Intelligence

Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is evident from the numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, the majority of our shapes are modeled directly on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality and 3D printing. As examples, we present use cases in the classification of brain tumors, facial and skull reconstruction, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
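The abstract mentions a web interface and a Python API for access but does not spell out their calls, so the sketch below only shows generic preprocessing of a downloaded shape for a discriminative benchmark using the trimesh library; the file name is a placeholder and the download step is omitted.

import trimesh

# Load a patient-derived anatomical mesh (placeholder path, e.g. an STL export).
mesh = trimesh.load("example_anatomy.stl", force="mesh")

# Center and scale to a unit bounding box so different anatomies are comparable.
mesh.apply_translation(-mesh.centroid)
mesh.apply_scale(1.0 / mesh.extents.max())

# Sample a fixed-size point cloud, a common input format for shape classifiers.
points = mesh.sample(2048)
print(points.shape)  # (2048, 3), ready for a PointNet-style model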


Baichuan 2: Open Large-scale Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks such as MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.
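Because the abstract highlights few-shot behaviour and the release of checkpoints, a minimal usage sketch with Hugging Face transformers is given below. The repository id and the tiny two-shot prompt are assumptions for illustration; consult the official release for the exact model names.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# A few natural-language examples are often enough to steer the base model.
prompt = (
    "Translate English to French.\n"
    "sea -> mer\n"
    "sky -> ciel\n"
    "moon ->"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0], skip_special_tokens=True))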