Liu, Runyu
The infrastructure powering IBM's Gen AI model development
Gershon, Talia, Seelam, Seetharami, Belgodere, Brian, Bonilla, Milton, Hoang, Lan, Barnett, Danny, Chung, I-Hsin, Mohan, Apoorve, Chen, Ming-Hung, Luo, Lixiang, Walkup, Robert, Evangelinos, Constantinos, Salaria, Shweta, Dombrowa, Marc, Park, Yoonho, Kayi, Apo, Schour, Liran, Alim, Alim, Sydney, Ali, Maniotis, Pavlos, Schares, Laurent, Metzler, Bernard, Karacali-Akyamac, Bengi, Wen, Sophia, Chiba, Tatsuhiro, Choochotkaew, Sunyanan, Yoshimura, Takeshi, Misale, Claudia, Elengikal, Tonia, Connor, Kevin O, Liu, Zhuoran, Molina, Richard, Schneidenbach, Lars, Caden, James, Laibinis, Christopher, Fonseca, Carlos, Tarasov, Vasily, Sundararaman, Swaminathan, Schmuck, Frank, Guthridge, Scott, Cohn, Jeremy, Eshel, Marc, Muench, Paul, Liu, Runyu, Pointer, William, Wyskida, Drew, Krull, Bob, Rose, Ray, Wolfe, Brent, Cornejo, William, Walter, John, Malone, Colm, Perucci, Clifford, Franco, Frank, Hinds, Nigel, Calio, Bob, Druyan, Pavel, Kilduff, Robert, Kienle, John, McStay, Connor, Figueroa, Andrew, Connolly, Matthew, Fost, Edie, Roma, Gina, Fonseca, Jake, Levy, Ido, Payne, Michele, Schenkel, Ryan, Malki, Amir, Schneider, Lion, Narkhede, Aniruddha, Moshref, Shekeba, Kisin, Alexandra, Dodin, Olga, Rippon, Bill, Wrieth, Henry, Ganci, John, Colino, Johnny, Habeger-Rose, Donna, Pandey, Rakesh, Gidh, Aditya, Gaur, Aditya, Patterson, Dennis, Salmani, Samsuddin, Varma, Rambilas, Rumana, Rumana, Sharma, Shubham, Gaur, Aditya, Mishra, Mayank, Panda, Rameswar, Prasad, Aditya, Stallone, Matt, Zhang, Gaoyuan, Shen, Yikang, Cox, David, Puri, Ruchir, Agrawal, Dakshi, Thorstensen, Drew, Belog, Joel, Tang, Brent, Gupta, Saurabh Kumar, Biswas, Amitabha, Maheshwari, Anup, Gampel, Eran, Van Patten, Jason, Runion, Matthew, Kaki, Sai, Bogin, Yigal, Reitz, Brian, Pritko, Steve, Najam, Shahan, Nambala, Surya, Chirra, Radhika, Welp, Rick, DiMitri, Frank, Telles, Felipe, Arvelo, Amilcar, Chu, King, Seminaro, Ed, Schram, Andrew, Eickhoff, Felix, Hanson, William, Mckeever, Eric, Joseph, Dinakaran, 
Chaudhary, Piyush, Shivam, Piyush, Chaudhary, Puneet, Jones, Wesley, Guthrie, Robert, Bostic, Chris, Islam, Rezaul, Duersch, Steve, Sawdon, Wayne, Lewars, John, Klos, Matthew, Spriggs, Michael, McMillan, Bill, Gao, George, Kamra, Ashish, Singh, Gaurav, Curry, Marc, Katarki, Tushar, Talerico, Joe, Shi, Zenghui, Malleni, Sai Sindhur, Gallen, Erwan
AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.
An Ontology-Based Artificial Intelligence Model for Medicine Side-Effect Prediction: Taking Traditional Chinese Medicine as An Example
Wang, Zeheng, Lu, Kun, Cao, Jun, Yao, Yuanzhe, Li, Liang, Liu, Runyu, Liu, Zhiyuan, Yan, Jing
Artificial intelligence (AI) is a modern technology used in various fields of medicine [1-3]. Meanwhile, Traditional Chinese Medicine (TCM) is now widely considered a promising alternative medicine for complementary treatment of cancers and chronic diseases, owing to the effective methodology developed in practice by generations of doctors over almost 4000 years [4]. Previous verification has established many correlations between TCM syndromes and Western diseases, suggesting novel approaches for enhancing treatment efficiency and developing medicines based on TCM methodologies [5]. Unfortunately, because of the remarkable gap between modern informatics and the foundation of TCM, ancient Chinese philosophy, such correlations remain too elusive to be formulated precisely. Therefore, in order to uncover the deep connection between modern science and TCM, research combining TCM with AI for valid knowledge acquisition and mining has recently attracted considerable attention, leading to many profound works, such as ontology information system design [6], latent tree model design [7], a TCM warehouse for AI applications [8], and digital knowledge graph development [2]. However, researchers face many difficulties in setting up AI for TCM, particularly in directly translating the TCM semantic system (recorded almost entirely in ancient Chinese doctrines) into a structured database. This approach places a considerable workload on the limited number of experts proficient in both AI and TCM, who must translate the TCM terminology and then formulate the modern model from it. In contrast, as shown in Figure 1, the use of TCM methodology to address problems in modern science, such as new medicine design, remains relatively unexplored and is therefore well worth investigating.