Choi, Michael
Iterative Multi-Agent Reinforcement Learning: A Novel Approach Toward Real-World Multi-Echelon Inventory Optimization
Ziegner, Georg, Choi, Michael, Le, Hung Mac Chan, Sakhuja, Sahil, Sarmadi, Arash
Multi-echelon inventory optimization (MEIO) is critical for effective supply chain management, but its inherent complexity can pose significant challenges. Heuristics are commonly used to address this complexity, yet they often face limitations in scope and scalability. Recent research has found deep reinforcement learning (DRL) to be a promising alternative to traditional heuristics, offering greater versatility by utilizing dynamic decision-making capabilities. However, since DRL is known to struggle with the curse of dimensionality, its relevance to complex real-life supply chain scenarios is still to be determined. This thesis investigates DRL's applicability to MEIO problems of increasing complexity. A state-of-the-art DRL model was replicated, enhanced, and tested across 13 supply chain scenarios, combining diverse network structures and parameters. To address DRL's challenges with dimensionality, additional models leveraging graph neural networks (GNNs) and multi-agent reinforcement learning (MARL) were developed, culminating in the novel iterative multi-agent reinforcement learning (IMARL) approach. IMARL demonstrated superior scalability, effectiveness, and reliability in optimizing inventory policies, consistently outperforming benchmarks. These findings confirm the potential of DRL, particularly IMARL, to address real-world supply chain challenges and call for additional research to further expand its applicability.
Humanity's Last Exam
Phan, Long, Gatti, Alice, Han, Ziwen, Li, Nathaniel, Hu, Josephina, Zhang, Hugh, Zhang, Chen Bo Calvin, Shaaban, Mohamed, Ling, John, Shi, Sean, Choi, Michael, Agrawal, Anish, Chopra, Arnav, Khoja, Adam, Kim, Ryan, Ren, Richard, Hausenloy, Jason, Zhang, Oliver, Mazeika, Mantas, Nguyen, Tung, Anderson, Daron, Shah, Imad Ali, Doroshenko, Mikhail, Stokes, Alun Cennyth, Mahmood, Mobeen, Lee, Jaeho, Pokutnyi, Oleksandr, Iskra, Oleg, Wang, Jessica P., Gerbicz, Robert, Levin, John-Clark, Popov, Serguei, Feng, Fiona, Feng, Steven Y., Zhao, Haoran, Yu, Michael, Gangal, Varun, Zou, Chelsea, Wang, Zihan, Kazakov, Mstyslav, Galgon, Geoff, Schmitt, Johannes, Sanchez, Alvaro, Lee, Yongki, Yeadon, Will, Sauers, Scott, Roth, Marc, Agu, Chidozie, Riis, Søren, Giska, Fabian, Utpala, Saiteja, Cheatom, Antrell, Giboney, Zachary, Goshu, Gashaw M., Crowson, Sarah-Jane, Naiya, Mohinder Maheshbhai, Burns, Noah, Finke, Lennart, Cheng, Zerui, Park, Hyunwoo, Fournier-Facio, Francesco, Zampese, Jennifer, Wydallis, John, Wydallis, John B., Hoerr, Ryan G., Nandor, Mark, Gehrunger, Tim, Cai, Jiaqi, McCarty, Ben, Nam, Jungbae, Taylor, Edwin, Jin, Jun, Loume, Gautier Abou, Cao, Hangrui, Garretson, Alexis C, Sileo, Damien, Ren, Qiuyu, Cojoc, Doru, Arkhipov, Pavel, Qazi, Usman, Bacho, Aras, Li, Lianghui, Motwani, Sumeet, de Witt, Christian Schroeder, Kopylov, Alexei, Veith, Johannes, Singer, Eric, Rissone, Paolo, Jin, Jaehyeok, Shi, Jack Wei Lun, Willcocks, Chris G., Prabhu, Ameya, Tang, Longke, Zhou, Kevin, Santos, Emily de Oliveira, Maksimov, Andrey Pupasov, Vendrow, Edward, Zenitani, Kengo, Robinson, Joshua, Mikov, Aleksandar, Guillod, Julien, Li, Yuqi, Pageler, Ben, Vendrow, Joshua, Kuchkin, Vladyslav, Marion, Pierre, Efremov, Denis, Lynch, Jayson, Liang, Kaiqu, Gritsevskiy, Andrew, Martinez, Dakotah, Crispino, Nick, Zvonkine, Dimitri, Fraga, Natanael Wildner, Soori, Saeed, Press, Ori, Tang, Henry, Salazar, Julian, Green, Sean R., Brüssel, Lina, Twayana, Moon, Dieuleveut, Aymeric, Rogers, T. Ryan, Zhang, Wenjin, Finocchio, Ross, Li, Bikun, Yang, Jinzhou, Rao, Arun, Loiseau, Gabriel, Kalinin, Mikhail, Lukas, Marco, Manolescu, Ciprian, Stambaugh, Nate, Mishra, Subrata, Kamdoum, Ariel Ghislain Kemogne, Hogg, Tad, Jin, Alvin, Bosio, Carlo, Sun, Gongbo, Coppola, Brian P, Heidinger, Haline, Sayous, Rafael, Ivanov, Stefan, Cavanagh, Joseph M, Shen, Jiawei, Imperial, Joseph Marvin, Schwaller, Philippe, Senthilkuma, Shaipranesh, Bran, Andres M, Algaba, Andres, Verbeken, Brecht, Houte, Kelsey Van den, Van Der Sypt, Lynn, Noever, David, Schut, Lisa, Sucholutsky, Ilia, Zheltonozhskii, Evgenii, Yuan, Qiaochu, Lim, Derek, Stanley, Richard, Sivarajan, Shankar, Yang, Tong, Maar, John, Wykowski, Julian, Oller, Martí, Sandlin, Jennifer, Sahu, Anmol, Ardito, Cesare Giulio, Hu, Yuzheng, Dias, Felipe Meneguitti, Kreiman, Tobias, Rawal, Kaivalya, Vilchis, Tobias Garcia, Zu, Yuexuan, Lackner, Martin, Koppel, James, Nguyen, Jeremy, Antonenko, Daniil S., Chern, Steffi, Zhao, Bingchen, Arsene, Pierrot, Ivanov, Sergey, Poświata, Rafał, Wang, Chenguang, Li, Daofeng, Crisostomi, Donato, Dehghan, Ali, Achilleos, Andrea, Ambay, John Arnold, Myklebust, Benjamin, Sen, Archan, Perrella, David, Kaparov, Nurdin, Inlow, Mark H, Zang, Allen, Ramakrishnan, Kalyan, Orel, Daniil, Poritski, Vladislav, Ben-David, Shalev, Berger, Zachary, Whitfill, Parker, Foster, Michael, Munro, Daniel, Ho, Linh, Hava, Dan Bar, Kuchkin, Aleksey, Lauff, Robert, Holmes, David, Sommerhage, Frank, Zhang, Anji, Moat, Richard, Schneider, Keith, Pyda, Daniel, Kazibwe, Zakayo, Singh, Mukhwinder, Clarke, Don, Kim, Dae Hyun, Fish, Sara, Elser, Veit, Vilchis, Victor Efren Guadarrama, Klose, Immo, Demian, Christoph, Anantheswaran, Ujjwala, Zweiger, Adam, Albani, Guglielmo, Li, Jeffery, Daans, Nicolas, Radionov, Maksim, Rozhoň, Václav, Ginis, Vincent, Ma, Ziqiao, Stump, Christian, Platnick, Jacob, Nevirkovets, Volodymyr, Basler, Luke, Piccardo, Marco, Cohen, Niv, Singh, Virendra, Tkadlec, Josef, Rosu, Paul, Goldfarb, Alan, Padlewski, Piotr, Barzowski, Stanislaw, Montgomery, Kyle, Menezes, Aline, Patel, Arkil, Wang, Zixuan, Tucker-Foltz, Jamie, Stade, Jack, Grabb, Declan, Goertzen, Tom, Kazemi, Fereshteh, Milbauer, Jeremiah, Shukla, Abhishek, Elgnainy, Hossam, Labrador, Yan Carlos Leyva, He, Hao, Zhang, Ling, Givré, Alan, Wolff, Hew, Demir, Gözdenur, Aziz, Muhammad Fayez, Kaddar, Younesse, Ängquist, Ivar, Chen, Yanxu, Thornley, Elliott, Zhang, Robin, Pan, Jiayi, Terpin, Antonio, Muennighoff, Niklas, Schoelkopf, Hailey, Zheng, Eric, Carmi, Avishy, Shah, Jainam, Brown, Ethan D. L., Zhu, Kelin, Bartolo, Max, Wheeler, Richard, Ho, Andrew, Barkan, Shaul, Wang, Jiaqi, Stehberger, Martin, Kretov, Egor, Bradshaw, Peter, Heimonen, JP, Sridhar, Kaustubh, Hossain, Zaki, Akov, Ido, Makarychev, Yury, Tam, Joanna, Hoang, Hieu, Cunningham, David M., Goryachev, Vladimir, Patramanis, Demosthenes, Krause, Michael, Redenti, Andrew, Aldous, David, Lai, Jesyin, Coleman, Shannon, Xu, Jiangnan, Lee, Sangwon, Magoulas, Ilias, Zhao, Sandy, Tang, Ning, Cohen, Michael K., Carroll, Micah, Paradise, Orr, Kirchner, Jan Hendrik, Steinerberger, Stefan, Ovchynnikov, Maksym, Matos, Jason O., Shenoy, Adithya, Wang, Michael, Nie, Yuzhou, Giordano, Paolo, Petersen, Philipp, Sztyber-Betley, Anna, Faraboschi, Paolo, Riblet, Robin, Crozier, Jonathan, Halasyamani, Shiv, Pinto, Antonella, Verma, Shreyas, Joshi, Prashant, Meril, Eli, Yong, Zheng-Xin, Tee, Allison, Andréoletti, Jérémy, Weller, Orion, Singhal, Raghav, Zhang, Gang, Ivanov, Alexander, Khoury, Seri, Gustafsson, Nils, Mostaghimi, Hamid, Thaman, Kunvar, Chen, Qijia, Khánh, Tran Quoc, Loader, Jacob, Cavalleri, Stefano, Szlyk, Hannah, Brown, Zachary, Narayan, Himanshu, Roberts, Jonathan, Alley, William, Sun, Kunyang, Stendall, Ryan, Lamparth, Max, Reuel, Anka, Wang, Ting, Xu, Hanmeng, Hernández-Cámara, Pablo, Martin, Freddie, Preu, Thomas, Korbak, Tomek, Abramovitch, Marcus, Williamson, Dominic, Bosio, Ida, Chen, Ziye, Bálint, Biró, Lo, Eve J. Y., Nunes, Maria Inês S., Jiang, Yibo, Bari, M Saiful, Kassani, Peyman, Wang, Zihao, Ansarinejad, Behzad, Sun, Yewen, Durand, Stephane, Douville, Guillaume, Tordera, Daniel, Balabanian, George, Anderson, Earth, Kvistad, Lynna, Moyano, Alejandro José, Milliron, Hsiaoyun, Sakor, Ahmad, Eron, Murat, McAlister, Isaac C., O., Andrew Favre D., Shah, Shailesh, Zhou, Xiaoxiang, Kamalov, Firuz, Clark, Ronald, Abdoli, Sherwin, Santens, Tim, Wang, Harrison K, Chen, Evan, Tomasiello, Alessandro, De Luca, G. Bruno, Looi, Shi-Zhuo, Le, Vinh-Kha, Kolt, Noam, Mündler, Niels, Semler, Avi, Rodman, Emma, Drori, Jacob, Fossum, Carl J, Gloor, Luk, Jagota, Milind, Pradeep, Ronak, Fan, Honglu, Shah, Tej, Eicher, Jonathan, Chen, Michael, Thaman, Kushal, Merrill, William, Firsching, Moritz, Harris, Carter, Ciobâcă, Stefan, Gross, Jason, Pandey, Rohan, Gusev, Ilya, Jones, Adam, Agnihotri, Shashank, Zhelnov, Pavel, Usawasutsakorn, Siranut, Mofayezi, Mohammadreza, Piperski, Alexander, Carauleanu, Marc, Zhang, David K., Dobarskyi, Kostiantyn, Ler, Dylan, Leventov, Roman, Soroko, Ignat, Jansen, Thorben, Creighton, Scott, Lauer, Pascal, Duersch, Joshua, Taamazyan, Vage, Bezzi, Dario, Morak, Wiktor, Ma, Wenjie, Held, William, Huy, Tran Đuc, Xian, Ruicheng, Zebaze, Armel Randy, Mohamed, Mohanad, Leser, Julian Noah, Yuan, Michelle X, Yacar, Laila, Lengler, Johannes, Olszewska, Katarzyna, Shahrtash, Hossein, Oliveira, Edson, Jackson, Joseph W., Gonzalez, Daniel Espinosa, Zou, Andy, Chidambaram, Muthu, Manik, Timothy, Haffenden, Hector, Stander, Dashiell, Dasouqi, Ali, Shen, Alexander, Duc, Emilien, Golshani, Bita, Stap, David, Uzhou, Mikalai, Zhidkovskaya, Alina Borisovna, Lewark, Lukas, Rodriguez, Miguel Orbegozo, Vincze, Mátyás, Wehr, Dustin, Tang, Colin, Phillips, Shaun, Samuele, Fortuna, Muzhen, Jiang, Ekström, Fredrik, Hammon, Angela, Patel, Oam, Farhidi, Faraz, Medley, George, Mohammadzadeh, Forough, Peñaflor, Madellene, Kassahun, Haile, Friedrich, Alena, Sparrow, Claire, Perez, Rayner Hernandez, Sakal, Taom, Dhamane, Omkar, Mirabadi, Ali Khajegili, Hallman, Eric, Okutsu, Kenchi, Battaglia, Mike, Maghsoudimehrabani, Mohammad, Amit, Alon, Hulbert, Dave, Pereira, Roberto, Weber, Simon, Handoko, null, Peristyy, Anton, Malina, Stephen, Albanie, Samuel, Cai, Will, Mehkary, Mustafa, Aly, Rami, Reidegeld, Frank, Dick, Anna-Katharina, Friday, Cary, Sidhu, Jasdeep, Shapourian, Hassan, Kim, Wanyoung, Costa, Mariana, Gurdogan, Hubeyb, Weber, Brian, Kumar, Harsh, Jiang, Tong, Agarwal, Arunim, Ceconello, Chiara, Vaz, Warren S., Zhuang, Chao, Park, Haon, Tawfeek, Andrew R., Aggarwal, Daattavya, Kirchhof, Michael, Dai, Linjie, Kim, Evan, Ferret, Johan, Wang, Yuzhou, Yan, Minghao, Burdzy, Krzysztof, Zhang, Lixin, Franca, Antonio, Pham, Diana T., Loh, Kang Yong, Robinson, Joshua, Jackson, Abram, Gul, Shreen, Chhablani, Gunjan, Du, Zhehang, Cosma, Adrian, Colino, Jesus, White, Colin, Votava, Jacob, Vinnikov, Vladimir, Delaney, Ethan, Spelda, Petr, Stritecky, Vit, Shahid, Syed M., Mourrat, Jean-Christophe, Vetoshkin, Lavr, Sponselee, Koen, Bacho, Renas, de la Rosa, Florencia, Li, Xiuyu, Malod, Guillaume, Lang, Leon, Laurendeau, Julien, Kazakov, Dmitry, Adesanya, Fatimah, Portier, Julien, Hollom, Lawrence, Souza, Victor, Zhou, Yuchen Anna, Degorre, Julien, Yalın, Yiğit, Obikoya, Gbenga Daniel, Arnaboldi, Luca, Rai, null, Bigi, Filippo, Boscá, M. C., Shumar, Oleg, Bacho, Kaniuar, Clavier, Pierre, Recchia, Gabriel, Popescu, Mara, Shulga, Nikita, Tanwie, Ngefor Mildred, Peskoff, Denis, Lux, Thomas C. H., Rank, Ben, Ni, Colin, Brooks, Matthew, Yakimchyk, Alesia, Huanxu, null, Liu, null, Häggström, Olle, Verkama, Emil, Gundlach, Hans, Brito-Santana, Leonor, Amaro, Brian, Vajipey, Vivek, Grover, Rynaa, Fan, Yiyang, Silva, Gabriel Poesia Reis e, Xin, Linwei, Kratish, Yosi, Łucki, Jakub, Li, Wen-Ding, Gopi, Sivakanth, Caciolai, Andrea, Xu, Justin, Scaria, Kevin Joseph, Vargus, Freddie, Habibi, Farzad, Long, null, Lian, null, Rodolà, Emanuele, Robins, Jules, Cheng, Vincent, Fruhauff, Tony, Raynor, Brad, Qi, Hao, Jiang, Xi, Segev, Ben, Fan, Jingxuan, Martinson, Sarah, Wang, Erik Y., Hausknecht, Kaylie, Brenner, Michael P., Mao, Mao, Zhang, Xinyu, Avagian, David, Scipio, Eshawn Jessica, Ragoler, Alon, Tan, Justin, Sims, Blake, Plecnik, Rebeka, Kirtland, Aaron, Bodur, Omer Faruk, Shinde, D. P., Adoul, Zahra, Zekry, Mohamed, Karakoc, Ali, Santos, Tania C. B., Shamseldeen, Samir, Karim, Loukmane, Liakhovitskaia, Anna, Resman, Nate, Farina, Nicholas, Gonzalez, Juan Carlos, Maayan, Gabe, Hoback, Sarah, Pena, Rodrigo De Oliveira, Sherman, Glen, Kelley, Elizabeth, Mariji, Hodjat, Pouriamanesh, Rasoul, Wu, Wentao, Mendoza, Sandra, Alarab, Ismail, Cole, Joshua, Ferreira, Danyelle, Johnson, Bryan, Safdari, Mohammad, Dai, Liangti, Arthornthurasuk, Siriphan, Pronin, Alexey, Fan, Jing, Ramirez-Trinidad, Angel, Cartwright, Ashley, Pottmaier, Daphiny, Taheri, Omid, Outevsky, David, Stepanic, Stanley, Perry, Samuel, Askew, Luke, Rodríguez, Raúl Adrián Huerta, Minissi, Ali M. R., Ali, Sam, Lorena, Ricardo, Iyer, Krishnamurthy, Fasiludeen, Arshad Anil, Salauddin, Sk Md, Islam, Murat, Gonzalez, Juan, Ducey, Josh, Somrak, Maja, Mavroudis, Vasilios, Vergo, Eric, Qin, Juehang, Borbás, Benjámin, Chu, Eric, Lindsey, Jack, Radhakrishnan, Anil, Jallon, Antoine, McInnis, I. M. J., Kumar, Pawan, Goswami, Laxman Prasad, Bugas, Daniel, Heydari, Nasser, Jeanplong, Ferenc, Apronti, Archimedes, Galal, Abdallah, Ze-An, Ng, Singh, Ankit, Xavier, Joan of Arc, Agarwal, Kanu Priya, Berkani, Mohammed, Junior, Benedito Alves de Oliveira, Malishev, Dmitry, Remy, Nicolas, Hartman, Taylor D., Tarver, Tim, Mensah, Stephen, Gimenez, Javier, Montecillo, Roselynn Grace, Campbell, Russell, Sharma, Asankhaya, Meer, Khalida, Alapont, Xavier, Patil, Deepakkumar, Maheshwari, Rajat, Dendane, Abdelkader, Shukla, Priti, Bogdanov, Sergei, Möller, Sören, Siddiqi, Muhammad Rehan, Saxena, Prajvi, Gupta, Himanshu, Enyekwe, Innocent, P, Ragavendran V, EL-Wasif, Zienab, Maksapetyan, Aleksandr, Rossbach, Vivien, Harjadi, Chris, Bahaloohoreh, Mohsen, Bian, Song, Lai, John, Uro, Justine Leon, Bateman, Greg, Sayed, Mohamed, Menshawy, Ahmed, Duclosel, Darling, Jain, Yashaswini, Aaron, Ashley, Tiryakioglu, Murat, Siddh, Sheeshram, Krenek, Keith, Hoover, Alex, McGowan, Joseph, Patwardhan, Tejal, Yue, Summer, Wang, Alexandr, Hendrycks, Dan
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 3,000 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and poor calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To ground research and policymaking in a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.
Jailbreaking to Jailbreak
Kritz, Jeremy, Robinson, Vaughn, Vacareanu, Robert, Varjavand, Bijan, Choi, Michael, Gogov, Bobby, Scale Red Team, Yue, Summer, Primack, Willow E., Wang, Zifan
Refusal training on Large Language Models (LLMs) prevents harmful outputs, yet this defense remains vulnerable to both automated and human-crafted jailbreaks. We present a novel LLM-as-red-teamer approach in which a human jailbreaks a refusal-trained LLM to make it willing to jailbreak itself or other LLMs. We refer to the jailbroken LLMs as $J_2$ attackers, which can systematically evaluate target models using various red teaming strategies and improve their performance via in-context learning from previous failures. Our experiments demonstrate that Sonnet 3.5 and Gemini 1.5 Pro outperform other LLMs as $J_2$, achieving 93.0% and 91.0% attack success rates (ASRs) respectively against GPT-4o (and similar results across other capable LLMs) on HarmBench. Our work not only introduces a scalable approach to strategic red teaming, drawing inspiration from human red teamers, but also highlights jailbreaking-to-jailbreak as an overlooked failure mode of the safeguard. Specifically, an LLM can bypass its own safeguards by employing a jailbroken version of itself that is willing to assist in further jailbreaking. To prevent direct misuse of $J_2$ while advancing research in AI safety, we publicly share our methodology but keep specific prompting details private.