Shah, Meghan
OpenAI o1 System Card
OpenAI, Jaech, Aaron, Kalai, Adam, Lerer, Adam, Richardson, Adam, El-Kishky, Ahmed, Low, Aiden, Helyar, Alec, Madry, Aleksander, Beutel, Alex, Carney, Alex, Iftimie, Alex, Karpenko, Alex, Passos, Alex Tachard, Neitz, Alexander, Prokofiev, Alexander, Wei, Alexander, Tam, Allison, Bennett, Ally, Kumar, Ananya, Saraiva, Andre, Vallone, Andrea, Duberstein, Andrew, Kondrich, Andrew, Mishchenko, Andrey, Applebaum, Andy, Jiang, Angela, Nair, Ashvin, Zoph, Barret, Ghorbani, Behrooz, Rossen, Ben, Sokolowsky, Benjamin, Barak, Boaz, McGrew, Bob, Minaiev, Borys, Hao, Botao, Baker, Bowen, Houghton, Brandon, McKinzie, Brandon, Eastman, Brydon, Lugaresi, Camillo, Bassin, Cary, Hudson, Cary, Li, Chak Ming, de Bourcy, Charles, Voss, Chelsea, Shen, Chen, Zhang, Chong, Koch, Chris, Orsinger, Chris, Hesse, Christopher, Fischer, Claudia, Chan, Clive, Roberts, Dan, Kappler, Daniel, Levy, Daniel, Selsam, Daniel, Dohan, David, Farhi, David, Mely, David, Robinson, David, Tsipras, Dimitris, Li, Doug, Oprica, Dragos, Freeman, Eben, Zhang, Eddie, Wong, Edmund, Proehl, Elizabeth, Cheung, Enoch, Mitchell, Eric, Wallace, Eric, Ritter, Erik, Mays, Evan, Wang, Fan, Such, Felipe Petroski, Raso, Filippo, Leoni, Florencia, Tsimpourlas, Foivos, Song, Francis, von Lohmann, Fred, Sulit, Freddie, Salmon, Geoff, Parascandolo, Giambattista, Chabot, Gildas, Zhao, Grace, Brockman, Greg, Leclerc, Guillaume, Salman, Hadi, Bao, Haiming, Sheng, Hao, Andrin, Hart, Bagherinezhad, Hessam, Ren, Hongyu, Lightman, Hunter, Chung, Hyung Won, Kivlichan, Ian, O'Connell, Ian, Osband, Ian, Gilaberte, Ignasi Clavera, Akkaya, Ilge, Kostrikov, Ilya, Sutskever, Ilya, Kofman, Irina, Pachocki, Jakub, Lennon, James, Wei, Jason, Harb, Jean, Twore, Jerry, Feng, Jiacheng, Yu, Jiahui, Weng, Jiayi, Tang, Jie, Yu, Jieqi, Candela, Joaquin Quiñonero, Palermo, Joe, Parish, Joel, Heidecke, Johannes, Hallman, John, Rizzo, John, Gordon, Jonathan, Uesato, Jonathan, Ward, Jonathan, Huizinga, Joost, Wang, Julie, Chen, Kai, Xiao, 
Kai, Singhal, Karan, Nguyen, Karina, Cobbe, Karl, Shi, Katy, Wood, Kayla, Rimbach, Kendra, Gu-Lemberg, Keren, Liu, Kevin, Lu, Kevin, Stone, Kevin, Yu, Kevin, Ahmad, Lama, Yang, Lauren, Liu, Leo, Maksin, Leon, Ho, Leyton, Fedus, Liam, Weng, Lilian, Li, Linden, McCallum, Lindsay, Held, Lindsey, Kuhn, Lorenz, Kondraciuk, Lukas, Kaiser, Lukasz, Metz, Luke, Boyd, Madelaine, Trebacz, Maja, Joglekar, Manas, Chen, Mark, Tintor, Marko, Meyer, Mason, Jones, Matt, Kaufer, Matt, Schwarzer, Max, Shah, Meghan, Yatbaz, Mehmet, Guan, Melody Y., Xu, Mengyuan, Yan, Mengyuan, Glaese, Mia, Chen, Mianna, Lampe, Michael, Malek, Michael, Wang, Michele, Fradin, Michelle, McClay, Mike, Pavlov, Mikhail, Wang, Miles, Wang, Mingxuan, Murati, Mira, Bavarian, Mo, Rohaninejad, Mostafa, McAleese, Nat, Chowdhury, Neil, Ryder, Nick, Tezak, Nikolas, Brown, Noam, Nachum, Ofir, Boiko, Oleg, Murk, Oleg, Watkins, Olivia, Chao, Patrick, Ashbourne, Paul, Izmailov, Pavel, Zhokhov, Peter, Dias, Rachel, Arora, Rahul, Lin, Randall, Lopes, Rapha Gontijo, Gaon, Raz, Miyara, Reah, Leike, Reimar, Hwang, Renny, Garg, Rhythm, Brown, Robin, James, Roshan, Shu, Rui, Cheu, Ryan, Greene, Ryan, Jain, Saachi, Altman, Sam, Toizer, Sam, Toyer, Sam, Miserendino, Samuel, Agarwal, Sandhini, Hernandez, Santiago, Baker, Sasha, McKinney, Scott, Yan, Scottie, Zhao, Shengjia, Hu, Shengli, Santurkar, Shibani, Chaudhuri, Shraman Ray, Zhang, Shuyuan, Fu, Siyuan, Papay, Spencer, Lin, Steph, Balaji, Suchir, Sanjeev, Suvansh, Sidor, Szymon, Broda, Tal, Clark, Aidan, Wang, Tao, Gordon, Taylor, Sanders, Ted, Patwardhan, Tejal, Sottiaux, Thibault, Degry, Thomas, Dimson, Thomas, Zheng, Tianhao, Garipov, Timur, Stasi, Tom, Bansal, Trapit, Creech, Trevor, Peterson, Troy, Eloundou, Tyna, Qi, Valerie, Kosaraju, Vineet, Monaco, Vinnie, Pong, Vitchyr, Fomenko, Vlad, Zheng, Weiyi, Zhou, Wenda, McCabe, Wes, Zaremba, Wojciech, Dubois, Yann, Lu, Yinghai, Chen, Yining, Cha, Young, Bai, Yu, He, Yuchen, Zhang, Yuchen, Wang, Yunyun, 
Shao, Zheng, Li, Zhuohan
The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-art performance on certain benchmarks for risks such as generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks. Training models to incorporate a chain of thought before answering has the potential to unlock substantial benefits, while also increasing potential risks that stem from heightened intelligence. Our results underscore the need for building robust alignment methods, extensively stress-testing their efficacy, and maintaining meticulous risk management protocols. This report outlines the safety work carried out for the OpenAI o1 and OpenAI o1-mini models, including safety evaluations, external red teaming, and Preparedness Framework evaluations.
GPT-4o System Card
OpenAI, Hurst, Aaron, Lerer, Adam, Goucher, Adam P., Perelman, Adam, Ramesh, Aditya, Clark, Aidan, Ostrow, AJ, Welihinda, Akila, Hayes, Alan, Radford, Alec, Mądry, Aleksander, Baker-Whitcomb, Alex, Beutel, Alex, Borzunov, Alex, Carney, Alex, Chow, Alex, Kirillov, Alex, Nichol, Alex, Paino, Alex, Renzin, Alex, Passos, Alex Tachard, Kirillov, Alexander, Christakis, Alexi, Conneau, Alexis, Kamali, Ali, Jabri, Allan, Moyer, Allison, Tam, Allison, Crookes, Amadou, Tootoochian, Amin, Tootoonchian, Amin, Kumar, Ananya, Vallone, Andrea, Karpathy, Andrej, Braunstein, Andrew, Cann, Andrew, Codispoti, Andrew, Galu, Andrew, Kondrich, Andrew, Tulloch, Andrew, Mishchenko, Andrey, Baek, Angela, Jiang, Angela, Pelisse, Antoine, Woodford, Antonia, Gosalia, Anuj, Dhar, Arka, Pantuliano, Ashley, Nayak, Avi, Oliver, Avital, Zoph, Barret, Ghorbani, Behrooz, Leimberger, Ben, Rossen, Ben, Sokolowsky, Ben, Wang, Ben, Zweig, Benjamin, Hoover, Beth, Samic, Blake, McGrew, Bob, Spero, Bobby, Giertler, Bogo, Cheng, Bowen, Lightcap, Brad, Walkin, Brandon, Quinn, Brendan, Guarraci, Brian, Hsu, Brian, Kellogg, Bright, Eastman, Brydon, Lugaresi, Camillo, Wainwright, Carroll, Bassin, Cary, Hudson, Cary, Chu, Casey, Nelson, Chad, Li, Chak, Shern, Chan Jun, Conger, Channing, Barette, Charlotte, Voss, Chelsea, Ding, Chen, Lu, Cheng, Zhang, Chong, Beaumont, Chris, Hallacy, Chris, Koch, Chris, Gibson, Christian, Kim, Christina, Choi, Christine, McLeavey, Christine, Hesse, Christopher, Fischer, Claudia, Winter, Clemens, Czarnecki, Coley, Jarvis, Colin, Wei, Colin, Koumouzelis, Constantin, Sherburn, Dane, Kappler, Daniel, Levin, Daniel, Levy, Daniel, Carr, David, Farhi, David, Mely, David, Robinson, David, Sasaki, David, Jin, Denny, Valladares, Dev, Tsipras, Dimitris, Li, Doug, Nguyen, Duc Phong, Findlay, Duncan, Oiwoh, Edede, Wong, Edmund, Asdar, Ehsan, Proehl, Elizabeth, Yang, Elizabeth, Antonow, Eric, Kramer, Eric, Peterson, Eric, Sigler, Eric, Wallace, Eric, Brevdo, Eugene, Mays, Evan, 
Khorasani, Farzad, Such, Felipe Petroski, Raso, Filippo, Zhang, Francis, von Lohmann, Fred, Sulit, Freddie, Goh, Gabriel, Oden, Gene, Salmon, Geoff, Starace, Giulio, Brockman, Greg, Salman, Hadi, Bao, Haiming, Hu, Haitang, Wong, Hannah, Wang, Haoyu, Schmidt, Heather, Whitney, Heather, Jun, Heewoo, Kirchner, Hendrik, Pinto, Henrique Ponde de Oliveira, Ren, Hongyu, Chang, Huiwen, Chung, Hyung Won, Kivlichan, Ian, O'Connell, Ian, Osband, Ian, Silber, Ian, Sohl, Ian, Okuyucu, Ibrahim, Lan, Ikai, Kostrikov, Ilya, Sutskever, Ilya, Kanitscheider, Ingmar, Gulrajani, Ishaan, Coxon, Jacob, Menick, Jacob, Pachocki, Jakub, Aung, James, Betker, James, Crooks, James, Lennon, James, Kiros, Jamie, Leike, Jan, Park, Jane, Kwon, Jason, Phang, Jason, Teplitz, Jason, Wei, Jason, Wolfe, Jason, Chen, Jay, Harris, Jeff, Varavva, Jenia, Lee, Jessica Gan, Shieh, Jessica, Lin, Ji, Yu, Jiahui, Weng, Jiayi, Tang, Jie, Yu, Jieqi, Jang, Joanne, Candela, Joaquin Quinonero, Beutler, Joe, Landers, Joe, Parish, Joel, Heidecke, Johannes, Schulman, John, Lachman, Jonathan, McKay, Jonathan, Uesato, Jonathan, Ward, Jonathan, Kim, Jong Wook, Huizinga, Joost, Sitkin, Jordan, Kraaijeveld, Jos, Gross, Josh, Kaplan, Josh, Snyder, Josh, Achiam, Joshua, Jiao, Joy, Lee, Joyce, Zhuang, Juntang, Harriman, Justyn, Fricke, Kai, Hayashi, Kai, Singhal, Karan, Shi, Katy, Karthik, Kavin, Wood, Kayla, Rimbach, Kendra, Hsu, Kenny, Nguyen, Kenny, Gu-Lemberg, Keren, Button, Kevin, Liu, Kevin, Howe, Kiel, Muthukumar, Krithika, Luther, Kyle, Ahmad, Lama, Kai, Larry, Itow, Lauren, Workman, Lauren, Pathak, Leher, Chen, Leo, Jing, Li, Guy, Lia, Fedus, Liam, Zhou, Liang, Mamitsuka, Lien, Weng, Lilian, McCallum, Lindsay, Held, Lindsey, Ouyang, Long, Feuvrier, Louis, Zhang, Lu, Kondraciuk, Lukas, Kaiser, Lukasz, Hewitt, Luke, Metz, Luke, Doshi, Lyric, Aflak, Mada, Simens, Maddie, Boyd, Madelaine, Thompson, Madeleine, Dukhan, Marat, Chen, Mark, Gray, Mark, Hudnall, Mark, Zhang, Marvin, Aljubeh, Marwan, Litwin, 
Mateusz, Zeng, Matthew, Johnson, Max, Shetty, Maya, Gupta, Mayank, Shah, Meghan, Yatbaz, Mehmet, Yang, Meng Jia, Zhong, Mengchao, Glaese, Mia, Chen, Mianna, Janner, Michael, Lampe, Michael, Petrov, Michael, Wu, Michael, Wang, Michele, Fradin, Michelle, Pokrass, Michelle, Castro, Miguel, de Castro, Miguel Oom Temudo, Pavlov, Mikhail, Brundage, Miles, Wang, Miles, Khan, Minal, Murati, Mira, Bavarian, Mo, Lin, Molly, Yesildal, Murat, Soto, Nacho, Gimelshein, Natalia, Cone, Natalie, Staudacher, Natalie, Summers, Natalie, LaFontaine, Natan, Chowdhury, Neil, Ryder, Nick, Stathas, Nick, Turley, Nick, Tezak, Nik, Felix, Niko, Kudige, Nithanth, Keskar, Nitish, Deutsch, Noah, Bundick, Noel, Puckett, Nora, Nachum, Ofir, Okelola, Ola, Boiko, Oleg, Murk, Oleg, Jaffe, Oliver, Watkins, Olivia, Godement, Olivier, Campbell-Moore, Owen, Chao, Patrick, McMillan, Paul, Belov, Pavel, Su, Peng, Bak, Peter, Bakkum, Peter, Deng, Peter, Dolan, Peter, Hoeschele, Peter, Welinder, Peter, Tillet, Phil, Pronin, Philip, Tillet, Philippe, Dhariwal, Prafulla, Yuan, Qiming, Dias, Rachel, Lim, Rachel, Arora, Rahul, Troll, Rajan, Lin, Randall, Lopes, Rapha Gontijo, Puri, Raul, Miyara, Reah, Leike, Reimar, Gaubert, Renaud, Zamani, Reza, Wang, Ricky, Donnelly, Rob, Honsby, Rob, Smith, Rocky, Sahai, Rohan, Ramchandani, Rohit, Huet, Romain, Carmichael, Rory, Zellers, Rowan, Chen, Roy, Chen, Ruby, Nigmatullin, Ruslan, Cheu, Ryan, Jain, Saachi, Altman, Sam, Schoenholz, Sam, Toizer, Sam, Miserendino, Samuel, Agarwal, Sandhini, Culver, Sara, Ethersmith, Scott, Gray, Scott, Grove, Sean, Metzger, Sean, Hermani, Shamez, Jain, Shantanu, Zhao, Shengjia, Wu, Sherwin, Jomoto, Shino, Wu, Shirong, Xia, Shuaiqi, Phene, Sonia, Papay, Spencer, Narayanan, Srinivas, Coffey, Steve, Lee, Steve, Hall, Stewart, Balaji, Suchir, Broda, Tal, Stramer, Tal, Xu, Tao, Gogineni, Tarun, Christianson, Taya, Sanders, Ted, Patwardhan, Tejal, Cunninghman, Thomas, Degry, Thomas, Dimson, Thomas, Raoux, Thomas, Shadwell, Thomas, 
Zheng, Tianhao, Underwood, Todd, Markov, Todor, Sherbakov, Toki, Rubin, Tom, Stasi, Tom, Kaftan, Tomer, Heywood, Tristan, Peterson, Troy, Walters, Tyce, Eloundou, Tyna, Qi, Valerie, Moeller, Veit, Monaco, Vinnie, Kuo, Vishal, Fomenko, Vlad, Chang, Wayne, Zheng, Weiyi, Zhou, Wenda, Manassra, Wesam, Sheu, Will, Zaremba, Wojciech, Patil, Yash, Qian, Yilei, Kim, Yongjik, Cheng, Youlong, Zhang, Yu, He, Yuchen, Zhang, Yuchen, Jin, Yujia, Dai, Yunxing, Malkov, Yury
GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.
First-Person Fairness in Chatbots
Eloundou, Tyna, Beutel, Alex, Robinson, David G., Gu-Lemberg, Keren, Brakman, Anna-Luisa, Mishkin, Pamela, Shah, Meghan, Heidecke, Johannes, Weng, Lilian, Kalai, Adam Tauman
Chatbots like ChatGPT are used for diverse purposes, ranging from resume writing to entertainment. These real-world applications are different from the institutional uses, such as resume screening or credit scoring, which have been the focus of much of AI research on fairness. Ensuring equitable treatment for all users in these first-person contexts is critical. In this work, we study "first-person fairness," which means fairness toward the chatbot user. This includes providing high-quality responses to all users regardless of their identity or background and avoiding harmful stereotypes. We propose a scalable, privacy-preserving method for evaluating one aspect of first-person fairness across a large, heterogeneous corpus of real-world chatbot interactions. Specifically, we assess potential bias linked to users' names, which can serve as proxies for demographic attributes like gender or race, in chatbot systems such as ChatGPT, which provide mechanisms for storing and using user names. Our method leverages a second language model to privately analyze name-sensitivity in the chatbot's responses. We verify the validity of these annotations through independent human evaluation. Further, we show that post-training interventions, including RL, significantly mitigate harmful stereotypes. Our approach also yields succinct descriptions of response differences across tasks. For instance, in the "writing a story" task, chatbot responses show a tendency to create protagonists whose gender matches the likely gender inferred from the user's name. Moreover, a pattern emerges where users with female-associated names receive responses with friendlier and simpler language slightly more often than users with male-associated names. Finally, we provide the system messages required for external researchers to further investigate ChatGPT's behavior with hypothetical user profiles.