Ouyang, Long
GPT-4o System Card
OpenAI, null, :, null, Hurst, Aaron, Lerer, Adam, Goucher, Adam P., Perelman, Adam, Ramesh, Aditya, Clark, Aidan, Ostrow, AJ, Welihinda, Akila, Hayes, Alan, Radford, Alec, Mądry, Aleksander, Baker-Whitcomb, Alex, Beutel, Alex, Borzunov, Alex, Carney, Alex, Chow, Alex, Kirillov, Alex, Nichol, Alex, Paino, Alex, Renzin, Alex, Passos, Alex Tachard, Kirillov, Alexander, Christakis, Alexi, Conneau, Alexis, Kamali, Ali, Jabri, Allan, Moyer, Allison, Tam, Allison, Crookes, Amadou, Tootoochian, Amin, Tootoonchian, Amin, Kumar, Ananya, Vallone, Andrea, Karpathy, Andrej, Braunstein, Andrew, Cann, Andrew, Codispoti, Andrew, Galu, Andrew, Kondrich, Andrew, Tulloch, Andrew, Mishchenko, Andrey, Baek, Angela, Jiang, Angela, Pelisse, Antoine, Woodford, Antonia, Gosalia, Anuj, Dhar, Arka, Pantuliano, Ashley, Nayak, Avi, Oliver, Avital, Zoph, Barret, Ghorbani, Behrooz, Leimberger, Ben, Rossen, Ben, Sokolowsky, Ben, Wang, Ben, Zweig, Benjamin, Hoover, Beth, Samic, Blake, McGrew, Bob, Spero, Bobby, Giertler, Bogo, Cheng, Bowen, Lightcap, Brad, Walkin, Brandon, Quinn, Brendan, Guarraci, Brian, Hsu, Brian, Kellogg, Bright, Eastman, Brydon, Lugaresi, Camillo, Wainwright, Carroll, Bassin, Cary, Hudson, Cary, Chu, Casey, Nelson, Chad, Li, Chak, Shern, Chan Jun, Conger, Channing, Barette, Charlotte, Voss, Chelsea, Ding, Chen, Lu, Cheng, Zhang, Chong, Beaumont, Chris, Hallacy, Chris, Koch, Chris, Gibson, Christian, Kim, Christina, Choi, Christine, McLeavey, Christine, Hesse, Christopher, Fischer, Claudia, Winter, Clemens, Czarnecki, Coley, Jarvis, Colin, Wei, Colin, Koumouzelis, Constantin, Sherburn, Dane, Kappler, Daniel, Levin, Daniel, Levy, Daniel, Carr, David, Farhi, David, Mely, David, Robinson, David, Sasaki, David, Jin, Denny, Valladares, Dev, Tsipras, Dimitris, Li, Doug, Nguyen, Duc Phong, Findlay, Duncan, Oiwoh, Edede, Wong, Edmund, Asdar, Ehsan, Proehl, Elizabeth, Yang, Elizabeth, Antonow, Eric, Kramer, Eric, Peterson, Eric, Sigler, Eric, Wallace, Eric, Brevdo, Eugene, Mays, Evan, Khorasani, Farzad, Such, Felipe Petroski, Raso, Filippo, Zhang, Francis, von Lohmann, Fred, Sulit, Freddie, Goh, Gabriel, Oden, Gene, Salmon, Geoff, Starace, Giulio, Brockman, Greg, Salman, Hadi, Bao, Haiming, Hu, Haitang, Wong, Hannah, Wang, Haoyu, Schmidt, Heather, Whitney, Heather, Jun, Heewoo, Kirchner, Hendrik, Pinto, Henrique Ponde de Oliveira, Ren, Hongyu, Chang, Huiwen, Chung, Hyung Won, Kivlichan, Ian, O'Connell, Ian, O'Connell, Ian, Osband, Ian, Silber, Ian, Sohl, Ian, Okuyucu, Ibrahim, Lan, Ikai, Kostrikov, Ilya, Sutskever, Ilya, Kanitscheider, Ingmar, Gulrajani, Ishaan, Coxon, Jacob, Menick, Jacob, Pachocki, Jakub, Aung, James, Betker, James, Crooks, James, Lennon, James, Kiros, Jamie, Leike, Jan, Park, Jane, Kwon, Jason, Phang, Jason, Teplitz, Jason, Wei, Jason, Wolfe, Jason, Chen, Jay, Harris, Jeff, Varavva, Jenia, Lee, Jessica Gan, Shieh, Jessica, Lin, Ji, Yu, Jiahui, Weng, Jiayi, Tang, Jie, Yu, Jieqi, Jang, Joanne, Candela, Joaquin Quinonero, Beutler, Joe, Landers, Joe, Parish, Joel, Heidecke, Johannes, Schulman, John, Lachman, Jonathan, McKay, Jonathan, Uesato, Jonathan, Ward, Jonathan, Kim, Jong Wook, Huizinga, Joost, Sitkin, Jordan, Kraaijeveld, Jos, Gross, Josh, Kaplan, Josh, Snyder, Josh, Achiam, Joshua, Jiao, Joy, Lee, Joyce, Zhuang, Juntang, Harriman, Justyn, Fricke, Kai, Hayashi, Kai, Singhal, Karan, Shi, Katy, Karthik, Kavin, Wood, Kayla, Rimbach, Kendra, Hsu, Kenny, Nguyen, Kenny, Gu-Lemberg, Keren, Button, Kevin, Liu, Kevin, Howe, Kiel, Muthukumar, Krithika, Luther, Kyle, Ahmad, Lama, Kai, Larry, Itow, Lauren, Workman, Lauren, Pathak, Leher, Chen, Leo, Jing, Li, Guy, Lia, Fedus, Liam, Zhou, Liang, Mamitsuka, Lien, Weng, Lilian, McCallum, Lindsay, Held, Lindsey, Ouyang, Long, Feuvrier, Louis, Zhang, Lu, Kondraciuk, Lukas, Kaiser, Lukasz, Hewitt, Luke, Metz, Luke, Doshi, Lyric, Aflak, Mada, Simens, Maddie, Boyd, Madelaine, Thompson, Madeleine, Dukhan, Marat, Chen, Mark, Gray, Mark, Hudnall, Mark, Zhang, Marvin, Aljubeh, Marwan, Litwin, Mateusz, Zeng, Matthew, Johnson, Max, Shetty, Maya, Gupta, Mayank, Shah, Meghan, Yatbaz, Mehmet, Yang, Meng Jia, Zhong, Mengchao, Glaese, Mia, Chen, Mianna, Janner, Michael, Lampe, Michael, Petrov, Michael, Wu, Michael, Wang, Michele, Fradin, Michelle, Pokrass, Michelle, Castro, Miguel, de Castro, Miguel Oom Temudo, Pavlov, Mikhail, Brundage, Miles, Wang, Miles, Khan, Minal, Murati, Mira, Bavarian, Mo, Lin, Molly, Yesildal, Murat, Soto, Nacho, Gimelshein, Natalia, Cone, Natalie, Staudacher, Natalie, Summers, Natalie, LaFontaine, Natan, Chowdhury, Neil, Ryder, Nick, Stathas, Nick, Turley, Nick, Tezak, Nik, Felix, Niko, Kudige, Nithanth, Keskar, Nitish, Deutsch, Noah, Bundick, Noel, Puckett, Nora, Nachum, Ofir, Okelola, Ola, Boiko, Oleg, Murk, Oleg, Jaffe, Oliver, Watkins, Olivia, Godement, Olivier, Campbell-Moore, Owen, Chao, Patrick, McMillan, Paul, Belov, Pavel, Su, Peng, Bak, Peter, Bakkum, Peter, Deng, Peter, Dolan, Peter, Hoeschele, Peter, Welinder, Peter, Tillet, Phil, Pronin, Philip, Tillet, Philippe, Dhariwal, Prafulla, Yuan, Qiming, Dias, Rachel, Lim, Rachel, Arora, Rahul, Troll, Rajan, Lin, Randall, Lopes, Rapha Gontijo, Puri, Raul, Miyara, Reah, Leike, Reimar, Gaubert, Renaud, Zamani, Reza, Wang, Ricky, Donnelly, Rob, Honsby, Rob, Smith, Rocky, Sahai, Rohan, Ramchandani, Rohit, Huet, Romain, Carmichael, Rory, Zellers, Rowan, Chen, Roy, Chen, Ruby, Nigmatullin, Ruslan, Cheu, Ryan, Jain, Saachi, Altman, Sam, Schoenholz, Sam, Toizer, Sam, Miserendino, Samuel, Agarwal, Sandhini, Culver, Sara, Ethersmith, Scott, Gray, Scott, Grove, Sean, Metzger, Sean, Hermani, Shamez, Jain, Shantanu, Zhao, Shengjia, Wu, Sherwin, Jomoto, Shino, Wu, Shirong, Shuaiqi, null, Xia, null, Phene, Sonia, Papay, Spencer, Narayanan, Srinivas, Coffey, Steve, Lee, Steve, Hall, Stewart, Balaji, Suchir, Broda, Tal, Stramer, Tal, Xu, Tao, Gogineni, Tarun, Christianson, Taya, Sanders, Ted, Patwardhan, Tejal, Cunninghman, Thomas, Degry, Thomas, Dimson, Thomas, Raoux, Thomas, Shadwell, Thomas, Zheng, Tianhao, Underwood, Todd, Markov, Todor, Sherbakov, Toki, Rubin, Tom, Stasi, Tom, Kaftan, Tomer, Heywood, Tristan, Peterson, Troy, Walters, Tyce, Eloundou, Tyna, Qi, Valerie, Moeller, Veit, Monaco, Vinnie, Kuo, Vishal, Fomenko, Vlad, Chang, Wayne, Zheng, Weiyi, Zhou, Wenda, Manassra, Wesam, Sheu, Will, Zaremba, Wojciech, Patil, Yash, Qian, Yilei, Kim, Yongjik, Cheng, Youlong, Zhang, Yu, He, Yuchen, Zhang, Yuchen, Jin, Yujia, Dai, Yunxing, Malkov, Yury
GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50\% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.
GPT-4 Technical Report
OpenAI, null, :, null, Achiam, Josh, Adler, Steven, Agarwal, Sandhini, Ahmad, Lama, Akkaya, Ilge, Aleman, Florencia Leoni, Almeida, Diogo, Altenschmidt, Janko, Altman, Sam, Anadkat, Shyamal, Avila, Red, Babuschkin, Igor, Balaji, Suchir, Balcom, Valerie, Baltescu, Paul, Bao, Haiming, Bavarian, Mo, Belgum, Jeff, Bello, Irwan, Berdine, Jake, Bernadett-Shapiro, Gabriel, Berner, Christopher, Bogdonoff, Lenny, Boiko, Oleg, Boyd, Madelaine, Brakman, Anna-Luisa, Brockman, Greg, Brooks, Tim, Brundage, Miles, Button, Kevin, Cai, Trevor, Campbell, Rosie, Cann, Andrew, Carey, Brittany, Carlson, Chelsea, Carmichael, Rory, Chan, Brooke, Chang, Che, Chantzis, Fotis, Chen, Derek, Chen, Sully, Chen, Ruby, Chen, Jason, Chen, Mark, Chess, Ben, Cho, Chester, Chu, Casey, Chung, Hyung Won, Cummings, Dave, Currier, Jeremiah, Dai, Yunxing, Decareaux, Cory, Degry, Thomas, Deutsch, Noah, Deville, Damien, Dhar, Arka, Dohan, David, Dowling, Steve, Dunning, Sheila, Ecoffet, Adrien, Eleti, Atty, Eloundou, Tyna, Farhi, David, Fedus, Liam, Felix, Niko, Fishman, Simón Posada, Forte, Juston, Fulford, Isabella, Gao, Leo, Georges, Elie, Gibson, Christian, Goel, Vik, Gogineni, Tarun, Goh, Gabriel, Gontijo-Lopes, Rapha, Gordon, Jonathan, Grafstein, Morgan, Gray, Scott, Greene, Ryan, Gross, Joshua, Gu, Shixiang Shane, Guo, Yufei, Hallacy, Chris, Han, Jesse, Harris, Jeff, He, Yuchen, Heaton, Mike, Heidecke, Johannes, Hesse, Chris, Hickey, Alan, Hickey, Wade, Hoeschele, Peter, Houghton, Brandon, Hsu, Kenny, Hu, Shengli, Hu, Xin, Huizinga, Joost, Jain, Shantanu, Jain, Shawn, Jang, Joanne, Jiang, Angela, Jiang, Roger, Jin, Haozhun, Jin, Denny, Jomoto, Shino, Jonn, Billie, Jun, Heewoo, Kaftan, Tomer, Kaiser, Łukasz, Kamali, Ali, Kanitscheider, Ingmar, Keskar, Nitish Shirish, Khan, Tabarak, Kilpatrick, Logan, Kim, Jong Wook, Kim, Christina, Kim, Yongjik, Kirchner, Hendrik, Kiros, Jamie, Knight, Matt, Kokotajlo, Daniel, Kondraciuk, Łukasz, Kondrich, Andrew, Konstantinidis, Aris, Kosic, Kyle, Krueger, Gretchen, Kuo, Vishal, Lampe, Michael, Lan, Ikai, Lee, Teddy, Leike, Jan, Leung, Jade, Levy, Daniel, Li, Chak Ming, Lim, Rachel, Lin, Molly, Lin, Stephanie, Litwin, Mateusz, Lopez, Theresa, Lowe, Ryan, Lue, Patricia, Makanju, Anna, Malfacini, Kim, Manning, Sam, Markov, Todor, Markovski, Yaniv, Martin, Bianca, Mayer, Katie, Mayne, Andrew, McGrew, Bob, McKinney, Scott Mayer, McLeavey, Christine, McMillan, Paul, McNeil, Jake, Medina, David, Mehta, Aalok, Menick, Jacob, Metz, Luke, Mishchenko, Andrey, Mishkin, Pamela, Monaco, Vinnie, Morikawa, Evan, Mossing, Daniel, Mu, Tong, Murati, Mira, Murk, Oleg, Mély, David, Nair, Ashvin, Nakano, Reiichiro, Nayak, Rajeev, Neelakantan, Arvind, Ngo, Richard, Noh, Hyeonwoo, Ouyang, Long, O'Keefe, Cullen, Pachocki, Jakub, Paino, Alex, Palermo, Joe, Pantuliano, Ashley, Parascandolo, Giambattista, Parish, Joel, Parparita, Emy, Passos, Alex, Pavlov, Mikhail, Peng, Andrew, Perelman, Adam, Peres, Filipe de Avila Belbute, Petrov, Michael, Pinto, Henrique Ponde de Oliveira, Michael, null, Pokorny, null, Pokrass, Michelle, Pong, Vitchyr, Powell, Tolly, Power, Alethea, Power, Boris, Proehl, Elizabeth, Puri, Raul, Radford, Alec, Rae, Jack, Ramesh, Aditya, Raymond, Cameron, Real, Francis, Rimbach, Kendra, Ross, Carl, Rotsted, Bob, Roussez, Henri, Ryder, Nick, Saltarelli, Mario, Sanders, Ted, Santurkar, Shibani, Sastry, Girish, Schmidt, Heather, Schnurr, David, Schulman, John, Selsam, Daniel, Sheppard, Kyla, Sherbakov, Toki, Shieh, Jessica, Shoker, Sarah, Shyam, Pranav, Sidor, Szymon, Sigler, Eric, Simens, Maddie, Sitkin, Jordan, Slama, Katarina, Sohl, Ian, Sokolowsky, Benjamin, Song, Yang, Staudacher, Natalie, Such, Felipe Petroski, Summers, Natalie, Sutskever, Ilya, Tang, Jie, Tezak, Nikolas, Thompson, Madeleine, Tillet, Phil, Tootoonchian, Amin, Tseng, Elizabeth, Tuggle, Preston, Turley, Nick, Tworek, Jerry, Uribe, Juan Felipe Cerón, Vallone, Andrea, Vijayvergiya, Arun, Voss, Chelsea, Wainwright, Carroll, Wang, Justin Jay, Wang, Alvin, Wang, Ben, Ward, Jonathan, Wei, Jason, Weinmann, CJ, Welihinda, Akila, Welinder, Peter, Weng, Jiayi, Weng, Lilian, Wiethoff, Matt, Willner, Dave, Winter, Clemens, Wolrich, Samuel, Wong, Hannah, Workman, Lauren, Wu, Sherwin, Wu, Jeff, Wu, Michael, Xiao, Kai, Xu, Tao, Yoo, Sarah, Yu, Kevin, Yuan, Qiming, Zaremba, Wojciech, Zellers, Rowan, Zhang, Chong, Zhang, Marvin, Zhao, Shengjia, Zheng, Tianhao, Zhuang, Juntang, Zhuk, William, Zoph, Barret
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
WebGPT: Browser-assisted question-answering with human feedback
Nakano, Reiichiro, Hilton, Jacob, Balaji, Suchir, Wu, Jeff, Ouyang, Long, Kim, Christina, Hesse, Christopher, Jain, Shantanu, Kosaraju, Vineet, Saunders, William, Jiang, Xu, Cobbe, Karl, Eloundou, Tyna, Krueger, Gretchen, Button, Kevin, Knight, Matthew, Chess, Benjamin, Schulman, John
We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
Recursively Summarizing Books with Human Feedback
Wu, Jeff, Ouyang, Long, Ziegler, Daniel M., Stiennon, Nissan, Lowe, Ryan, Leike, Jan, Christiano, Paul
A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. We present progress on this problem on the task of abstractive summarization of entire fiction novels. Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task. We collect a large volume of demonstrations and comparisons from human labelers, and fine-tune GPT-3 using behavioral cloning and reward modeling to do summarization recursively. At inference time, the model first summarizes small sections of the book and then recursively summarizes these summaries to produce a summary of the entire book. Our human labelers are able to supervise and evaluate the models quickly, despite not having read the entire books themselves. Our resulting model generates sensible summaries of entire books, even matching the quality of human-written summaries in a few cases ($\sim5\%$ of books). We achieve state-of-the-art results on the recent BookSum dataset for book-length summarization. A zero-shot question-answering model using these summaries achieves state-of-the-art results on the challenging NarrativeQA benchmark for answering questions about books and movie scripts. We release datasets of samples from our model.
Learning to summarize from human feedback
Stiennon, Nisan, Ouyang, Long, Wu, Jeff, Ziegler, Daniel M., Lowe, Ryan, Voss, Chelsea, Radford, Alec, Amodei, Dario, Christiano, Paul
As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated using ROUGE, but both of these metrics are rough proxies for what we really care about---summary quality. In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. We apply our method to a version of the TL;DR dataset of Reddit posts and find that our models significantly outperform both human reference summaries and much larger models fine-tuned with supervised learning alone. Our models also transfer to CNN/DM news articles, producing summaries nearly as good as the human reference without any news-specific fine-tuning. We conduct extensive analyses to understand our human feedback dataset and fine-tuned models. We establish that our reward model generalizes to new datasets, and that optimizing our reward model results in better summaries than optimizing ROUGE according to humans. We hope the evidence from our paper motivates machine learning researchers to pay closer attention to how their training loss affects the model behavior they actually want.
Bayesian Inference of Regular Expressions from Human-Generated Example Strings
Ouyang, Long
In programming by example, users "write" programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider an unsolved problem in this domain: learning regular expressions (regexes) from positive and negative example strings. This problem is challenging, as (1) user-generated examples may not be informative enough to sufficiently constrain the hypothesis space, and (2) even if user-generated examples are in principle informative, there is still a massive search space to examine. We frame regex induction as the problem of inferring a probabilistic regular grammar and propose an efficient inference approach that uses a novel stochastic process recognition model. This model incrementally "grows" a grammar using positive examples as a scaffold. We show that this approach is competitive with human ability to learn regexes from examples.