Vincent, Damien
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Gemini Team, null, Georgiev, Petko, Lei, Ving Ian, Burnell, Ryan, Bai, Libin, Gulati, Anmol, Tanzer, Garrett, Vincent, Damien, Pan, Zhufeng, Wang, Shibo, Mariooryad, Soroosh, Ding, Yifan, Geng, Xinyang, Alcober, Fred, Frostig, Roy, Omernick, Mark, Walker, Lexi, Paduraru, Cosmin, Sorokin, Christina, Tacchetti, Andrea, Gaffney, Colin, Daruki, Samira, Sercinoglu, Olcan, Gleicher, Zach, Love, Juliette, Voigtlaender, Paul, Jain, Rohan, Surita, Gabriela, Mohamed, Kareem, Blevins, Rory, Ahn, Junwhan, Zhu, Tao, Kawintiranon, Kornraphop, Firat, Orhan, Gu, Yiming, Zhang, Yujing, Rahtz, Matthew, Faruqui, Manaal, Clay, Natalie, Gilmer, Justin, Co-Reyes, JD, Penchev, Ivo, Zhu, Rui, Morioka, Nobuyuki, Hui, Kevin, Haridasan, Krishna, Campos, Victor, Mahdieh, Mahdis, Guo, Mandy, Hassan, Samer, Kilgour, Kevin, Vezer, Arpi, Cheng, Heng-Tze, de Liedekerke, Raoul, Goyal, Siddharth, Barham, Paul, Strouse, DJ, Noury, Seb, Adler, Jonas, Sundararajan, Mukund, Vikram, Sharad, Lepikhin, Dmitry, Paganini, Michela, Garcia, Xavier, Yang, Fan, Valter, Dasha, Trebacz, Maja, Vodrahalli, Kiran, Asawaroengchai, Chulayuth, Ring, Roman, Kalb, Norbert, Soares, Livio Baldini, Brahma, Siddhartha, Steiner, David, Yu, Tianhe, Mentzer, Fabian, He, Antoine, Gonzalez, Lucas, Xu, Bibo, Kaufman, Raphael Lopez, Shafey, Laurent El, Oh, Junhyuk, Hennigan, Tom, Driessche, George van den, Odoom, Seth, Lucic, Mario, Roelofs, Becca, Lall, Sid, Marathe, Amit, Chan, Betty, Ontanon, Santiago, He, Luheng, Teplyashin, Denis, Lai, Jonathan, Crone, Phil, Damoc, Bogdan, Ho, Lewis, Riedel, Sebastian, Lenc, Karel, Yeh, Chih-Kuan, Chowdhery, Aakanksha, Xu, Yang, Kazemi, Mehran, Amid, Ehsan, Petrushkina, Anastasia, Swersky, Kevin, Khodaei, Ali, Chen, Gowoon, Larkin, Chris, Pinto, Mario, Yan, Geng, Badia, Adria Puigdomenech, Patil, Piyush, Hansen, Steven, Orr, Dave, Arnold, Sebastien M. R., Grimstad, Jordan, Dai, Andrew, Douglas, Sholto, Sinha, Rishika, Yadav, Vikas, Chen, Xi, Gribovskaya, Elena, Austin, Jacob, Zhao, Jeffrey, Patel, Kaushal, Komarek, Paul, Austin, Sophia, Borgeaud, Sebastian, Friso, Linda, Goyal, Abhimanyu, Caine, Ben, Cao, Kris, Chung, Da-Woon, Lamm, Matthew, Barth-Maron, Gabe, Kagohara, Thais, Olszewska, Kate, Chen, Mia, Shivakumar, Kaushik, Agarwal, Rishabh, Godhia, Harshal, Rajwar, Ravi, Snaider, Javier, Dotiwalla, Xerxes, Liu, Yuan, Barua, Aditya, Ungureanu, Victor, Zhang, Yuan, Batsaikhan, Bat-Orgil, Wirth, Mateo, Qin, James, Danihelka, Ivo, Doshi, Tulsee, Chadwick, Martin, Chen, Jilin, Jain, Sanil, Le, Quoc, Kar, Arjun, Gurumurthy, Madhu, Li, Cheng, Sang, Ruoxin, Liu, Fangyu, Lamprou, Lampros, Munoz, Rich, Lintz, Nathan, Mehta, Harsh, Howard, Heidi, Reynolds, Malcolm, Aroyo, Lora, Wang, Quan, Blanco, Lorenzo, Cassirer, Albin, Griffith, Jordan, Das, Dipanjan, Lee, Stephan, Sygnowski, Jakub, Fisher, Zach, Besley, James, Powell, Richard, Ahmed, Zafarali, Paulus, Dominik, Reitter, David, Borsos, Zalan, Joshi, Rishabh, Pope, Aedan, Hand, Steven, Selo, Vittorio, Jain, Vihan, Sethi, Nikhil, Goel, Megha, Makino, Takaki, May, Rhys, Yang, Zhen, Schalkwyk, Johan, Butterfield, Christina, Hauth, Anja, Goldin, Alex, Hawkins, Will, Senter, Evan, Brin, Sergey, Woodman, Oliver, Ritter, Marvin, Noland, Eric, Giang, Minh, Bolina, Vijay, Lee, Lisa, Blyth, Tim, Mackinnon, Ian, Reid, Machel, Sarvana, Obaid, Silver, David, Chen, Alexander, Wang, Lily, Maggiore, Loren, Chang, Oscar, Attaluri, Nithya, Thornton, Gregory, Chiu, Chung-Cheng, Bunyan, Oskar, Levine, Nir, Chung, Timothy, Eltyshev, Evgenii, Si, Xiance, Lillicrap, Timothy, Brady, Demetra, Aggarwal, Vaibhav, Wu, Boxi, Xu, Yuanzhong, McIlroy, Ross, Badola, Kartikeya, Sandhu, Paramjit, Moreira, Erica, Stokowiec, Wojciech, Hemsley, Ross, Li, Dong, Tudor, Alex, Shyam, Pranav, Rahimtoroghi, Elahe, Haykal, Salem, Sprechmann, Pablo, Zhou, Xiang, Mincu, Diana, Li, Yujia, Addanki, Ravi, Krishna, Kalpesh, Wu, Xiao, Frechette, Alexandre, Eyal, Matan, Dafoe, Allan, Lacey, Dave, Whang, Jay, Avrahami, Thi, Zhang, Ye, Taropa, Emanuel, Lin, Hanzhao, Toyama, Daniel, Rutherford, Eliza, Sano, Motoki, Choe, HyunJeong, Tomala, Alex, Safranek-Shrader, Chalence, Kassner, Nora, Pajarskas, Mantas, Harvey, Matt, Sechrist, Sean, Fortunato, Meire, Lyu, Christina, Elsayed, Gamaleldin, Kuang, Chenkai, Lottes, James, Chu, Eric, Jia, Chao, Chen, Chih-Wei, Humphreys, Peter, Baumli, Kate, Tao, Connie, Samuel, Rajkumar, Santos, Cicero Nogueira dos, Andreassen, Anders, Rakiฤeviฤ, Nemanja, Grewe, Dominik, Kumar, Aviral, Winkler, Stephanie, Caton, Jonathan, Brock, Andrew, Dalmia, Sid, Sheahan, Hannah, Barr, Iain, Miao, Yingjie, Natsev, Paul, Devlin, Jacob, Behbahani, Feryal, Prost, Flavien, Sun, Yanhua, Myaskovsky, Artiom, Pillai, Thanumalayan Sankaranarayana, Hurt, Dan, Lazaridou, Angeliki, Xiong, Xi, Zheng, Ce, Pardo, Fabio, Li, Xiaowei, Horgan, Dan, Stanton, Joe, Ambar, Moran, Xia, Fei, Lince, Alejandro, Wang, Mingqiu, Mustafa, Basil, Webson, Albert, Lee, Hyo, Anil, Rohan, Wicke, Martin, Dozat, Timothy, Sinha, Abhishek, Piqueras, Enrique, Dabir, Elahe, Upadhyay, Shyam, Boral, Anudhyan, Hendricks, Lisa Anne, Fry, Corey, Djolonga, Josip, Su, Yi, Walker, Jake, Labanowski, Jane, Huang, Ronny, Misra, Vedant, Chen, Jeremy, Skerry-Ryan, RJ, Singh, Avi, Rijhwani, Shruti, Yu, Dian, Castro-Ros, Alex, Changpinyo, Beer, Datta, Romina, Bagri, Sumit, Hrafnkelsson, Arnar Mar, Maggioni, Marcello, Zheng, Daniel, Sulsky, Yury, Hou, Shaobo, Paine, Tom Le, Yang, Antoine, Riesa, Jason, Rogozinska, Dominika, Marcus, Dror, Badawy, Dalia El, Zhang, Qiao, Wang, Luyu, Miller, Helen, Greer, Jeremy, Sjos, Lars Lowe, Nova, Azade, Zen, Heiga, Chaabouni, Rahma, Rosca, Mihaela, Jiang, Jiepu, Chen, Charlie, Liu, Ruibo, Sainath, Tara, Krikun, Maxim, Polozov, Alex, Lespiau, Jean-Baptiste, Newlan, Josh, Cankara, Zeyncep, Kwak, Soo, Xu, Yunhan, Chen, Phil, Coenen, Andy, Meyer, Clemens, Tsihlas, Katerina, Ma, Ada, Gottweis, Juraj, Xing, Jinwei, Gu, Chenjie, Miao, Jin, Frank, Christian, Cankara, Zeynep, Ganapathy, Sanjay, Dasgupta, Ishita, Hughes-Fitt, Steph, Chen, Heng, Reid, David, Rong, Keran, Fan, Hongmin, van Amersfoort, Joost, Zhuang, Vincent, Cohen, Aaron, Gu, Shixiang Shane, Mohananey, Anhad, Ilic, Anastasija, Tobin, Taylor, Wieting, John, Bortsova, Anna, Thacker, Phoebe, Wang, Emma, Caveness, Emily, Chiu, Justin, Sezener, Eren, Kaskasoli, Alex, Baker, Steven, Millican, Katie, Elhawaty, Mohamed, Aisopos, Kostas, Lebsack, Carl, Byrd, Nathan, Dai, Hanjun, Jia, Wenhao, Wiethoff, Matthew, Davoodi, Elnaz, Weston, Albert, Yagati, Lakshman, Ahuja, Arun, Gao, Isabel, Pundak, Golan, Zhang, Susan, Azzam, Michael, Sim, Khe Chai, Caelles, Sergi, Keeling, James, Sharma, Abhanshu, Swing, Andy, Li, YaGuang, Liu, Chenxi, Bostock, Carrie Grimes, Bansal, Yamini, Nado, Zachary, Anand, Ankesh, Lipschultz, Josh, Karmarkar, Abhijit, Proleev, Lev, Ittycheriah, Abe, Yeganeh, Soheil Hassas, Polovets, George, Faust, Aleksandra, Sun, Jiao, Rrustemi, Alban, Li, Pen, Shivanna, Rakesh, Liu, Jeremiah, Welty, Chris, Lebron, Federico, Baddepudi, Anirudh, Krause, Sebastian, Parisotto, Emilio, Soricut, Radu, Xu, Zheng, Bloxwich, Dawn, Johnson, Melvin, Neyshabur, Behnam, Mao-Jones, Justin, Wang, Renshen, Ramasesh, Vinay, Abbas, Zaheer, Guez, Arthur, Segal, Constant, Nguyen, Duc Dung, Svensson, James, Hou, Le, York, Sarah, Milan, Kieran, Bridgers, Sophie, Gworek, Wiktor, Tagliasacchi, Marco, Lee-Thorp, James, Chang, Michael, Guseynov, Alexey, Hartman, Ale Jakse, Kwong, Michael, Zhao, Ruizhe, Kashem, Sheleem, Cole, Elizabeth, Miech, Antoine, Tanburn, Richard, Phuong, Mary, Pavetic, Filip, Cevey, Sebastien, Comanescu, Ramona, Ives, Richard, Yang, Sherry, Du, Cosmo, Li, Bo, Zhang, Zizhao, Iinuma, Mariko, Hu, Clara Huiyi, Roy, Aurko, Bijwadia, Shaan, Zhu, Zhenkai, Martins, Danilo, Saputro, Rachel, Gergely, Anita, Zheng, Steven, Jia, Dawei, Antonoglou, Ioannis, Sadovsky, Adam, Gu, Shane, Bi, Yingying, Andreev, Alek, Samangooei, Sina, Khan, Mina, Kocisky, Tomas, Filos, Angelos, Kumar, Chintu, Bishop, Colton, Yu, Adams, Hodkinson, Sarah, Mittal, Sid, Shah, Premal, Moufarek, Alexandre, Cheng, Yong, Bloniarz, Adam, Lee, Jaehoon, Pejman, Pedram, Michel, Paul, Spencer, Stephen, Feinberg, Vladimir, Xiong, Xuehan, Savinov, Nikolay, Smith, Charlotte, Shakeri, Siamak, Tran, Dustin, Chesus, Mary, Bohnet, Bernd, Tucker, George, von Glehn, Tamara, Muir, Carrie, Mao, Yiran, Kazawa, Hideto, Slone, Ambrose, Soparkar, Kedar, Shrivastava, Disha, Cobon-Kerr, James, Sharman, Michael, Pavagadhi, Jay, Araya, Carlos, Misiunas, Karolis, Ghelani, Nimesh, Laskin, Michael, Barker, David, Li, Qiujia, Briukhov, Anton, Houlsby, Neil, Glaese, Mia, Lakshminarayanan, Balaji, Schucher, Nathan, Tang, Yunhao, Collins, Eli, Lim, Hyeontaek, Feng, Fangxiaoyu, Recasens, Adria, Lai, Guangda, Magni, Alberto, De Cao, Nicola, Siddhant, Aditya, Ashwood, Zoe, Orbay, Jordi, Dehghani, Mostafa, Brennan, Jenny, He, Yifan, Xu, Kelvin, Gao, Yang, Saroufim, Carl, Molloy, James, Wu, Xinyi, Arnold, Seb, Chang, Solomon, Schrittwieser, Julian, Buchatskaya, Elena, Radpour, Soroush, Polacek, Martin, Giordano, Skye, Bapna, Ankur, Tokumine, Simon, Hellendoorn, Vincent, Sottiaux, Thibault, Cogan, Sarah, Severyn, Aliaksei, Saleh, Mohammad, Thakoor, Shantanu, Shefey, Laurent, Qiao, Siyuan, Gaba, Meenu, Chang, Shuo-yiin, Swanson, Craig, Zhang, Biao, Lee, Benjamin, Rubenstein, Paul Kishan, Song, Gan, Kwiatkowski, Tom, Koop, Anna, Kannan, Ajay, Kao, David, Schuh, Parker, Stjerngren, Axel, Ghiasi, Golnaz, Gibson, Gena, Vilnis, Luke, Yuan, Ye, Ferreira, Felipe Tiengo, Kamath, Aishwarya, Klimenko, Ted, Franko, Ken, Xiao, Kefan, Bhattacharya, Indro, Patel, Miteyan, Wang, Rui, Morris, Alex, Strudel, Robin, Sharma, Vivek, Choy, Peter, Hashemi, Sayed Hadi, Landon, Jessica, Finkelstein, Mara, Jhakra, Priya, Frye, Justin, Barnes, Megan, Mauger, Matthew, Daun, Dennis, Baatarsukh, Khuslen, Tung, Matthew, Farhan, Wael, Michalewski, Henryk, Viola, Fabio, Quitry, Felix de Chaumont, Lan, Charline Le, Hudson, Tom, Wang, Qingze, Fischer, Felix, Zheng, Ivy, White, Elspeth, Dragan, Anca, Alayrac, Jean-baptiste, Ni, Eric, Pritzel, Alexander, Iwanicki, Adam, Isard, Michael, Bulanova, Anna, Zilka, Lukas, Dyer, Ethan, Sachan, Devendra, Srinivasan, Srivatsan, Muckenhirn, Hannah, Cai, Honglong, Mandhane, Amol, Tariq, Mukarram, Rae, Jack W., Wang, Gary, Ayoub, Kareem, FitzGerald, Nicholas, Zhao, Yao, Han, Woohyun, Alberti, Chris, Garrette, Dan, Krishnakumar, Kashyap, Gimenez, Mai, Levskaya, Anselm, Sohn, Daniel, Matak, Josip, Iturrate, Inaki, Chang, Michael B., Xiang, Jackie, Cao, Yuan, Ranka, Nishant, Brown, Geoff, Hutter, Adrian, Mirrokni, Vahab, Chen, Nanxin, Yao, Kaisheng, Egyed, Zoltan, Galilee, Francois, Liechty, Tyler, Kallakuri, Praveen, Palmer, Evan, Ghemawat, Sanjay, Liu, Jasmine, Tao, David, Thornton, Chloe, Green, Tim, Jasarevic, Mimi, Lin, Sharon, Cotruta, Victor, Tan, Yi-Xuan, Fiedel, Noah, Yu, Hongkun, Chi, Ed, Neitz, Alexander, Heitkaemper, Jens, Sinha, Anu, Zhou, Denny, Sun, Yi, Kaed, Charbel, Hulse, Brice, Mishra, Swaroop, Georgaki, Maria, Kudugunta, Sneha, Farabet, Clement, Shafran, Izhak, Vlasic, Daniel, Tsitsulin, Anton, Ananthanarayanan, Rajagopal, Carin, Alen, Su, Guolong, Sun, Pei, V, Shashank, Carvajal, Gabriel, Broder, Josef, Comsa, Iulia, Repina, Alena, Wong, William, Chen, Warren Weilun, Hawkins, Peter, Filonov, Egor, Loher, Lucia, Hirnschall, Christoph, Wang, Weiyi, Ye, Jingchen, Burns, Andrea, Cate, Hardie, Wright, Diana Gage, Piccinini, Federico, Zhang, Lei, Lin, Chu-Cheng, Gog, Ionel, Kulizhskaya, Yana, Sreevatsa, Ashwin, Song, Shuang, Cobo, Luis C., Iyer, Anand, Tekur, Chetan, Garrido, Guillermo, Xiao, Zhuyun, Kemp, Rupert, Zheng, Huaixiu Steven, Li, Hui, Agarwal, Ananth, Ngani, Christel, Goshvadi, Kati, Santamaria-Fernandez, Rebeca, Fica, Wojciech, Chen, Xinyun, Gorgolewski, Chris, Sun, Sean, Garg, Roopal, Ye, Xinyu, Eslami, S. M. Ali, Hua, Nan, Simon, Jon, Joshi, Pratik, Kim, Yelin, Tenney, Ian, Potluri, Sahitya, Thiet, Lam Nguyen, Yuan, Quan, Luisier, Florian, Chronopoulou, Alexandra, Scellato, Salvatore, Srinivasan, Praveen, Chen, Minmin, Koverkathu, Vinod, Dalibard, Valentin, Xu, Yaming, Saeta, Brennan, Anderson, Keith, Sellam, Thibault, Fernando, Nick, Huot, Fantine, Jung, Junehyuk, Varadarajan, Mani, Quinn, Michael, Raul, Amit, Le, Maigo, Habalov, Ruslan, Clark, Jon, Jalan, Komal, Bullard, Kalesha, Singhal, Achintya, Luong, Thang, Wang, Boyu, Rajayogam, Sujeevan, Eisenschlos, Julian, Jia, Johnson, Finchelstein, Daniel, Yakubovich, Alex, Balle, Daniel, Fink, Michael, Agarwal, Sameer, Li, Jing, Dvijotham, Dj, Pal, Shalini, Kang, Kai, Konzelmann, Jaclyn, Beattie, Jennifer, Dousse, Olivier, Wu, Diane, Crocker, Remi, Elkind, Chen, Jonnalagadda, Siddhartha Reddy, Lee, Jong, Holtmann-Rice, Dan, Kallarackal, Krystal, Liu, Rosanne, Vnukov, Denis, Vats, Neera, Invernizzi, Luca, Jafari, Mohsen, Zhou, Huanjie, Taylor, Lilly, Prendki, Jennifer, Wu, Marcus, Eccles, Tom, Liu, Tianqi, Kopparapu, Kavya, Beaufays, Francoise, Angermueller, Christof, Marzoca, Andreea, Sarcar, Shourya, Dib, Hilal, Stanway, Jeff, Perbet, Frank, Trdin, Nejc, Sterneck, Rachel, Khorlin, Andrey, Li, Dinghua, Wu, Xihui, Goenka, Sonam, Madras, David, Goldshtein, Sasha, Gierke, Willi, Zhou, Tong, Liu, Yaxin, Liang, Yannie, White, Anais, Li, Yunjie, Singh, Shreya, Bahargam, Sanaz, Epstein, Mark, Basu, Sujoy, Lao, Li, Ozturel, Adnan, Crous, Carl, Zhai, Alex, Lu, Han, Tung, Zora, Gaur, Neeraj, Walton, Alanna, Dixon, Lucas, Zhang, Ming, Globerson, Amir, Uy, Grant, Bolt, Andrew, Wiles, Olivia, Nasr, Milad, Shumailov, Ilia, Selvi, Marco, Piccinno, Francesco, Aguilar, Ricardo, McCarthy, Sara, Khalman, Misha, Shukla, Mrinal, Galic, Vlado, Carpenter, John, Villela, Kevin, Zhang, Haibin, Richardson, Harry, Martens, James, Bosnjak, Matko, Belle, Shreyas Rammohan, Seibert, Jeff, Alnahlawi, Mahmoud, McWilliams, Brian, Singh, Sankalp, Louis, Annie, Ding, Wen, Popovici, Dan, Simicich, Lenin, Knight, Laura, Mehta, Pulkit, Gupta, Nishesh, Shi, Chongyang, Fatehi, Saaber, Mitrovic, Jovana, Grills, Alex, Pagadora, Joseph, Petrova, Dessie, Eisenbud, Danielle, Zhang, Zhishuai, Yates, Damion, Mittal, Bhavishya, Tripuraneni, Nilesh, Assael, Yannis, Brovelli, Thomas, Jain, Prateek, Velimirovic, Mihajlo, Akbulut, Canfer, Mu, Jiaqi, Macherey, Wolfgang, Kumar, Ravin, Xu, Jun, Qureshi, Haroon, Comanici, Gheorghe, Wiesner, Jeremy, Gong, Zhitao, Ruddock, Anton, Bauer, Matthias, Felt, Nick, GP, Anirudh, Arnab, Anurag, Zelle, Dustin, Rothfuss, Jonas, Rosgen, Bill, Shenoy, Ashish, Seybold, Bryan, Li, Xinjian, Mudigonda, Jayaram, Erdogan, Goker, Xia, Jiawei, Simsa, Jiri, Michi, Andrea, Yao, Yi, Yew, Christopher, Kan, Steven, Caswell, Isaac, Radebaugh, Carey, Elisseeff, Andre, Valenzuela, Pedro, McKinney, Kay, Paterson, Kim, Cui, Albert, Latorre-Chimoto, Eri, Kim, Solomon, Zeng, William, Durden, Ken, Ponnapalli, Priya, Sosea, Tiberiu, Choquette-Choo, Christopher A., Manyika, James, Robenek, Brona, Vashisht, Harsha, Pereira, Sebastien, Lam, Hoi, Velic, Marko, Owusu-Afriyie, Denese, Lee, Katherine, Bolukbasi, Tolga, Parrish, Alicia, Lu, Shawn, Park, Jane, Venkatraman, Balaji, Talbert, Alice, Rosique, Lambert, Cheng, Yuchung, Sozanschi, Andrei, Paszke, Adam, Kumar, Praveen, Austin, Jessica, Li, Lu, Salama, Khalid, Kim, Wooyeol, Dukkipati, Nandita, Baryshnikov, Anthony, Kaplanis, Christos, Sheng, XiangHai, Chervonyi, Yuri, Unlu, Caglar, Casas, Diego de Las, Askham, Harry, Tunyasuvunakool, Kathryn, Gimeno, Felix, Poder, Siim, Kwak, Chester, Miecnikowski, Matt, Mirrokni, Vahab, Dimitriev, Alek, Parisi, Aaron, Liu, Dangyi, Tsai, Tomy, Shevlane, Toby, Kouridi, Christina, Garmon, Drew, Goedeckemeyer, Adrian, Brown, Adam R., Vijayakumar, Anitha, Elqursh, Ali, Jazayeri, Sadegh, Huang, Jin, Carthy, Sara Mc, Hoover, Jay, Kim, Lucy, Kumar, Sandeep, Chen, Wei, Biles, Courtney, Bingham, Garrett, Rosen, Evan, Wang, Lisa, Tan, Qijun, Engel, David, Pongetti, Francesco, de Cesare, Dario, Hwang, Dongseong, Yu, Lily, Pullman, Jennifer, Narayanan, Srini, Levin, Kyle, Gopal, Siddharth, Li, Megan, Aharoni, Asaf, Trinh, Trieu, Lo, Jessica, Casagrande, Norman, Vij, Roopali, Matthey, Loic, Ramadhana, Bramandia, Matthews, Austin, Carey, CJ, Johnson, Matthew, Goranova, Kremena, Shah, Rohin, Ashraf, Shereen, Dasgupta, Kingshuk, Larsen, Rasmus, Wang, Yicheng, Vuyyuru, Manish Reddy, Jiang, Chong, Ijazi, Joana, Osawa, Kazuki, Smith, Celine, Boppana, Ramya Sree, Bilal, Taylan, Koizumi, Yuma, Xu, Ying, Altun, Yasemin, Shabat, Nir, Bariach, Ben, Korchemniy, Alex, Choo, Kiam, Ronneberger, Olaf, Iwuanyanwu, Chimezie, Zhao, Shubin, Soergel, David, Hsieh, Cho-Jui, Cai, Irene, Iqbal, Shariq, Sundermeyer, Martin, Chen, Zhe, Bursztein, Elie, Malaviya, Chaitanya, Biadsy, Fadi, Shroff, Prakash, Dhillon, Inderjit, Latkar, Tejasi, Dyer, Chris, Forbes, Hannah, Nicosia, Massimo, Nikolaev, Vitaly, Greene, Somer, Georgiev, Marin, Wang, Pidong, Martin, Nina, Sedghi, Hanie, Zhang, John, Banzal, Praseem, Fritz, Doug, Rao, Vikram, Wang, Xuezhi, Zhang, Jiageng, Patraucean, Viorica, Du, Dayou, Mordatch, Igor, Jurin, Ivan, Liu, Lewis, Dubey, Ayush, Mohan, Abhi, Nowakowski, Janek, Ion, Vlad-Doru, Wei, Nan, Tojo, Reiko, Raad, Maria Abi, Hudson, Drew A., Keshava, Vaishakh, Agrawal, Shubham, Ramirez, Kevin, Wu, Zhichun, Nguyen, Hoang, Liu, Ji, Sewak, Madhavi, Petrini, Bryce, Choi, DongHyun, Philips, Ivan, Wang, Ziyue, Bica, Ioana, Garg, Ankush, Wilkiewicz, Jarek, Agrawal, Priyanka, Li, Xiaowei, Guo, Danhao, Xue, Emily, Shaik, Naseer, Leach, Andrew, Khan, Sadh MNM, Wiesinger, Julia, Jerome, Sammy, Chakladar, Abhishek, Wang, Alek Wenjiao, Ornduff, Tina, Abu, Folake, Ghaffarkhah, Alireza, Wainwright, Marcus, Cortes, Mario, Liu, Frederick, Maynez, Joshua, Petrov, Slav, Wu, Yonghui, Hassabis, Demis, Kavukcuoglu, Koray, Dean, Jeffrey, Vinyals, Oriol
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
MusicRL: Aligning Music Generation to Human Preferences
Cideron, Geoffrey, Girgin, Sertan, Verzetti, Mauro, Vincent, Damien, Kastelic, Matej, Borsos, Zalรกn, McWilliams, Brian, Ungureanu, Victor, Bachem, Olivier, Pietquin, Olivier, Geist, Matthieu, Hussenot, Lรฉonard, Zeghidour, Neil, Agostinelli, Andrea
We propose MusicRL, the first music generation system finetuned from human feedback. Appreciation of text-to-music models is particularly subjective since the concept of musicality as well as the specific intention behind a caption are user-dependent (e.g. a caption such as "upbeat work-out music" can map to a retro guitar solo or a techno pop beat). Not only this makes supervised training of such models challenging, but it also calls for integrating continuous human feedback in their post-deployment finetuning. MusicRL is a pretrained autoregressive MusicLM (Agostinelli et al., 2023) model of discrete audio tokens finetuned with reinforcement learning to maximise sequence-level rewards. We design reward functions related specifically to text-adherence and audio quality with the help from selected raters, and use those to finetune MusicLM into MusicRL-R. We deploy MusicLM to users and collect a substantial dataset comprising 300,000 pairwise preferences. Using Reinforcement Learning from Human Feedback (RLHF), we train MusicRL-U, the first text-to-music model that incorporates human feedback at scale. Human evaluations show that both MusicRL-R and MusicRL-U are preferred to the baseline. Ultimately, MusicRL-RU combines the two approaches and results in the best model according to human raters. Ablation studies shed light on the musical attributes influencing human preferences, indicating that text adherence and quality only account for a part of it. This underscores the prevalence of subjectivity in musical appreciation and calls for further involvement of human listeners in the finetuning of music generation models.
AudioLM: a Language Modeling Approach to Audio Generation
Borsos, Zalรกn, Marinier, Raphaรซl, Vincent, Damien, Kharitonov, Eugene, Pietquin, Olivier, Sharifi, Matt, Roblek, Dominik, Teboul, Olivier, Grangier, David, Tagliasacchi, Marco, Zeghidour, Neil
We introduce AudioLM, a framework for high-quality audio generation with long-term consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts audio generation as a language modeling task in this representation space. We show how existing audio tokenizers provide different trade-offs between reconstruction quality and long-term structure, and we propose a hybrid tokenization scheme to achieve both objectives. Namely, we leverage the discretized activations of a masked language model pre-trained on audio to capture long-term structure and the discrete codes produced by a neural audio codec to achieve high-quality synthesis. By training on large corpora of raw audio waveforms, AudioLM learns to generate natural and coherent continuations given short prompts. When trained on speech, and without any transcript or annotation, AudioLM generates syntactically and semantically plausible speech continuations while also maintaining speaker identity and prosody for unseen speakers. Furthermore, we demonstrate how our approach extends beyond speech by generating coherent piano music continuations, despite being trained without any symbolic representation of music.
AudioPaLM: A Large Language Model That Can Speak and Listen
Rubenstein, Paul K., Asawaroengchai, Chulayuth, Nguyen, Duc Dung, Bapna, Ankur, Borsos, Zalรกn, Quitry, Fรฉlix de Chaumont, Chen, Peter, Badawy, Dalia El, Han, Wei, Kharitonov, Eugene, Muckenhirn, Hannah, Padfield, Dirk, Qin, James, Rozenberg, Danny, Sainath, Tara, Schalkwyk, Johan, Sharifi, Matt, Ramanovich, Michelle Tadmor, Tagliasacchi, Marco, Tudor, Alexandru, Velimiroviฤ, Mihajlo, Vincent, Damien, Yu, Jiahui, Wang, Yongqiang, Zayats, Vicky, Zeghidour, Neil, Zhang, Yu, Zhang, Zhishuai, Zilka, Lukas, Frank, Christian
We introduce AudioPaLM, a large language model for speech understanding and generation. AudioPaLM fuses text-based and speech-based language models, PaLM-2 [Anil et al., 2023] and AudioLM [Borsos et al., 2022], into a unified multimodal architecture that can process and generate text and speech with applications including speech recognition and speech-to-speech translation. AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2. We demonstrate that initializing AudioPaLM with the weights of a text-only large language model improves speech processing, successfully leveraging the larger quantity of text training data used in pretraining to assist with the speech tasks. The resulting model significantly outperforms existing systems for speech translation tasks and has the ability to perform zero-shot speech-to-text translation for many languages for which input/target language combinations were not seen in training. AudioPaLM also demonstrates features of audio language models, such as transferring a voice across languages based on a short spoken prompt.
SoundStorm: Efficient Parallel Audio Generation
Borsos, Zalรกn, Sharifi, Matt, Vincent, Damien, Kharitonov, Eugene, Zeghidour, Neil, Tagliasacchi, Marco
We present SoundStorm, a model for efficient, non-autoregressive audio generation. SoundStorm receives as input the semantic tokens of AudioLM, and relies on bidirectional attention and confidence-based parallel decoding to generate the tokens of a neural audio codec. Compared to the autoregressive generation approach of AudioLM, our model produces audio of the same quality and with higher consistency in voice and acoustic conditions, while being two orders of magnitude faster. SoundStorm generates 30 seconds of audio in 0.5 seconds on a TPU-v4. We demonstrate the ability of our model to scale audio generation to longer sequences by synthesizing high-quality, natural dialogue segments, given a transcript annotated with speaker turns and a short prompt with the speakers' voices.
Continuous Control with Action Quantization from Demonstrations
Dadashi, Robert, Hussenot, Lรฉonard, Vincent, Damien, Girgin, Sertan, Raichuk, Anton, Geist, Matthieu, Pietquin, Olivier
In Reinforcement Learning (RL), discrete actions, as opposed to continuous actions, result in less complex exploration problems and the immediate computation of the maximum of the action-value function which is central to dynamic programming-based methods. In this paper, we propose a novel method: Action Quantization from Demonstrations (AQuaDem) to learn a discretization of continuous action spaces by leveraging the priors of demonstrations. This dramatically reduces the exploration problem, since the actions faced by the agent not only are in a finite number but also are plausible in light of the demonstrator's behavior. By discretizing the action space we can apply any discrete action deep RL algorithm to the continuous control problem. We evaluate the proposed method on three different setups: RL with demonstrations, RL with play data --demonstrations of a human playing in an environment but not solving any specific task-- and Imitation Learning. For all three setups, we only consider human data, which is more challenging than synthetic data. We found that AQuaDem consistently outperforms state-of-the-art continuous control methods, both in terms of performance and sample efficiency. We provide visualizations and videos in the paper's website: https://google-research.github.io/aquadem.
What Matters for Adversarial Imitation Learning?
Orsini, Manu, Raichuk, Anton, Hussenot, Lรฉonard, Vincent, Damien, Dadashi, Robert, Girgin, Sertan, Geist, Matthieu, Bachem, Olivier, Pietquin, Olivier, Andrychowicz, Marcin
Adversarial imitation learning has become a popular framework for imitation in continuous control. Over the years, several variations of its components were proposed to enhance the performance of the learned policies as well as the sample complexity of the algorithm. In practice, these choices are rarely tested all together in rigorous empirical studies. It is therefore difficult to discuss and understand what choices, among the high-level algorithmic options as well as low-level implementation details, matter. To tackle this issue, we implement more than 50 of these choices in a generic adversarial imitation learning framework and investigate their impacts in a large-scale study ( 500k trained agents) with both synthetic and human-generated demonstrations. While many of our findings confirm common practices, some of them are surprising or even contradict prior work. In particular, our results suggest that artificial demonstrations are not a good proxy for human data and that the very common practice of evaluating imitation algorithms only with synthetic demonstrations may lead to algorithms which perform poorly in the more realistic scenarios with human demonstrations.
Google Research Football: A Novel Reinforcement Learning Environment
Kurach, Karol, Raichuk, Anton, Staลczyk, Piotr, Zajฤ c, Michaล, Bachem, Olivier, Espeholt, Lasse, Riquelme, Carlos, Vincent, Damien, Michalski, Marcin, Bousquet, Olivier, Gelly, Sylvain
Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research F ootball Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and it is available under a permissive open-source license. In addition, it provides support for multiplayer and multi-agent experiments. We propose three full-game scenarios of varying difficulty with the F ootball Benchmarks and report baseline results for three commonly used reinforcement algorithms (IMP ALA, PPO, and Ape-X DQN). We also provide a diverse set of simpler scenarios with the F ootball Academy and showcase several promising research directions.
MULEX: Disentangling Exploitation from Exploration in Deep RL
Beyer, Lucas, Vincent, Damien, Teboul, Olivier, Gelly, Sylvain, Geist, Matthieu, Pietquin, Olivier
An agent learning through interactions should balance its action selection process between probing the environment to discover new rewards and using the information acquired in the past to adopt useful behaviour. This trade-off is usually obtained by perturbing either the agent's actions (e.g., e-greedy or Gibbs sampling) or the agent's parameters (e.g., NoisyNet), or by modifying the reward it receives (e.g., exploration bonus, intrinsic motivation, or hand-shaped rewards). Here, we adopt a disruptive but simple and generic perspective, where we explicitly disentangle exploration and exploitation. Different losses are optimized in parallel, one of them coming from the true objective (maximizing cumulative rewards from the environment) and others being related to exploration. Every loss is used in turn to learn a policy that generates transitions, all shared in a single replay buffer. Off-policy methods are then applied to these transitions to optimize each loss. We showcase our approach on a hard-exploration environment, show its sample-efficiency and robustness, and discuss further implications.
Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
Penedones, Hugo, Riquelme, Carlos, Vincent, Damien, Maennel, Hartmut, Mann, Timothy, Barreto, Andre, Gelly, Sylvain, Neu, Gergely
We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation. The two methods are known to achieve complementary bias-variance trade-off properties, with TD tending to achieve lower variance but potentially higher bias. In this paper, we argue that the larger bias of TD can be a result of the amplification of local approximation errors. We address this by proposing an algorithm that adaptively switches between TD and MC in each state, thus mitigating the propagation of errors. Our method is based on learned confidence intervals that detect biases of TD estimates. We demonstrate in a variety of policy evaluation tasks that this simple adaptive algorithm performs competitively with the best approach in hindsight, suggesting that learned confidence intervals are a powerful technique for adapting policy evaluation to use TD or MC returns in a data-driven way.