Rao, Anand
Gemma 3 Technical Report
Gemma Team, Kamath, Aishwarya, Ferret, Johan, Pathak, Shreya, Vieillard, Nino, Merhej, Ramona, Perrin, Sarah, Matejovicova, Tatiana, Ramé, Alexandre, Rivière, Morgane, Rouillard, Louis, Mesnard, Thomas, Cideron, Geoffrey, Grill, Jean-bastien, Ramos, Sabela, Yvinec, Edouard, Casbon, Michelle, Pot, Etienne, Penchev, Ivo, Liu, Gaël, Visin, Francesco, Kenealy, Kathleen, Beyer, Lucas, Zhai, Xiaohai, Tsitsulin, Anton, Busa-Fekete, Robert, Feng, Alex, Sachdeva, Noveen, Coleman, Benjamin, Gao, Yi, Mustafa, Basil, Barr, Iain, Parisotto, Emilio, Tian, David, Eyal, Matan, Cherry, Colin, Peter, Jan-Thorsten, Sinopalnikov, Danila, Bhupatiraju, Surya, Agarwal, Rishabh, Kazemi, Mehran, Malkin, Dan, Kumar, Ravin, Vilar, David, Brusilovsky, Idan, Luo, Jiaming, Steiner, Andreas, Friesen, Abe, Sharma, Abhanshu, Sharma, Abheesht, Gilady, Adi Mayrav, Goedeckemeyer, Adrian, Saade, Alaa, Feng, Alex, Kolesnikov, Alexander, Bendebury, Alexei, Abdagic, Alvin, Vadi, Amit, György, András, Pinto, André Susano, Das, Anil, Bapna, Ankur, Miech, Antoine, Yang, Antoine, Paterson, Antonia, Shenoy, Ashish, Chakrabarti, Ayan, Piot, Bilal, Wu, Bo, Shahriari, Bobak, Petrini, Bryce, Chen, Charlie, Lan, Charline Le, Choquette-Choo, Christopher A., Carey, CJ, Brick, Cormac, Deutsch, Daniel, Eisenbud, Danielle, Cattle, Dee, Cheng, Derek, Paparas, Dimitris, Sreepathihalli, Divyashree Shivakumar, Reid, Doug, Tran, Dustin, Zelle, Dustin, Noland, Eric, Huizenga, Erwin, Kharitonov, Eugene, Liu, Frederick, Amirkhanyan, Gagik, Cameron, Glenn, Hashemi, Hadi, Klimczak-Plucińska, Hanna, Singh, Harman, Mehta, Harsh, Lehri, Harshal Tushar, Hazimeh, Hussein, Ballantyne, Ian, Szpektor, Idan, Nardini, Ivan, Pouget-Abadie, Jean, Chan, Jetha, Stanton, Joe, Wieting, John, Lai, Jonathan, Orbay, Jordi, Fernandez, Joseph, Newlan, Josh, Ji, Ju-yeong, Singh, Jyotinder, Black, Kat, Yu, Kathy, Hui, Kevin, Vodrahalli, Kiran, Greff, Klaus, Qiu, Linhai, Valentine, Marcella, Coelho, Marina, Ritter, Marvin, Hoffman, Matt, Watson, Matthew, Chaturvedi, Mayank, Moynihan, Michael, Ma, Min, Babar, Nabila, Noy, Natasha, Byrd, Nathan, Roy, Nick, Momchev, Nikola, Chauhan, Nilay, Sachdeva, Noveen, Bunyan, Oskar, Botarda, Pankil, Caron, Paul, Rubenstein, Paul Kishan, Culliton, Phil, Schmid, Philipp, Sessa, Pier Giuseppe, Xu, Pingmei, Stanczyk, Piotr, Tafti, Pouya, Shivanna, Rakesh, Wu, Renjie, Pan, Renke, Rokni, Reza, Willoughby, Rob, Vallu, Rohith, Mullins, Ryan, Jerome, Sammy, Smoot, Sara, Girgin, Sertan, Iqbal, Shariq, Reddy, Shashir, Sheth, Shruti, Põder, Siim, Bhatnagar, Sijal, Panyam, Sindhu Raghuram, Eiger, Sivan, Zhang, Susan, Liu, Tianqi, Yacovone, Trevor, Liechty, Tyler, Kalra, Uday, Evci, Utku, Misra, Vedant, Roseberry, Vincent, Feinberg, Vlad, Kolesnikov, Vlad, Han, Woohyun, Kwon, Woosuk, Chen, Xi, Chow, Yinlam, Zhu, Yuvein, Wei, Zichuan, Egyed, Zoltan, Cotruta, Victor, Giang, Minh, Kirk, Phoebe, Rao, Anand, Black, Kat, Babar, Nabila, Lo, Jessica, Moreira, Erica, Martins, Luiz Gustavo, Sanseviero, Omar, Gonzalez, Lucas, Gleicher, Zach, Warkentin, Tris, Mirrokni, Vahab, Senter, Evan, Collins, Eli, Barral, Joelle, Ghahramani, Zoubin, Hadsell, Raia, Matias, Yossi, Sculley, D., Petrov, Slav, Fiedel, Noah, Shazeer, Noam, Vinyals, Oriol, Dean, Jeff, Hassabis, Demis, Kavukcuoglu, Koray, Farabet, Clement, Buchatskaya, Elena, Alayrac, Jean-Baptiste, Anil, Rohan, Lepikhin, Dmitry, Borgeaud, Sebastian, Bachem, Olivier, Joulin, Armand, Andreev, Alek, Hardin, Cassidy, Dadashi, Robert, Hussenot, Léonard
We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, wider language coverage, and a longer context of at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achieved by increasing the ratio of local to global attention layers and keeping the span of local attention short. The Gemma 3 models are trained with distillation and achieve superior performance to Gemma 2 for both pre-trained and instruction fine-tuned versions. In particular, our novel post-training recipe significantly improves the math, chat, instruction-following and multilingual abilities, making Gemma3-4B-IT competitive with Gemma2-27B-IT and Gemma3-27B-IT comparable to Gemini-1.5-Pro across benchmarks. We release all our models to the community.
An AI-Driven Data Mesh Architecture Enhancing Decision-Making in Infrastructure Construction and Public Procurement
Mishra, Saurabh, Shinde, Mahendra, Yadav, Aniket, Ayyub, Bilal, Rao, Anand
Infrastructure construction, often dubbed an "industry of industries," is closely linked with government spending and public procurement, offering significant opportunities for improved efficiency and productivity through better transparency and information access. By leveraging these opportunities, we can achieve notable gains in productivity, cost savings, and broader economic benefits. Our approach introduces an integrated software ecosystem utilizing Data Mesh and Service Mesh architectures. This system includes the largest training dataset for infrastructure and procurement, encompassing over 100 billion tokens, scientific publications, activities, and risk data, all structured by a systematic AI framework. Supported by a Knowledge Graph linked to domain-specific multi-agent tasks and Q&A capabilities, our platform standardizes and ingests diverse data sources, transforming them into structured knowledge. Leveraging large language models (LLMs) and automation, our system revolutionizes data structuring and knowledge creation, aiding decision-making in early-stage project planning, detailed research, market trend analysis, and qualitative assessments. Its web-scalable architecture delivers domain-curated information, enabling AI agents to facilitate reasoning and manage uncertainties, while preparing for future expansions with specialized agents targeting particular challenges. This integration of AI with domain expertise not only boosts efficiency and decision-making in construction and infrastructure but also establishes a framework for enhancing government efficiency and accelerating the transition of traditional industries to digital workflows. This work is poised to significantly influence AI-driven initiatives in this sector and guide best practices in AI Operations.
Reliability, Resilience and Human Factors Engineering for Trustworthy AI Systems
Mishra, Saurabh, Rao, Anand, Krishnan, Ramayya, Ayyub, Bilal, Aria, Amin, Zio, Enrico
As AI systems become integral to critical operations across industries and services, ensuring their reliability and safety is essential. We offer a framework that integrates established reliability and resilience engineering principles into AI systems. By applying traditional metrics such as failure rate and Mean Time Between Failures (MTBF) along with resilience engineering and human reliability analysis, we propose an integrated framework to manage AI system performance and to prevent or efficiently recover from failures. Our work adapts classical engineering methods to AI systems and outlines a research agenda for future technical studies. We apply our framework to a real-world AI system, using system status data from platforms such as OpenAI, to demonstrate its practical applicability. This framework aligns with emerging global standards and regulatory frameworks, providing a methodology to enhance the trustworthiness of AI systems. Our aim is to guide policy, regulation, and the development of reliable, safe, and adaptable AI technologies capable of consistent performance in real-world environments.
Consumer Demand Modeling During COVID-19 Pandemic
Hoda, Shaz, Singh, Amitoj, Rao, Anand, Ural, Remzi, Hodson, Nicholas
The current pandemic has introduced substantial uncertainty to traditional methods for demand planning. These uncertainties stem from the disease progression, government interventions, the economy, and consumer behavior. While most of the emerging literature on the pandemic has focused on disease progression, a few studies have focused on consequent regulations and their impact on individual behavior. The contributions of this paper include a quantitative behavior model of fear of COVID-19, the impact of government interventions on consumer behavior, and the impact of consumer behavior on consumer choice and hence demand for goods. It brings together multiple models for disease progression, consumer behavior, and demand estimation, thus bridging the gap between disease progression and consumer demand. We use panel regression to understand the drivers of demand during the pandemic and Bayesian inference to simplify the regulation landscape, which can help build scenarios for resilient demand planning. We illustrate this resilient demand planning model using a specific example of gas retailing. We find that demand is sensitive to fear of COVID-19: as the number of COVID-19 cases increases over the previous week, the demand for gas decreases, though this effect dissipates over time. Further, government regulations restrict access to different services, thereby reducing mobility, which in itself reduces demand.