Generative AI
A Comprehensive Survey of Foundation Models in Medicine
Khan, Wasif, Leem, Seowung, See, Kyle B., Wong, Joshua K., Zhang, Shaoting, Fang, Ruogu
Foundation models (FMs) are large-scale deep-learning models trained on extensive datasets using self-supervised techniques. These models serve as a base for various downstream tasks, including healthcare. FMs have been adopted with great success across various domains within healthcare, including natural language processing (NLP), computer vision, graph learning, biology, and omics. Existing healthcare-based surveys have not yet included all of these domains. Therefore, this survey provides a comprehensive overview of FMs in healthcare. We focus on the history, learning strategies, flagship models, applications, and challenges of FMs. We explore how FMs such as the BERT and GPT families are reshaping various healthcare domains, including clinical large language models, medical image analysis, and omics data. Furthermore, we provide a detailed taxonomy of healthcare applications facilitated by FMs, such as clinical NLP, medical computer vision, graph learning, and other biology-related tasks. Despite the promising opportunities FMs provide, they also have several associated challenges, which are explained in detail. We also outline potential future directions to provide researchers and practitioners with insights into the potential and limitations of FMs in healthcare to advance their deployment and mitigate associated risks.
Large Language Models Playing Mixed Strategy Nash Equilibrium Games
Generative artificial intelligence (Generative AI), and in particular Large Language Models (LLMs), have gained significant popularity among researchers and industrial communities, paving the way for integrating LLMs in different domains, such as robotics, telecom, and healthcare. In this paper, we study the intersection of game theory and generative artificial intelligence, focusing on the capabilities of LLMs to find the Nash equilibrium in games with a mixed strategy Nash equilibrium and no pure strategy Nash equilibrium (that we denote mixed strategy Nash equilibrium games). The study reveals a significant enhancement in the performance of LLMs when they are equipped with the possibility to run code and are provided with a specific prompt to incentivize them to do so. However, our research also highlights the limitations of LLMs when the randomization strategy of the game is not easy to deduce. It is evident that while LLMs exhibit remarkable proficiency in well-known standard games, their performance dwindles when faced with slight modifications of the same games. This paper aims to contribute to the growing body of knowledge on the intersection of game theory and generative artificial intelligence while providing valuable insights into LLMs' strengths and weaknesses. It also underscores the need for further research to overcome the limitations of LLMs, particularly in dealing with even slightly more complex scenarios, to harness their full potential.
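For readers unfamiliar with the term, the following is a minimal sketch of how a mixed strategy Nash equilibrium can be computed for a two-player, two-action game, assuming the game has no pure strategy equilibrium. It uses the standard indifference conditions: each player randomizes so that the opponent is indifferent between their two actions. The function name `mixed_equilibrium_2x2` and the example payoff matrices are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(A, B):
    """Return (p, q) for a 2x2 bimatrix game with an interior mixed equilibrium.

    A[i][j] is the row player's payoff and B[i][j] the column player's payoff
    when row plays action i and column plays action j (integer payoffs).
    p is the probability that row plays action 0; q is the probability that
    column plays action 0. Hypothetical helper for illustration only.
    """
    # Column's mix q must make the row player indifferent between rows 0 and 1.
    den_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    # Row's mix p must make the column player indifferent between columns 0 and 1.
    den_p = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    if den_p == 0 or den_q == 0:
        raise ValueError("indifference conditions have no unique solution")

    p = Fraction(B[1][1] - B[1][0], den_p)
    q = Fraction(A[1][1] - A[0][1], den_q)
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no interior mixed equilibrium for these payoffs")
    return p, q


# Matching pennies: a well-known standard game.
pennies_row = [[1, -1], [-1, 1]]
pennies_col = [[-1, 1], [1, -1]]
print(mixed_equilibrium_2x2(pennies_row, pennies_col))

# A slight modification (one payoff changed), still zero-sum.
modified_row = [[2, -1], [-1, 1]]
modified_col = [[-2, 1], [1, -1]]
print(mixed_equilibrium_2x2(modified_row, modified_col))
```

In the standard game both players mix evenly. Changing a single payoff shifts both probabilities to 2/5, which illustrates the kind of slight modification the abstract describes.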
Generative AI and Digital Neocolonialism in Global Education: Towards an Equitable Framework
Nyaaba, Matthew, Wright, Alyson, Choi, Gyu Lim
This paper critically discusses how generative artificial intelligence (GenAI) might impose Western ideologies on non-Western societies, perpetuating digital neocolonialism in education through its inherent biases. It further suggests strategies for local and global stakeholders to mitigate these effects. Our discussions demonstrated that GenAI can foster cultural imperialism by generating content that primarily incorporates cultural references and examples relevant to Western students, thereby alienating students from non-Western backgrounds. Also, the predominant use of Western languages by GenAI can marginalize non-dominant languages, making educational content less accessible to speakers of indigenous languages and potentially impacting their ability to learn in their first language. Additionally, GenAI often generates content and curricula that reflect the perspectives of technologically dominant countries, overshadowing marginalized indigenous knowledge and practices. Moreover, the cost of access to GenAI intensifies educational inequality and the control of GenAI data could lead to commercial exploitation without benefiting local students and their communities. We propose human-centric reforms to prioritize cultural diversity and equity in GenAI development; a liberatory design to empower educators and students to identify and dismantle the oppressive structures within GenAI applications; foresight by design to create an adjustable GenAI system to meet future educational needs; and finally, effective prompting skills to reduce the retrieval of neocolonial outputs.
A Generation of AI Guinea Pigs
This spring, the Los Angeles Unified School District--the second-largest public school district in the United States--introduced students and parents to a new "educational friend" named Ed. A learning platform that includes a chatbot represented by a small illustration of a smiling sun, Ed is being tested in 100 schools within the district and is accessible at all hours through a website. It can answer questions about a child's courses, grades, and attendance, and point users to optional activities. As Superintendent Alberto M. Carvalho put it to me, "AI is here to stay. If you don't master it, it will master you." Carvalho says he wants to empower teachers and students to learn to use AI safely.
Reduce AI Hallucinations With This Neat Software Trick
If you've ever used a generative artificial intelligence tool, it's lied to you. These recurring fabrications are often called AI hallucinations, and developers are feverishly working to make generative AI tools more reliable by reining in these unfortunate fibs. One of the most popular approaches to reducing AI hallucinations--and one that is quickly growing more popular in Silicon Valley--is called retrieval augmented generation. The RAG process is quite complicated, but on a basic level it augments your prompts by gathering info from a custom database, and then the large language model generates an answer based on that data. For example, a company could upload all of its HR policies and benefits to a RAG database and have the AI chatbot just focus on answers that can be found in those documents.
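The retrieve-then-augment step described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the "database" is a short list of made-up HR snippets, retrieval uses simple word-overlap cosine similarity instead of the embedding search a production system would likely use, and no language model is called. The names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Augment the user's question with the retrieved context."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Made-up HR snippets standing in for a company's document store.
hr_docs = [
    "Employees accrue 20 days of paid vacation per year.",
    "The dental plan covers two cleanings annually.",
    "Remote work requires manager approval.",
]

print(build_prompt("How many vacation days do employees get?", hr_docs))
```

The resulting prompt would then be passed to a large language model, so its answer stays grounded in the retrieved text and not only in whatever the model absorbed during training.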
How Pope Francis became the AI ethicist for tech titans and world leaders
The European Union is readying a landmark antitrust law that could limit more advanced generative AI models. The Federal Trade Commission is investigating a deal that Microsoft made with the AI start-up Inflection, probing whether the tech giant deliberately set up the investment to avoid a merger review. And U.S. enforcers reached a deal that will open the company to greater scrutiny of how it wields power to dominate artificial intelligence, including its multibillion-dollar investments in ChatGPT maker OpenAI. That relationship has also exposed Microsoft to new reputational risks, as OpenAI chief executive Sam Altman frequently invites controversy.
Generative AI-based Prompt Evolution Engineering Design Optimization With Vision-Language Model
Wong, Melvin, Rios, Thiago, Menzel, Stefan, Ong, Yew Soon
Engineering design optimization requires an efficient combination of a 3D shape representation, an optimization algorithm, and a design performance evaluation method, which is often computationally expensive. We present a prompt evolution design optimization (PEDO) framework contextualized in a vehicle design scenario that leverages a vision-language model for penalizing impractical car designs synthesized by a generative model. The backbone of our framework is an evolutionary strategy coupled with an optimization objective function that comprises a physics-based solver and a vision-language model for practical or functional guidance in the generated car designs. In the prompt evolutionary search, the optimizer iteratively generates a population of text prompts, which embed user specifications on the aerodynamic performance and visual preferences of the 3D car designs. Then, in addition to the computational fluid dynamics simulations, the pre-trained vision-language model is used to penalize impractical designs and, thus, steer the evolutionary algorithm toward more viable designs. Our investigations on a car design optimization problem show a wide spread of potential car designs generated at the early phase of the search, which indicates a good diversity of designs in the initial populations, and an increase of over 20% in the probability of generating practical designs compared to a baseline framework without using a vision-language model. Visual inspection of the designs against the performance results demonstrates prompt evolution as a very promising paradigm for finding novel designs with good optimization performance while providing ease of use in specifying design specifications and preferences via a natural language interface.
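As a rough illustration of the search loop this abstract describes, the toy sketch below evolves short text prompts against an objective that adds a practicality penalty to a performance score. Everything in it is an assumption made for illustration: the word slots, the `DRAG` lookup table standing in for the physics-based solver, the rule-based `practicality_penalty` standing in for the vision-language model, and the simple elitist selection scheme. It is not the PEDO implementation.

```python
import random

# Hypothetical vocabulary: a prompt picks one word per slot.
SLOTS = [
    ["boxy", "sleek", "streamlined"],
    ["tall", "low", "flat"],
    ["truck", "sedan", "coupe"],
]

# Made-up per-word drag contributions (lower is better); a stand-in for CFD.
DRAG = {
    "boxy": 0.9, "sleek": 0.4, "streamlined": 0.3,
    "tall": 0.8, "low": 0.4, "flat": 0.2,
    "truck": 0.9, "sedan": 0.5, "coupe": 0.4,
}


def drag_score(prompt):
    """Stand-in for the physics-based performance evaluation."""
    return sum(DRAG[word] for word in prompt)


def practicality_penalty(prompt):
    """Stand-in for the vision-language model: flags 'flat' cars as impractical."""
    return 1.0 if "flat" in prompt else 0.0


def objective(prompt):
    """Performance score plus penalty for impractical designs (minimized)."""
    return drag_score(prompt) + practicality_penalty(prompt)


def mutate(prompt, rng):
    """Resample the word in one randomly chosen slot."""
    i = rng.randrange(len(SLOTS))
    child = list(prompt)
    child[i] = rng.choice(SLOTS[i])
    return tuple(child)


def evolve(generations=20, pop_size=6, seed=0):
    """Elitist loop: keep the better half, refill with mutated copies."""
    rng = random.Random(seed)
    population = [tuple(rng.choice(s) for s in SLOTS) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        parents = sorted(population, key=objective)[: pop_size // 2]
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
        history.append(min(objective(p) for p in population))
    return min(population, key=objective), history


best, history = evolve()
print(" ".join(best), objective(best))
```

Without the penalty term, the word "flat" would always look attractive because it has the lowest drag contribution in this toy table. With the penalty, prompts containing it score worse than practical alternatives, which mirrors how the abstract says the vision-language model guides the search.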
Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts
Mousavi, S. Mostafa, Stogaitis, Marc, Gadh, Tajinder, Allen, Richard M, Barski, Alexei, Bosch, Robert, Robertson, Patrick, Thiruverahan, Nivetha, Cho, Youngmin, Raj, Aman
This paper presents a novel approach to extract scientifically valuable information about Earth's physical phenomena from unconventional sources, such as multi-modal social media posts. Employing a state-of-the-art large language model (LLM), Gemini 1.5 Pro (Reid et al. 2024), we estimate earthquake ground shaking intensity from these unstructured posts. The model's output, in the form of Modified Mercalli Intensity (MMI) values, aligns well with independent observational data. Furthermore, our results suggest that LLMs, trained on vast internet data, may have developed a unique understanding of physical phenomena. Specifically, Google's Gemini models demonstrate a simplified understanding of the general relationship between earthquake magnitude, distance, and MMI intensity, accurately describing observational data even though it's not identical to established models. These findings raise intriguing questions about the extent to which Gemini's training has led to a broader understanding of the physical world and its phenomena. The ability of Generative AI models like Gemini to generate results consistent with established scientific knowledge highlights their potential to augment our understanding of complex physical phenomena like earthquakes. The flexible and effective approach proposed in this study holds immense potential for enriching our understanding of the impact of physical phenomena and improving resilience during natural disasters. This research is a significant step toward harnessing the power of social media and AI for natural disaster mitigation, opening new avenues for understanding the emerging capabilities of Generative AI and LLMs for scientific applications.
OpenAI adds Trump-appointed former NSA director to its board
Nakasone joins OpenAI's board following a dramatic board shake-up. Amid a tougher regulatory environment and increased efforts to digitize government and military services, tech companies are increasingly seeking board members with military expertise. Amazon's board includes Keith Alexander, who was previously the commander of U.S. Cyber Command and the director of the NSA. Google Public Sector, a division of the company that focuses on selling cloud services to governments, also has retired generals on its board.
Apple Proved That AI Is a Feature, Not a Product
Apple's otherworldly, flying-saucer headquarters in Cupertino, California, felt like a suitable venue this week for a bold and futuristic revamp of the company's most prized products. With iPhone sales slowing and rivals gaining ground thanks to the rise of tools like ChatGPT, Apple offered its own generative artificial intelligence vision at its Worldwide Developer Conference (WWDC). Apple has lately been perceived as a generative AI laggard. Its WWDC offerings failed to persuade some critics, who have branded WWDC's announcements as downright boring. But with the focus on infusing existing apps and OS features with what the company calls "Apple Intelligence," the big takeaway is that generative AI is a feature rather than a product in and of itself.