Generative AI
An 'iPhone of AI' Makes No Sense. Jony Ive Needs To Carefully Construct The Whole Damn System
In the past week or so, we've had a logo upgrade, a big New York Times profile, and a Moncler outerwear collaboration from LoveFrom, Jony Ive and Marc Newson's San Francisco-headquartered design studio. The real news, though, is confirmation that LoveFrom is working with OpenAI's founder Sam Altman to build a secretive as-yet-unnamed AI device with investors including Laurene Powell Jobs' Emerson Collective, and Ive himself. The former Apple chief design officer is sometimes gently mocked for his obsession with seemingly small details, but when it comes to a potential mainstream human-AI interface, the man who has spent the past five years preoccupied with buttons, going so far as to create a five-volume history of garment fasteners, could be, in a somewhat inevitable way, the exact kind of person required to walk this particular tightrope of ethics and ambition. Details so far are scarce but revealing, at least where intentions are concerned. LoveFrom is designing "a product that uses AI to create a computing experience that is less socially disruptive than the iPhone."
Thematic Analysis with Open-Source Generative AI and Machine Learning: A New Method for Inductive Qualitative Codebook Development
Katz, Andrew, Fleming, Gabriella Coloyan, Main, Joyce
This paper aims to answer one central question: to what extent can open-source generative text models be used in a workflow to approximate thematic analysis in social science research? To answer this question, we present the Generative AI-enabled Theme Organization and Structuring (GATOS) workflow, which uses open-source machine learning techniques, natural language processing tools, and generative text models to facilitate thematic analysis. To establish validity of the method, we present three case studies applying the GATOS workflow, leveraging these models and techniques to inductively create codebooks similar to traditional procedures using thematic analysis. Specifically, we investigate the extent to which a workflow comprising open-source models and tools can inductively produce codebooks that approach the known space of themes and sub-themes. To address the challenge of gleaning insights from these texts, we combine open-source generative text models, retrieval-augmented generation, and prompt engineering to identify codes and themes in large volumes of text, i.e., generate a qualitative codebook. The process mimics an inductive coding process that researchers might use in traditional thematic analysis by reading text one unit of analysis at a time, considering existing codes already in the codebook, and then deciding whether or not to generate a new code based on whether the extant codebook provides adequate thematic coverage. We demonstrate this workflow using three synthetic datasets from hypothetical organizational research settings: a study of teammate feedback in teamwork settings, a study of organizational cultures of ethical behavior, and a study of employee perspectives about returning to their offices after the pandemic. We show that the GATOS workflow is able to identify themes in the text that were used to generate the original synthetic datasets.
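The inductive loop described in the abstract (read one unit, look up related existing codes, add a new code only when coverage is inadequate) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GATOS implementation: token-overlap (Jaccard) similarity stands in for the retrieval step, the hypothetical `propose_code` function stands in for the generative text model, and the `threshold` value is arbitrary.

```python
import re
from dataclasses import dataclass, field


def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for a text embedding (assumption)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two token sets, in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0


@dataclass
class Codebook:
    codes: list[str] = field(default_factory=list)

    def retrieve(self, unit: str, k: int = 3) -> list[tuple[float, str]]:
        """Return up to k existing codes most similar to the unit of analysis."""
        scored = [(jaccard(tokens(unit), tokens(c)), c) for c in self.codes]
        return sorted(scored, reverse=True)[:k]


def propose_code(unit: str) -> str:
    """Hypothetical placeholder for the generative model: it only normalizes the unit."""
    return unit.strip().lower()


def build_codebook(units: list[str], threshold: float = 0.5) -> Codebook:
    """Process units one at a time, adding a code only when none covers the unit."""
    book = Codebook()
    for unit in units:
        nearest = book.retrieve(unit)
        covered = bool(nearest) and nearest[0][0] >= threshold
        if not covered:
            book.codes.append(propose_code(unit))
    return book


units = [
    "My teammate communicates clearly",
    "my teammate communicates clearly and often",
    "Deadlines were missed repeatedly",
]
book = build_codebook(units)
print(book.codes)
```

In this toy run the second unit is treated as covered by the code created from the first, so only two codes are kept. A real workflow would replace both stand-ins with an embedding-based retriever and a prompted generative model.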
Nonlinear Inverse Design of Mechanical Multi-Material Metamaterials Enabled by Video Denoising Diffusion and Structure Identifier
Park, Jaewan, Kushwaha, Shashank, He, Junyan, Koric, Seid, Liu, Qibang, Jasiuk, Iwona, Abueidda, Diab
Metamaterials, synthetic materials with customized properties, have emerged as a promising field due to advancements in additive manufacturing. These materials derive unique mechanical properties from their internal lattice structures, which are often composed of multiple materials that repeat geometric patterns. While traditional inverse design approaches have shown potential, they struggle to map nonlinear material behavior to multiple possible structural configurations. This paper presents a novel framework leveraging video diffusion models, a type of generative artificial intelligence (AI), for inverse multi-material design based on nonlinear stress-strain responses. Our approach consists of two key components: (1) a fields generator using a video diffusion model to create solution fields based on target nonlinear stress-strain responses, and (2) a structure identifier employing two UNet models to determine the corresponding multi-material 2D design. By incorporating multiple materials, plasticity, and large deformation, our innovative design method allows for enhanced control over the highly nonlinear mechanical behavior of metamaterials commonly seen in real-world applications. It offers a promising solution for generating next-generation metamaterials with finely tuned mechanical characteristics.
OpenAI reportedly plans to increase ChatGPT's price to $44 within five years
OpenAI is reportedly telling investors that it plans on charging $22 a month to use ChatGPT by the end of the year. The company also plans to aggressively increase the monthly price over the next five years up to $44. The documents obtained by The New York Times show that OpenAI took in $300 million in revenue this August, and expects to make $3.7 billion in sales by the end of the year. Various expenses such as salaries, rent and operational costs will cause the company to lose $5 billion this year. OpenAI is reportedly circulating the documents the NYT reported on as part of a drive to find new investors to prevent or lessen its financial shortfall.
The Download: safer space travel, and generative AI in video games
Long-distance space travel can wreak havoc on human health. There's radiation and microgravity to contend with, as well as the psychological toll of isolation and confinement. Research on identical twin astronauts has also revealed a slew of genetic changes that happen when a person spends a year in space. That's why some bioethicists are exploring the idea of radical treatments for future astronauts. Once we've figured out all the health impacts of space travel, they argue, we should edit the genomes of astronauts ahead of launch to offer them the best protection.
OpenAI shift to for-profit company may lead it to cut corners, says whistleblower
OpenAI's plan to become a for-profit company could encourage the artificial intelligence startup to cut corners on safety, a whistleblower has said. William Saunders, a former research engineer at OpenAI, told the Guardian he was concerned by reports that the ChatGPT developer was preparing to change its corporate structure and would no longer be controlled by its non-profit board. Saunders, who flagged his concerns in testimony to the US Senate this month, said he was also concerned by reports that OpenAI's chief executive, Sam Altman, could hold a stake in the restructured business. "I'm most concerned about what this means for governance of safety decisions at OpenAI," he said. "If the non-profit board is no longer in control of these decisions and Sam Altman holds a significant equity stake, this creates more incentive to race and cut corners."
Watch: Can BBC reporter's AI clone fool his colleagues?
Companies are being warned about the increasing use of AI to carry out so-called CEO Fraud. More victims are coming forward with their stories of being targeted using generative AI techniques, and one case in Hong Kong reportedly saw an AI clone used during a video meeting to trick staff into losing $25m. But while some fear the rise of AI clones, companies including Zoom say we should be excited about a future where your clone can go to a meeting on your behalf. Cyber correspondent Joe Tidy has had an AI clone of himself built by engineers at Fraia AI. Watch to see if he can fool his colleagues with it.
Local Transcription Models in Home Care Nursing in Switzerland: an Interdisciplinary Case Study
Kramer, Jeremy, Kravchenko, Tetiana, Kaufmann, Beatrice, Thilo, Friederike J. S., Kurpicz-Briki, Mascha
The latest advances in the field of natural language processing (NLP) enable new use cases for different domains, including the medical sector. In particular, transcription can be used to support automation in the nursing documentation process and give nurses more time to interact with patients. However, several challenges need to be addressed, including (a) data privacy, (b) local languages and dialects, and (c) domain-specific vocabulary. In this case study, we investigate the case of home care nursing documentation in Switzerland. We assessed different transcription tools and models, and conducted several experiments with OpenAI Whisper, involving different variations of German (i.e., dialects, foreign accent) and example texts manually curated by a domain expert in home care nursing. Our results indicate that even the out-of-the-box model we used performs sufficiently well to be a good starting point for future research in the field.
Environment Scan of Generative AI Infrastructure for Clinical and Translational Science
Idnay, Betina, Xu, Zihan, Adams, William G., Adibuzzaman, Mohammad, Anderson, Nicholas R., Bahroos, Neil, Bell, Douglas S., Bumgardner, Cody, Campion, Thomas, Castro, Mario, Cimino, James J., Cohen, I. Glenn, Dorr, David, Elkin, Peter L, Fan, Jungwei W., Ferris, Todd, Foran, David J., Hanauer, David, Hogarth, Mike, Huang, Kun, Kalpathy-Cramer, Jayashree, Kandpal, Manoj, Karnik, Niranjan S., Katoch, Avnish, Lai, Albert M., Lambert, Christophe G., Li, Lang, Lindsell, Christopher, Liu, Jinze, Lu, Zhiyong, Luo, Yuan, McGarvey, Peter, Mendonca, Eneida A., Mirhaji, Parsa, Murphy, Shawn, Osborne, John D., Paschalidis, Ioannis C., Harris, Paul A., Prior, Fred, Shaheen, Nicholas J., Shara, Nawar, Sim, Ida, Tachinardi, Umberto, Waitman, Lemuel R., Wright, Rosalind J., Zai, Adrian H., Zheng, Kai, Lee, Sandra Soo-Jin, Malin, Bradley A., Natarajan, Karthik, Price, W. Nicholson II, Zhang, Rui, Zhang, Yiye, Xu, Hua, Bian, Jiang, Weng, Chunhua, Peng, Yifan
This study reports a comprehensive environmental scan of the generative AI (GenAI) infrastructure in the national network for clinical and translational science across 36 institutions supported by the Clinical and Translational Science Award (CTSA) Program led by the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH) in the United States. With the rapid advancement of GenAI technologies, including large language models (LLMs), healthcare institutions face unprecedented opportunities and challenges. This research explores the current status of GenAI integration, focusing on stakeholder roles, governance structures, and ethical considerations by administering a survey among leaders of health institutions (i.e., representing academic medical centers and health systems) to assess the institutional readiness and approach towards GenAI adoption. Key findings indicate a diverse range of institutional strategies, with most organizations in the experimental phase of GenAI deployment. The study highlights significant variations in governance models, with a strong preference for centralized decision-making but notable gaps in workforce training and ethical oversight. Moreover, the results underscore the need for a more coordinated approach to GenAI governance, emphasizing collaboration among senior leaders, clinicians, information technology staff, and researchers. Our analysis also reveals concerns regarding GenAI bias, data security, and stakeholder trust, which must be addressed to ensure the ethical and effective implementation of GenAI technologies. This study offers valuable insights into the challenges and opportunities of GenAI integration in healthcare, providing a roadmap for institutions aiming to leverage GenAI for improved quality of care and operational efficiency.
Multimodal Pragmatic Jailbreak on Text-to-image Models
Liu, Tong, Lai, Zhixin, Zhang, Gengyuan, Torr, Philip, Demberg, Vera, Tresp, Volker, Gu, Jindong
Diffusion models have recently achieved remarkable advancements in terms of image quality and fidelity to textual prompts. Concurrently, the safety of such generative models has become an area of growing concern. This work introduces a novel type of jailbreak, which triggers text-to-image (T2I) models to generate an image with visual text, where the image and the text, although considered to be safe in isolation, combine to form unsafe content. To systematically explore this phenomenon, we propose a dataset to evaluate current diffusion-based T2I models under such a jailbreak. We benchmark nine representative T2I models, including two closed-source commercial models. Experimental results reveal a concerning tendency to produce unsafe content: all tested models suffer from this type of jailbreak, with rates of unsafe generation ranging from 8% to 74%. In real-world scenarios, various filters such as keyword blocklists, customized prompt filters, and NSFW image filters are commonly employed to mitigate these risks. We evaluate the effectiveness of such filters against our jailbreak and find that, while current classifiers may be effective for single-modality detection, they fail to work against our jailbreak. Our work provides a foundation for further development towards more secure and reliable T2I models.