Generative AI
Modeling Complex Disease Trajectories using Deep Generative Models with Semi-Supervised Latent Processes
Trottet, Cécile, Schürch, Manuel, Allam, Ahmed, Barua, Imon, Petelytska, Liubov, Distler, Oliver, Hoffmann-Vold, Anna-Maria, Krauthammer, Michael, the EUSTAR collaborators
In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. By combining the generative approach with medical knowledge, we leverage the ability to discover novel aspects of the disease while integrating medical concepts into the model. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series including uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge.
It's time to have The Talk with kids about AI
The problem is, AI is not magic. Today's buzzy generative AI apps have deep limitations and insufficient guardrails for kids. Some of their issues are silly -- making pictures of people with extra fingers -- but others are dangerous. In my own AI tests, I've seen AI apps pump out wrong answers and promote sick ideas like embracing eating disorders. I've seen AI pretend to be my friend and then give terrible advice.
These lawyers used ChatGPT to save time. They got fired and fined.
While previous generations of technology allowed people to search for specific keywords and synonyms across documents, today's AI models have the potential to make more sophisticated inferences, said Irina Matveeva, chief of data science and AI at Reveal, a Chicago-based legal technology company. For instance, generative AI tools might have allowed a lawyer on the Enron case to ask, "Did anyone have concerns about valuation at Enron?" and get a response based on the model's analysis of the documents.
Generative AI for Hate Speech Detection: Evaluation and Findings
Pendzel, Sagi, Wullach, Tomer, Adler, Amir, Minkov, Einat
Automatic hate speech detection using deep neural models is hampered by the scarcity of labeled datasets, leading to poor generalization. To mitigate this problem, generative AI has been utilized to generate large amounts of synthetic hate speech sequences from available labeled examples, leveraging the generated data in finetuning large pre-trained language models (LLMs). In this chapter, we provide a review of relevant methods, experimental setups and evaluation of this approach. In addition to general LLMs, such as BERT, RoBERTa and ALBERT, we apply and evaluate the impact of train set augmentation with generated data using LLMs that have already been adapted for hate detection, including RoBERTa-Toxicity, HateBERT, HateXplain, ToxDect, and ToxiGen. An empirical study corroborates our previous findings, showing that this approach improves hate speech generalization, boosting recall performance across data distributions. In addition, we explore and compare the performance of the finetuned LLMs with zero-shot hate detection using a GPT-3.5 model. Our results demonstrate that while better generalization is achieved using the GPT-3.5 model, it achieves mediocre recall and low precision on most datasets. It is an open question whether the sensitivity of models such as GPT-3.5, and onward, can be improved using similar techniques of text generation.
YouTube to Require Creators to Disclose Use of Generative AI
YouTube is rolling out new rules for AI content, including a requirement that creators reveal whether they've used generative artificial intelligence to make realistic-looking videos. In a blog post Tuesday outlining a number of AI-related policy updates, YouTube said creators who don't disclose whether they've used AI tools to make "altered or synthetic" videos face penalties including having their content removed or suspension from the platform's revenue sharing program. "Generative AI has the potential to unlock creativity on YouTube and transform the experience for viewers and creators on our platform," Jennifer Flannery O'Connor and Emily Moxley, vice presidents for product management, wrote in the blog post. "But just as important, these opportunities must be balanced with our responsibility to protect the YouTube community." The restrictions expand on rules that YouTube's parent company, Google, unveiled in September requiring that political ads on YouTube and other Google platforms using artificial intelligence come with a prominent warning label.
Behind Microsoft CEO Satya Nadella's push to get AI tools in developers' hands
Two days later on another stage, in another venue, at another developers' conference, Nadella made his second unannounced appearance of the week--this time at GitHub Universe. There Thomas Dohmke, GitHub's CEO, was showing off a new version of the company's AI programming tool, Copilot, that can generate computer code from natural language. Nadella was effusive: "I can code again!" he exclaimed. Today, Nadella will be onstage speaking to developers at Microsoft Ignite, where the company is announcing even more AI-based developer tools, including an Azure AI Studio that will let devs choose between model catalogs from not only Microsoft, but also the likes of Meta, OpenAI, and Hugging Face, as well as new tools for customizing Copilot for Microsoft 365. If it seems like Nadella is obsessed with developers, you're not wrong.
Microsoft will use custom-designed chips to bolster its AI services
Microsoft has announced a project it has been "refining in secret for years": its own custom silicon in the form of two new server chips. The company unveiled the fruits of its labor at Microsoft Ignite, showing off the Azure Maia AI Accelerator and the Azure Cobalt CPU. The company is happy to admit that the latter, at least, is ARM-based, which can still feel unthinkable to eyes so used to Microsoft and Intel's hand-in-glove dominance of the computing market. The company turned to OpenAI to receive feedback on Azure Maia and to use the company's models for testing. OpenAI CEO Sam Altman said Microsoft's updated Azure will also provide the opportunity to train improved models and make them more affordable for customers.
Microsoft rebrands its AI-powered Bing Chat as Copilot
Microsoft is rebranding Bing Chat and is now simply calling it "Copilot," giving its generative AI assistant a consistent identity across its products. Similarly, Bing Chat Enterprise will be known as "Copilot Pro," and it will be generally available starting on December 1. It will still be free for specific Microsoft 365 licenses, which will include F3 accounts for frontline workers, though the $5-a-month standalone subscription will be available that day as well. Copilot Pro is based on OpenAI's latest models, GPT-4 and DALL-E 3, and the company says it will not save prompts and responses. Microsoft will not see interactions happening within Copilot Pro at all, and it will not use customers' chats to further train the underlying models.
How Sam Altman is pushing OpenAI into the 'Big Tech' pantheon
In May, the company began a hiring spree, poaching executives from Meta, Apple and Amazon Web Services. Last month, the company expanded its footprint in San Francisco, subleasing nearly 445,000 square feet of office space from Uber, purchased when then-CEO Travis Kalanick was still the most envied founder in the Valley.
Synthetically Enhanced: Unveiling Synthetic Data's Potential in Medical Imaging Research
Khosravi, Bardia, Li, Frank, Dapamede, Theo, Rouzrokh, Pouria, Gamble, Cooper U., Trivedi, Hari M., Wyles, Cody C., Sellergren, Andrew B., Purkayastha, Saptarshi, Erickson, Bradley J., Gichoya, Judy W.
Chest X-rays (CXR) are the most common medical imaging study and are used to diagnose multiple medical conditions. This study examines the impact of synthetic data supplementation, using diffusion models, on the performance of deep learning (DL) classifiers for CXR analysis. We employed three datasets: CheXpert, MIMIC-CXR, and Emory Chest X-ray, training conditional denoising diffusion probabilistic models (DDPMs) to generate synthetic frontal radiographs. Our approach ensured that synthetic images mirrored the demographic and pathological traits of the original data. Evaluating the classifiers' performance on internal and external datasets revealed that synthetic data supplementation enhances model accuracy, particularly in detecting less prevalent pathologies. Furthermore, models trained on synthetic data alone approached the performance of those trained on real data. This suggests that synthetic data can potentially compensate for real data shortages in training robust DL models. However, despite promising outcomes, the superiority of real data persists.