ChatGPT's Horny Era Could Be Its Stickiest Yet
OpenAI will soon let adults create erotic content in ChatGPT. Experts say that could lead to "emotional commodification," or horniness as a revenue stream. In May of 2024, while I was combing through OpenAI's "Model Spec" laying out how ChatGPT should act, one comment buried in the document struck me as peculiar. It said OpenAI was "exploring" how to let adult ChatGPT users generate content with mature themes such as "erotica, extreme gore, slurs, and unsolicited profanity." Seems like the exploration phase is over.
- North America > United States > California (0.04)
- North America > Canada > Manitoba (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)
CellPainTR: Generalizable Representation Learning for Cross-Dataset Cell Painting Analysis
Caruzzo, Cedric, Ye, Jong Chul
Large-scale biological discovery requires integrating massive, heterogeneous datasets like those from the JUMP Cell Painting consortium, but technical batch effects and a lack of generalizable models remain critical roadblocks. To address this, we introduce CellPainTR, a Transformer-based architecture designed to learn foundational representations of cellular morphology that are robust to batch effects. Unlike traditional methods that require retraining on new data, CellPainTR's design, featuring source-specific context tokens, allows for effective out-of-distribution (OOD) generalization to entirely unseen datasets without fine-tuning. We validate CellPainTR on the large-scale JUMP dataset, where it outperforms established methods like ComBat and Harmony in both batch integration and biological signal preservation. Critically, we demonstrate its robustness through a challenging OOD task on the unseen Bray et al. dataset, where it maintains high performance despite significant domain and feature shifts. Our work represents a significant step towards creating truly foundational models for image-based profiling, enabling more reliable and scalable cross-study biological analysis.
Machine Learning-driven Multiscale MD Workflows: The Mini-MuMMI Experience
Pottier, Loïc, Georgouli, Konstantia, Carpenter, Timothy S., Aydin, Fikret, Tempkin, Jeremy O. B., Nissley, Dwight V., Streitz, Frederick H., Scogland, Thomas R. W., Bremer, Peer-Timo, Lightstone, Felice C., Ingólfsson, Helgi I.
Computational models have become one of the prevalent methods for modeling complex phenomena. To accurately capture complex interactions, such as detailed biomolecular interactions, scientists often rely on multiscale models composed of several internal models operating at different scales, ranging from microscopic to macroscopic length and time scales. Bridging the gap between different time and length scales has historically been challenging, but the advent of newer machine learning (ML) approaches has shown promise for tackling that task. Multiscale models require massive amounts of computational power and a powerful workflow management system. Orchestrating ML-driven multiscale studies on parallel systems with thousands of nodes is challenging: the workflow must schedule, allocate, and control thousands of simulations operating at different scales. Here, we discuss the massively parallel Multiscale Machine-Learned Modeling Infrastructure (MuMMI), a multiscale workflow management infrastructure that can orchestrate thousands of molecular dynamics (MD) simulations operating at different timescales, spanning from milliseconds to nanoseconds. More specifically, we introduce a novel version of MuMMI called "mini-MuMMI". Mini-MuMMI is a curated version of MuMMI designed to run on modest HPC systems or even laptops, whereas MuMMI requires larger HPC systems. We demonstrate mini-MuMMI's utility by exploring RAS-RAF membrane interactions, and we discuss the challenges behind generalizing multiscale workflows and how mini-MuMMI can be leveraged to target a broader range of applications beyond MD and RAS-RAF interactions.
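The coupling pattern the abstract describes — a workflow that decides, at each coarse-scale step, which finer-scale simulations are worth spawning — can be sketched in a few lines. This is an illustration of the scheduling idea only; the function names, scoring scheme, and budget parameter below are assumptions, not MuMMI's actual interfaces.

```python
import heapq

def run_multiscale(macro_step, micro_sim, n_macro_steps, micro_budget):
    """At each macro step, rank candidate patches by an ML interest score
    and launch micro-scale simulations for the top candidates within budget."""
    results = []
    for step in range(n_macro_steps):
        candidates = macro_step(step)            # [(score, patch_id), ...]
        # Keep only the highest-scoring patches the budget allows.
        selected = heapq.nlargest(micro_budget, candidates)
        for score, patch in selected:
            results.append(micro_sim(patch))
    return results

# Toy usage with stand-in models (real macro/micro models would be
# continuum and MD simulations, respectively).
macro = lambda step: [(0.9, f"p{step}a"), (0.1, f"p{step}b"), (0.5, f"p{step}c")]
micro = lambda patch: f"MD({patch})"
print(run_multiscale(macro, micro, n_macro_steps=2, micro_budget=2))
```

The key design point is that the scheduler, not the simulations, owns the budget: the micro scale is sampled selectively rather than everywhere, which is what makes thousands of concurrent MD jobs tractable.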
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (2 more...)
- Workflow (1.00)
- Research Report (0.82)
RxRx3-core: Benchmarking drug-target interactions in High-Content Microscopy
Kraus, Oren, Comitani, Federico, Urbanik, John, Kenyon-Dean, Kian, Arumugam, Lakshmanan, Saberian, Saber, Wognum, Cas, Celik, Safiye, Haque, Imran S.
High Content Screening (HCS) microscopy datasets have transformed the ability to profile cellular responses to genetic and chemical perturbations, enabling cell-based inference of drug-target interactions (DTI). However, the adoption of representation learning methods for HCS data has been hindered by the lack of accessible datasets and robust benchmarks. To address this gap, we present RxRx3-core, a curated and compressed subset of the RxRx3 dataset, and an associated DTI benchmarking task. At just 18GB, RxRx3-core significantly reduces the size barrier associated with large-scale HCS datasets while preserving critical data necessary for benchmarking representation learning models against a zero-shot DTI prediction task. RxRx3-core includes 222,601 microscopy images spanning 736 CRISPR knockouts and 1,674 compounds at 8 concentrations. RxRx3-core is available on HuggingFace and Polaris, along with pre-trained embeddings and benchmarking code, ensuring accessibility for the research community. By providing a compact dataset and robust benchmarks, we aim to accelerate innovation in representation learning methods for HCS data and support the discovery of novel biological insights.
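A zero-shot DTI prediction task over profile embeddings typically reduces to a similarity search between compound and knockout representations. The sketch below shows that generic idea with cosine similarity; the actual RxRx3-core evaluation protocol may differ, and the array shapes here are illustrative.

```python
import numpy as np

def zero_shot_dti_scores(compound_emb, knockout_emb):
    """Score each compound (rows) against each CRISPR knockout (columns) by
    cosine similarity of their profile embeddings; higher similarity is read
    as evidence of interaction with the knocked-out gene's product."""
    c = compound_emb / np.linalg.norm(compound_emb, axis=1, keepdims=True)
    k = knockout_emb / np.linalg.norm(knockout_emb, axis=1, keepdims=True)
    return c @ k.T

rng = np.random.default_rng(0)
compounds = rng.normal(size=(4, 16))   # toy embeddings for 4 compounds
knockouts = rng.normal(size=(3, 16))   # toy embeddings for 3 knockouts
scores = zero_shot_dti_scores(compounds, knockouts)
print(scores.shape)  # (4, 3)
```

Because the scoring is purely geometric, any representation learning model that emits per-perturbation embeddings can be benchmarked this way without task-specific training.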
- North America > United States (0.46)
- Asia > Middle East > Republic of Türkiye > Corum Province > Corum (0.05)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.68)
- Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)
- Government > Regional Government > North America Government > United States Government > FDA (0.46)
The Morning After: A $6 million fine for robocalls from fake Biden
The Federal Communications Commission (FCC) has officially issued its full recommended fine against political consultant Steve Kramer. The fine comes after he initiated a series of robocalls to New Hampshire residents carrying pre-recorded audio of President Biden's voice, created with deepfake AI technology. The fake Biden told voters not to vote in the upcoming primary, saying "Your vote makes a difference in November, not this Tuesday." Kramer must pay $6 million in fines within the next 30 days or the Department of Justice will handle collection, according to an FCC statement. Kramer doesn't just face a fine; he also faces criminal charges.
- North America > United States > New Hampshire (0.29)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.06)
- North America > United States > California (0.06)
- Law (1.00)
- Information Technology (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
FCC fines political consultant $6 million for deepfake robocalls
The Federal Communications Commission (FCC) has officially issued its full recommended fine against political consultant Steve Kramer for a series of illegal robocalls using deepfake AI technology and caller ID spoofing during the New Hampshire primaries. Kramer must pay $6 million in fines within the next 30 days or the Department of Justice will handle collection, according to an FCC statement. Kramer violated the Truth in Caller ID Act of 2009, which prohibits anyone from "knowingly transmit[ting] misleading or inaccurate caller identification information with the intent to defraud, cause harm or wrongfully obtain anything of value," according to legislative records. The law predates the widespread use of AI, but the FCC voted unanimously this past February to apply it to such deepfakes. The phony robocalls delivered pre-recorded audio of President Biden's voice, generated with deepfake AI technology, to New Hampshire residents in the lead-up to the 2024 presidential primary election.
- North America > United States > New Hampshire (0.53)
- North America > United States > New York (0.09)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.06)
The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs
Sant, Aleix, Escolano, Carlos, Mash, Audrey, Fornaciari, Francesca De Luca, Melero, Maite
This paper studies gender bias in machine translation through the lens of Large Language Models (LLMs). Four widely-used test sets are employed to benchmark various base LLMs, comparing their translation quality and gender bias against state-of-the-art Neural Machine Translation (NMT) models for English to Catalan (En $\rightarrow$ Ca) and English to Spanish (En $\rightarrow$ Es) translation directions. Our findings reveal pervasive gender bias across all models, with base LLMs exhibiting a higher degree of bias compared to NMT models. To combat this bias, we explore prompt engineering techniques applied to an instruction-tuned LLM. We identify a prompt structure that significantly reduces gender bias by up to 12% on the WinoMT evaluation dataset compared to more straightforward prompts. These results significantly narrow the gender bias accuracy gap between LLMs and traditional NMT systems.
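The kind of prompt structure described above can be sketched as a template that explicitly instructs the model not to infer gender from stereotypes. The exact template the paper found effective is not reproduced here; the instruction wording below is an illustrative assumption in the same spirit.

```python
# Hypothetical sketch of a bias-mitigating translation prompt.
def build_translation_prompt(sentence: str, target_lang: str) -> str:
    """Wrap a source sentence in instructions that ask the model to avoid
    defaulting to stereotyped gender when the source is ambiguous."""
    return (
        f"Translate the following English sentence into {target_lang}.\n"
        "If the gender of a person is not specified in the source, do not "
        "assume it from occupation or context; prefer a neutral or "
        "source-faithful rendering.\n"
        f"Sentence: {sentence}\n"
        "Translation:"
    )

prompt = build_translation_prompt("The doctor asked the nurse a question.", "Spanish")
print(prompt)
```

Such a prompt would then be sent to the instruction-tuned LLM in place of a bare "Translate: ..." request; the paper's finding is that the added structure, not any model change, accounts for the bias reduction.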
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.05)
- South America > Argentina > Pampas > Buenos Aires Province (0.04)
- Africa > Southern Africa (0.04)
- (22 more...)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation
Hsu, I-Hung, Wang, Zifeng, Le, Long T., Miculicich, Lesly, Peng, Nanyun, Lee, Chen-Yu, Pfister, Tomas
Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses by accurately citing verifiable sources. However, existing methods, by either feeding LMs with raw or preprocessed materials, remain prone to errors. To address this, we introduce CaLM, a novel verification framework. CaLM leverages the insight that a robust grounded response should be consistent with information derived solely from its cited sources. Our framework empowers smaller LMs, which rely less on parametric memory and excel at processing relevant information given a query, to validate the output of larger LMs. Larger LM responses that closely align with the smaller LMs' output, which relies exclusively on cited documents, are verified. Responses showing discrepancies are iteratively refined through a feedback loop. Experiments on three open-domain question-answering datasets demonstrate significant performance gains of 1.5% to 7% absolute average without any required model fine-tuning.
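The verify-and-refine loop described above can be sketched as follows. The `large_lm`, `small_lm`, and `consistent` callables here are hypothetical stand-ins for the paper's models and consistency check, shown only to make the control flow concrete.

```python
def calm_verify(question, documents, large_lm, small_lm, consistent, max_rounds=3):
    """Accept the large LM's answer only when it agrees with a small LM that
    sees nothing but the cited documents; otherwise feed the disagreement
    back to the large LM and retry."""
    feedback = None
    answer = large_lm(question, documents, feedback)
    for _ in range(max_rounds):
        # The small LM answers from the cited sources alone, with little
        # reliance on parametric memory.
        reference = small_lm(question, documents)
        if consistent(answer, reference):
            return answer  # verified: response matches the source-only answer
        feedback = f"Your answer disagreed with the cited sources: {reference}"
        answer = large_lm(question, documents, feedback)
    return answer  # best effort after max_rounds

# Toy usage: a "large" model that guesses wrong once, then corrects itself
# when given feedback.
def toy_large(q, docs, fb):
    return "Paris" if fb else "Lyon"
toy_small = lambda q, docs: "Paris"
same = lambda a, b: a == b
print(calm_verify("Capital of France?", ["... Paris ..."], toy_large, toy_small, same))
```

The design choice worth noting is asymmetry: the small model is not trying to be a better answerer, only a cheap source-grounded referee for the large model's output.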
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > China (0.04)
- North America > United States > New York (0.04)
- (6 more...)
- Media > Music (1.00)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
Stars take over Paris for sporty Vogue fashion show
Singers, supermodels and sports stars descended on Paris as Vogue World took over a city square and turned it into a runway. The fashion magazine turned the historic Place Vendôme into a catwalk to celebrate 100 years of French fashion. A different sport was used as a backdrop for each decade of fashion from the 1920s to the present day - a month before the capital city hosts the Olympic Games.
- South America (0.17)
- North America > Central America (0.17)
- Oceania > Australia (0.07)
- (16 more...)
- Textiles, Apparel & Luxury Goods (1.00)
- Leisure & Entertainment > Sports > Olympic Games (0.57)
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Kraus, Oren, Kenyon-Dean, Kian, Saberian, Saber, Fallah, Maryam, McLean, Peter, Leung, Jess, Sharma, Vasudev, Khan, Ayla, Balakrishnan, Jia, Celik, Safiye, Beaini, Dominique, Sypetkowski, Maciej, Cheng, Chi Vicky, Morse, Kristen, Makes, Maureen, Mabey, Ben, Earnshaw, Berton
Featurizing microscopy images for use in biological research remains a significant challenge, especially for large-scale experiments spanning millions of images. This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs) when training with increasingly larger model backbones and microscopy datasets. Our results show that ViT-based MAEs outperform weakly supervised classifiers on a variety of tasks, achieving as much as an 11.5% relative improvement when recalling known biological relationships curated from public databases. Additionally, we develop a new channel-agnostic MAE architecture (CA-MAE) that allows for inputting images of different numbers and orders of channels at inference time. We demonstrate that CA-MAEs effectively generalize by inferring and evaluating on a microscopy image dataset (JUMP-CP) generated under different experimental conditions with a different channel structure than our pretraining data (RPI-93M). Our findings motivate continued research into scaling self-supervised learning on microscopy data in order to create powerful foundation models of cellular biology that have the potential to catalyze advancements in drug discovery and beyond.
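One way to realize a channel-agnostic input stage is to tokenize each channel independently and tag its tokens with an embedding looked up by channel name, so images with different channel counts and orders map into one shared token space. The channel names, embedding dimension, and patch size below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# A vocabulary of known stain channels; in practice these embeddings would
# be learned during pretraining rather than randomly initialized.
CHANNEL_VOCAB = {"DNA": 0, "RNA": 1, "ER": 2, "AGP": 3, "Mito": 4, "Brightfield": 5}
rng = np.random.default_rng(1)
channel_table = rng.normal(size=(len(CHANNEL_VOCAB), 8))

def tokenize(image, channel_names, patch=4):
    """image: (C, H, W). Returns (C * n_patches, patch*patch + 8) tokens:
    flattened patch pixels concatenated with the channel's embedding."""
    c, h, w = image.shape
    tokens = []
    for ci, name in enumerate(channel_names):
        emb = channel_table[CHANNEL_VOCAB[name]]
        for y in range(0, h, patch):
            for x in range(0, w, patch):
                pix = image[ci, y:y+patch, x:x+patch].ravel()
                tokens.append(np.concatenate([pix, emb]))
    return np.stack(tokens)

# A 3-channel toy image still yields tokens in the same space a 5-channel
# image would, which is what permits inference across channel structures.
toks = tokenize(rng.normal(size=(3, 8, 8)), ["DNA", "ER", "Mito"])
print(toks.shape)  # (12, 24): 3 channels x 4 patches, 16 pixels + 8-dim embedding
```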
- Asia > Middle East > Republic of Türkiye > Corum Province > Corum (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- (2 more...)