 Large Language Model


MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models

Neural Information Processing Systems

As large language models (LLMs) develop increasingly sophisticated capabilities and find applications in medical settings, it becomes important to assess their medical safety due to their far-reaching implications for personal and public health, patient safety, and human rights. However, there is little to no understanding of the notion of medical safety in the context of LLMs, let alone how to evaluate and improve it. To address this gap, we first define the notion of medical safety in LLMs based on the Principles of Medical Ethics set forth by the American Medical Association. We then leverage this understanding to introduce MedSafetyBench, the first benchmark dataset designed to measure the medical safety of LLMs. We demonstrate the utility of MedSafetyBench by using it to evaluate and improve the medical safety of LLMs. Our results show that publicly-available medical LLMs do not meet standards of medical safety and that fine-tuning them using MedSafetyBench improves their medical safety while preserving their medical performance. By introducing this new benchmark dataset, our work enables a systematic study of the state of medical safety in LLMs and motivates future work in this area, paving the way to mitigate the safety risks of LLMs in medicine.


Musk and Altman's bitter feud over OpenAI to be laid bare in court

The Guardian

Tesla chief believes Altman broke company's founding agreement - and legal battle promises to be explosive. The bitter rivalry between two of the tech world's most powerful men arrives in court this week, as Elon Musk's lawsuit against Sam Altman and OpenAI heads to trial in Oakland, California. The case is set to feature some of the biggest names in Silicon Valley, and its outcome could affect the course of the AI boom. Musk's suit, filed in 2024, focuses on the formative years of OpenAI when he, Altman and others co-founded the artificial intelligence company as a nonprofit with a grand purpose.


[Figure: Distribution of Mentioned IDs; axis label: # of IDs]

Neural Information Processing Systems

For each image's list of candidate objects, we heuristically downsample to a set of "most interesting" regions by: 1) selecting at most k = 4 of the largest/most central people; 2) keeping the most central/large objects; 3) over-sampling rarer objects according to prior frequency of detection in the LVIS vocabulary; 4) limiting the number of objects of a single type per image; and 5) downsampling overlapping region proposals to encourage broader coverage of the pixel area of the image.
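The five heuristics above can be sketched roughly as follows. This is an illustrative reading and not the authors' implementation: the `Region` type, the `interest` score, the inverse-frequency weighting (a deterministic stand-in for over-sampling rare objects), and the `per_label`, `max_iou`, and `budget` thresholds are all assumptions introduced here. Only the limit of k = 4 people comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    label: str
    box: tuple  # (x0, y0, x1, y1) in pixels


def area(box):
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def iou(a, b):
    """Intersection over union of two boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def interest(region, width, height):
    """Hypothetical score: larger and more central boxes rank higher."""
    x0, y0, x1, y1 = region.box
    dx = ((x0 + x1) / 2 - width / 2) / (width / 2)
    dy = ((y0 + y1) / 2 - height / 2) / (height / 2)
    dist = (dx * dx + dy * dy) ** 0.5 / 2 ** 0.5  # 0 at center, 1 at a corner
    return area(region.box) / (width * height) * (1.0 - dist)


def select_regions(regions, width, height, freq,
                   k_people=4, per_label=2, max_iou=0.5, budget=10):
    """Pick a small set of 'interesting' regions (assumed thresholds)."""
    def score(r):
        # Step 3 (assumption): favor rare labels by dividing by prior frequency.
        return interest(r, width, height) / freq.get(r.label, 1.0)

    # Step 1: at most k of the largest / most central people.
    people = sorted((r for r in regions if r.label == "person"),
                    key=lambda r: interest(r, width, height),
                    reverse=True)[:k_people]
    # Steps 2 and 3: rank remaining objects by size, centrality, and rarity.
    others = sorted((r for r in regions if r.label != "person"),
                    key=score, reverse=True)

    kept, counts = [], {}
    for r in people + others:
        if len(kept) >= budget:
            break
        # Step 4: cap the number of objects of a single non-person type.
        if r.label != "person" and counts.get(r.label, 0) >= per_label:
            continue
        # Step 5: skip boxes that heavily overlap an already kept region.
        if any(iou(r.box, k.box) > max_iou for k in kept):
            continue
        kept.append(r)
        counts[r.label] = counts.get(r.label, 0) + 1
    return kept
```

A greedy pass like this favors simplicity; a sampling-based variant would be closer to the "over-sampling" wording, at the cost of non-deterministic output.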





OpenAI's Sam Altman apologizes for not reporting ChatGPT account of Tumbler Ridge suspect to police

Engadget

Two months after the deadly shooting in Tumbler Ridge, British Columbia, OpenAI's Sam Altman has formally apologized, in a letter addressed to the community, for not informing police of the alarming ChatGPT conversations on the suspect's account. Before the incident, OpenAI banned the account belonging to the alleged shooter, Jesse Van Rootselaar, for violating its usage policy due to potential for real-world violence. "I am deeply sorry that we did not alert law enforcement to the account that was banned in June," Altman wrote in the letter. "While I know words can never be enough, I believe an apology is necessary to recognize the harm and irreversible loss your community has suffered."



Evaluating and Inducing Personality in Pre-trained Language Models

Neural Information Processing Systems

Standardized and quantified evaluation of machine behaviors is a crux of understanding LLMs. In this study, we draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors. Originating as a philosophical quest for human behaviors, the study of personality delves into how individuals differ in thinking, feeling, and behaving. Toward building and understanding human-like social machines, we are motivated to ask: Can we assess machine behaviors by leveraging human psychometric tests in a principled and quantitative manner? If so, can we induce a specific personality in LLMs? To answer these questions, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors; MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.