Performance of GPT-5 Frontier Models in Ophthalmology Question Answering
Antaki, Fares; Mikhail, David; Milad, Daniel; Mammo, Danny A; Sharma, Sumit; Srivastava, Sunil K; Chen, Bing Yu; Touma, Samir; Sevgi, Mertcan; El-Khoury, Jonathan; Keane, Pearse A; Chen, Qingyu; Tham, Yih Chung; Duval, Renaud
Importance: Novel large language models (LLMs) such as GPT-5 integrate advanced reasoning capabilities that may enhance performance on complex medical question-answering tasks. For this latest generation of reasoning models, the configurations that maximize both accuracy and cost-efficiency have yet to be established.
Objective: To evaluate the performance and cost-accuracy trade-offs of OpenAI's GPT-5 compared to previous generation LLMs on ophthalmological question answering.
Design, Setting, and Participants: In August 2025, 12 configurations of OpenAI's GPT-5 series (three model tiers across four reasoning effort settings) were evaluated alongside o1-high, o3-high, and GPT-4o, using 260 closed-access multiple-choice questions from the AAO Basic Clinical Science Course (BCSC) dataset. The study did not include human participants.
Main Outcomes and Measures: The primary outcome was accuracy on the 260-item ophthalmology multiple-choice question set for each model configuration. Secondary outcomes included head-to-head ranking of configurations using a Bradley-Terry (BT) model applied to paired win/loss comparisons of answer accuracy, and evaluation of generated natural language rationales using a reference-anchored, pairwise LLM-as-a-judge framework. Additional analyses assessed the accuracy-cost trade-off by calculating mean per-question cost from token usage and identifying Pareto-efficient configurations.
Results: The configuration GPT-5-high achieved the highest accuracy (0.965; 95% CI, 0.942-0.985),
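As a rough illustration of the Bradley-Terry ranking named in the abstract, the sketch below fits strengths from paired win/loss counts with the standard minorization-maximization update. It is not the authors' code. The helper names (`pairwise_wins`, `bradley_terry`), the rule that a configuration "wins" a question when it is correct and its opponent is not (ties skipped), and the toy correctness vectors are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_wins(correct):
    """Build a win matrix from per-question correctness (hypothetical helper).

    correct: dict mapping configuration name -> list of 0/1 per question.
    Assumption: i beats j on a question when i is right and j is wrong;
    questions where both are right or both are wrong are skipped.
    """
    names = list(correct)
    k = len(names)
    wins = [[0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        for a, b in zip(correct[names[i]], correct[names[j]]):
            if a and not b:
                wins[i][j] += 1
            elif b and not a:
                wins[j][i] += 1
    return names, wins


def bradley_terry(wins, n_iter=500, tol=1e-12):
    """Estimate Bradley-Terry strengths with the MM update.

    wins[i][j] is the number of times item i beat item j.
    Returns strengths normalized to sum to 1.
    """
    k = len(wins)
    p = [1.0 / k] * k
    for _ in range(n_iter):
        new_p = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(k)
                if j != i
            )
            # Keep the old value if item i was never compared.
            new_p.append(total_wins / denom if denom > 0 else p[i])
        total = sum(new_p)
        if total == 0:
            return p  # no decisive comparisons at all
        new_p = [x / total for x in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy = {"A": [1, 1, 1, 1, 0], "B": [1, 1, 0, 0, 1], "C": [0, 0, 0, 1, 1]}
    names, wins = pairwise_wins(toy)
    for name, strength in zip(names, bradley_terry(wins)):
        print(name, round(strength, 3))
```

A higher strength means the configuration tends to win more of its decisive head-to-head comparisons. A configuration with no wins at all is driven toward zero strength, which is a known property of the unregularized fit.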
- North America > United States > Ohio > Cuyahoga County > Cleveland (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)
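The accuracy-cost analysis described in the abstract above (mean per-question cost from token usage, then Pareto-efficient configurations) can be sketched as follows. This is a minimal illustration, not the study's method: the function names, the per-million-token prices, and the configuration labels and numbers are all hypothetical placeholders rather than results from the paper.

```python
def mean_cost_per_question(usages, price_in_per_m, price_out_per_m):
    """Mean cost per question from token usage.

    usages: list of (input_tokens, output_tokens), one pair per question.
    Prices are per million tokens (hypothetical values supplied by the caller).
    """
    costs = [
        (tokens_in * price_in_per_m + tokens_out * price_out_per_m) / 1_000_000
        for tokens_in, tokens_out in usages
    ]
    return sum(costs) / len(costs)


def pareto_front(configs):
    """Return names of configurations not dominated on (cost, accuracy).

    configs: list of (name, mean_cost, accuracy).
    A configuration is dominated if another one is at least as cheap and
    at least as accurate, and strictly better on at least one of the two.
    """
    front = []
    for name, cost, acc in configs:
        dominated = any(
            other_cost <= cost
            and other_acc >= acc
            and (other_cost < cost or other_acc > acc)
            for _, other_cost, other_acc in configs
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the study.
    configs = [
        ("cfg_a", 0.010, 0.90),
        ("cfg_b", 0.020, 0.95),
        ("cfg_c", 0.030, 0.93),  # costlier and less accurate than cfg_b
        ("cfg_d", 0.005, 0.80),
    ]
    print(pareto_front(configs))
```

In this toy input, `cfg_c` is dropped because `cfg_b` is both cheaper and more accurate. The remaining configurations each represent a different trade-off between cost and accuracy.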
Super-sticky hydrogel is 10 times stronger than other glues underwater
A rubber duck that was stuck to a seaside rock for more than a year has proved the strength of a new sticky material. The adhesive could be used in deep-sea robots and repair work, or as surgical glue for medical procedures. "We developed a super-adhesive hydrogel that works extremely well even underwater – something very few materials can achieve," says Hailong Fan at Shenzhen University in China. Hydrogels are stretchy and soft materials. Fan, then at Hokkaido University in Japan, and his colleagues analysed 24,000 sticky protein sequences from many different organisms to identify the stickiest combinations of amino acids, the building blocks of proteins.
- Asia > Japan > Hokkaidō (0.28)
- Asia > China > Guangdong Province > Shenzhen (0.26)
- North America > United States > New York (0.06)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.58)
- Education > Health & Safety > School Nutrition (0.58)