

Performance of GPT-5 Frontier Models in Ophthalmology Question Answering

Antaki, Fares; Mikhail, David; Milad, Daniel; Mammo, Danny A; Sharma, Sumit; Srivastava, Sunil K; Chen, Bing Yu; Touma, Samir; Sevgi, Mertcan; El-Khoury, Jonathan; Keane, Pearse A; Chen, Qingyu; Tham, Yih Chung; Duval, Renaud

arXiv.org Artificial Intelligence

Importance: Novel large language models (LLMs) such as GPT-5 integrate advanced reasoning capabilities that may enhance performance on complex medical question-answering tasks. For this latest generation of reasoning models, the configurations that maximize both accuracy and cost-efficiency have yet to be established.

Objective: To evaluate the performance and cost-accuracy trade-offs of OpenAI's GPT-5 compared to previous generation LLMs on ophthalmological question answering.

Design, Setting, and Participants: In August 2025, 12 configurations of OpenAI's GPT-5 series (three model tiers across four reasoning effort settings) were evaluated alongside o1-high, o3-high, and GPT-4o, using 260 closed-access multiple-choice questions from the AAO Basic Clinical Science Course (BCSC) dataset. The study did not include human participants.

Main Outcomes and Measures: The primary outcome was accuracy on the 260-item ophthalmology multiple-choice question set for each model configuration. Secondary outcomes included head-to-head ranking of configurations using a Bradley-Terry (BT) model applied to paired win/loss comparisons of answer accuracy, and evaluation of generated natural language rationales using a reference-anchored, pairwise LLM-as-a-judge framework. Additional analyses assessed the accuracy-cost trade-off by calculating mean per-question cost from token usage and identifying Pareto-efficient configurations.

Results: The configuration GPT-5-high achieved the highest accuracy (0.965; 95% CI, 0.942-0.985),
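The secondary analyses named in the abstract (Bradley-Terry ranking, mean per-question cost, Pareto-efficient configurations) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the win-count format, the token prices, and the rule that a configuration "wins" a question when it answers correctly and its opponent does not are all assumptions made for illustration. The Bradley-Terry fit uses the standard minorization-maximization update, which may differ from the estimator used in the paper.

```python
from typing import Dict, List, Tuple


def bradley_terry(wins: Dict[Tuple[str, str], int], iters: int = 200) -> Dict[str, float]:
    """Fit Bradley-Terry strengths with the MM update.

    wins[(a, b)] is the number of comparisons a won against b (hypothetical format).
    Returned strengths are normalized to sum to 1. A configuration with zero
    wins is driven to strength 0.
    """
    items = sorted({name for pair in wins for name in pair})
    p = {i: 1.0 for i in items}
    for _ in range(iters):
        new = {}
        for i in items:
            total_wins = sum(wins.get((i, j), 0) for j in items if j != i)
            denom = sum(
                (wins.get((i, j), 0) + wins.get((j, i), 0)) / (p[i] + p[j])
                for j in items
                if j != i
            )
            new[i] = total_wins / denom if denom > 0 else p[i]
        norm = sum(new.values())
        p = {i: v / norm for i, v in new.items()}
    return p


def mean_cost_per_question(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
    n_questions: int,
) -> float:
    """Mean cost per question from total token usage and per-million-token prices."""
    total = input_tokens * usd_per_m_input / 1e6 + output_tokens * usd_per_m_output / 1e6
    return total / n_questions


def pareto_efficient(configs: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return configurations not dominated on (accuracy, cost), cheapest first.

    configs maps name -> (accuracy, cost). A configuration is dominated when
    another is at least as accurate and at least as cheap, and strictly
    better on one of the two.
    """
    front = []
    for name, (acc, cost) in configs.items():
        dominated = any(
            a >= acc and c <= cost and (a > acc or c < cost)
            for other, (a, c) in configs.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: configs[n][1])


if __name__ == "__main__":
    # Made-up numbers, for illustration only; not results from the study.
    strengths = bradley_terry({("config_a", "config_b"): 3, ("config_b", "config_a"): 1})
    print(strengths)
    print(pareto_efficient({"a": (0.90, 1.0), "b": (0.95, 2.0), "c": (0.90, 3.0)}))
```

In this sketch a configuration stays on the Pareto frontier only if no other configuration is both at least as accurate and at least as cheap, which is the usual reading of "Pareto-efficient" for an accuracy-cost trade-off.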


Super-sticky hydrogel is 10 times stronger than other glues underwater

New Scientist

A rubber duck that was stuck to a seaside rock for more than a year has proved the strength of a new sticky material. The adhesive could be used in deep-sea robots and repair work, or as surgical glue for medical procedures. "We developed a super-adhesive hydrogel that works extremely well even underwater – something very few materials can achieve," says Hailong Fan at Shenzhen University in China. Hydrogels are stretchy and soft materials. Fan, then at Hokkaido University in Japan, and his colleagues analysed 24,000 sticky protein sequences from many different organisms to identify the stickiest combinations of amino acids, the building blocks of proteins.