Generative AI
Evaluating Generative AI Tools for Personalized Offline Recommendations: A Comparative Study
Salinas-Buestan, Rafael, Parra, Otto, Condori-Fernandez, Nelly, Granda, Maria Fernanda
Background: Generative AI tools have become increasingly relevant in supporting personalized recommendations across various domains. However, their effectiveness in health-related behavioral interventions, especially those aiming to reduce the use of technology, remains underexplored. Aims: This study evaluates the performance and user satisfaction of the five most widely used generative AI tools when recommending non-digital activities tailored to individuals at risk of repetitive strain injury. Method: Following the Goal/Question/Metric (GQM) paradigm, this proposed experiment involves generative AI tools that suggest offline activities based on predefined user profiles and intervention scenarios. The evaluation is focused on quantitative performance (precision, recall, F1-score and MCC-score) and qualitative aspects (user satisfaction and perceived recommendation relevance). Two research questions were defined: RQ1 assessed which tool delivers the most accurate recommendations, and RQ2 evaluated how tool choice influences user satisfaction.
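As a rough illustration of the quantitative metrics named in the abstract, the sketch below computes precision, recall, F1-score and MCC from confusion-matrix counts. It assumes each recommendation has been labeled relevant or not relevant against some ground truth. The abstract does not describe the labeling procedure, so the function name and the example counts are hypothetical.

```python
import math


def recommendation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and MCC from confusion-matrix counts.

    tp/fp/fn/tn are hypothetical counts of relevant or irrelevant
    recommendations; they are not taken from the study.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Made-up counts for one tool on one user profile.
scores = recommendation_metrics(tp=8, fp=2, fn=2, tn=8)
print(scores)
```

Unlike F1, MCC also takes true negatives into account, which may be why the study reports both.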
A Single Poisoned Document Could Leak 'Secret' Data Via ChatGPT
The latest generative AI models are not just stand-alone text-generating chatbots--instead, they can easily be hooked up to your data to give personalized answers to your questions. OpenAI's ChatGPT can be linked to your Gmail inbox, allowed to inspect your GitHub code, or find appointments in your Microsoft calendar. But these connections have the potential to be abused--and researchers have shown it can take just a single "poisoned" document to do so. New findings from security researchers Michael Bargury and Tamir Ishay Sharbat, revealed at the Black Hat hacker conference in Las Vegas today, show how a weakness in OpenAI's Connectors allowed sensitive information to be extracted from a Google Drive account using an indirect prompt injection attack. In a demonstration of the attack, dubbed AgentFlayer, Bargury shows how it was possible to extract developer secrets, in the form of API keys, that were stored in a demonstration Drive account.
Microsoft's agentic HTML can leak passwords and AI keys, researcher finds
With new AI systems come new AI vulnerabilities, and a big one was just discovered. Microsoft calls this technique NLWeb, which is a kind of HTML for AI agents. The company unveiled this at its Build conference this spring, and has since leaned into that vision with an experimental Copilot Mode for its Edge browser. Researcher Aonan Guan, however, has discovered a vulnerability in NLWeb: a path traversal bug that lets any remote user read sensitive files like system configurations and cloud credentials via a malformed URL. In a Medium post, Guan showed how he was able to download a list of the system passwords along with Google Gemini and OpenAI keys. This would let an attacker run additional server-dependent AI applications "for free," without being charged by OpenAI.
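For readers unfamiliar with path traversal, the sketch below shows the general class of bug and one common guard. It is not NLWeb's code or Microsoft's fix. The function name, the base directory and the sample request paths are all hypothetical. The idea is to decode the requested path, resolve it against a base directory, and reject anything that ends up outside that directory.

```python
from pathlib import Path
from urllib.parse import unquote


def resolve_under_base(base_dir: str, requested: str) -> Path:
    """Resolve a URL path against base_dir, refusing escapes from it."""
    base = Path(base_dir).resolve()
    # Decode percent-encoding first so "%2e%2e" is treated as "..".
    decoded = unquote(requested)
    # resolve() collapses ".." segments; joining with an absolute path
    # discards base entirely, so absolute requests are rejected below too.
    candidate = (base / decoded).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError(f"path escapes base directory: {requested!r}")
    return candidate


# Hypothetical requests: the first is allowed, the others are refused.
for path in ["css/site.css", "../../etc/passwd", "%2e%2e/%2e%2e/.env"]:
    try:
        print("allowed:", resolve_under_base("static_root", path))
    except PermissionError as err:
        print("refused:", err)
```

A server that skips such a check and passes the raw path to the filesystem could serve files such as configuration or credential files, which is the kind of exposure the article describes.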
OpenAI Announces Massive US Government Partnership
OpenAI is partnering with the US government to make its leading frontier models available to federal employees. Under the agreement, federal agencies can access OpenAI's models for $1 for the next year, per a Wednesday announcement from the company and the General Services Administration (GSA). The partnership is the culmination of months of effort on the part of OpenAI CEO Sam Altman and other OpenAI executives, who have been cozying up to the Trump administration since before President Donald Trump retook the White House in January. Since at least May of this year, high-ranking OpenAI employees have been meeting with the GSA and other government agencies, such as the Food and Drug Administration, to promote the company's tools, according to documents obtained by WIRED. On July 23, OpenAI chief operating officer Brad Lightcap and other OpenAI executives were invited to a private after-party hosted by the Hill and Valley Forum in Washington, DC.
OpenAI in talks on share sale that would price it above Elon Musk's SpaceX
OpenAI is reportedly in early talks about a sale of shares held by current and former employees that would value it at half a trillion dollars, overtaking Elon Musk's SpaceX. If the transaction goes ahead, the value of the ChatGPT developer would rise by about two-thirds, from $300bn (£225bn). Musk's rocket company is currently worth $350bn and is reportedly circling a $400bn price tag in a new fundraising. Bloomberg, which first reported the OpenAI talks, said existing investors, including Thrive Capital, have approached the company about buying employee shares. Other investors in OpenAI, which is based in San Francisco, include the Japanese investment company SoftBank, which led the $300bn financing, and Microsoft.
SCOOP: Trump admin, OpenAI partner to unleash artificial intelligence on federal government
NVIDIA CEO and co-founder Jensen Huang commends President Donald Trump's A.I. agenda and outlines what the country's job future will look like on 'Special Report.' FIRST ON FOX: The federal government is stepping into the future and embracing artificial intelligence, specifically ChatGPT, across its agencies, which proponents say will streamline productivity while solidifying President Donald Trump's pledge to keep the U.S. in the driver's seat of the cutting-edge technology, Fox News Digital exclusively learned. The U.S. General Services Administration announced Wednesday that OpenAI's ChatGPT Enterprise is now available to all federal agencies to incorporate into their workflow at a $1 per agency cost, the GSA told Fox Digital. The deal with OpenAI, the tech company behind ChatGPT, is part of GSA's OneGov Strategy that aims to modernize "how the federal government purchases goods and services" under the Trump administration. "The use of this tool has been deployed and tested with responsible policy makers, with responsible legal folks," GSA Federal Acquisition Service Commissioner Josh Gruenbaum told Fox News Digital of integrating AI into the federal government.
The Download: OpenAI's open-weight models, and the future of internet search
The news: OpenAI has finally released its first open-weight large language models since 2019's GPT-2. Unlike the models available through OpenAI's web interface, these new open models can be freely downloaded, run, and even modified on laptops and other local devices. Why it matters: These releases re-establish OpenAI as a presence for users of open models. That's particularly notable at a time when Meta, which had previously dominated the American open-model landscape with its Llama models, may be reorienting toward closed releases--and when Chinese open models are becoming more popular than their American competitors. MIT Technology Review Narrated: AI means the end of internet search as we've known it The biggest change to the way search engines deliver information to us since the 1990s is happening right now.
AI4Research: A Survey of Artificial Intelligence for Scientific Research
Chen, Qiguang, Yang, Mingda, Qin, Libo, Liu, Jinhao, Yan, Zheng, Guan, Jiannan, Peng, Dengyun, Ji, Yiyan, Li, Hanjing, Hu, Mengkang, Zhang, Yimeng, Liang, Yihao, Zhou, Yuhang, Wang, Jiaqi, Chen, Zhi, Che, Wanxiang
Recent advancements in artificial intelligence (AI), particularly in large language models (LLMs) such as OpenAI-o1 and DeepSeek-R1, have demonstrated remarkable capabilities in complex domains such as logical reasoning and experimental coding. Motivated by these advancements, numerous studies have explored the application of AI in the innovation process, particularly in the context of scientific research. These AI technologies primarily aim to develop systems that can autonomously conduct research processes across a wide range of scientific disciplines. Despite these significant strides, a comprehensive survey on AI for Research (AI4Research) remains absent, which hampers our understanding and impedes further development in this field. To address this gap, we present a comprehensive survey and offer a unified perspective on AI4Research. Specifically, the main contributions of our work are as follows: (1) Systematic taxonomy: We first introduce a systematic taxonomy to classify five mainstream tasks in AI4Research. (2) New frontiers: Then, we identify key research gaps and highlight promising future directions, focusing on the rigor and scalability of automated experiments, as well as the societal impact. (3) Abundant applications and resources: Finally, we compile a wealth of resources, including relevant multidisciplinary applications, data corpora, and tools. We hope our work will provide the research community with quick access to these resources and stimulate innovative breakthroughs in AI4Research.
Cognitive Loop via In-Situ Optimization: Self-Adaptive Reasoning for Science
Cheng, Newman, Broadbent, Gordon, Chappell, William
The capacity for artificial intelligence (AI) to formulate, evolve, and test altered thought patterns under dynamic conditions indicates advanced cognition that is crucial for scientific discovery. The existing AI development landscape falls into two categories: 1) frameworks over non-reasoning models that natively incorporate opinions on how humans think, and 2) reasoning models that abstract precise control of the reasoning intuition away from end users. While powerful, for scientists to maximize utility of AI in scientific discovery, they not only require accuracy and transparency in reasoning, but also steerability. Hence, we introduce an alternative approach that enables deep and precise control over the reasoning process called: a cognitive loop via in-situ optimization (CLIO). CLIO enables large language models (LLMs) to self-formulate ways of approaching a problem, adapt behavior when self-confidence is low, and ultimately provide scientists with a final belief or answer. Through CLIO's open design, scientists can observe uncertainty levels, understand how final belief states are formulated using graph structures, and interject corrections. Without any further post-training, OpenAI's GPT-4.1 with CLIO yields an accuracy of 22.37% in text-based biology and medicine questions on Humanity's Last Exam (HLE). This yields a 13.82% net or 161.64% relative increase when compared to the base GPT-4.1 model and surpasses OpenAI's o3 performance in high and low reasoning effort modes. We further discovered that oscillations within internal uncertainty measures are key in determining the accuracy of CLIO's results, revealing how its open design and internal mechanisms can provide insight and control into scientific decision-making processes.
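The net and relative gains quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the "net" figure is a difference in percentage points and the "relative" figure is that difference divided by the base model's accuracy. The base accuracy is derived here and is not stated directly in the abstract.

```python
# Figures quoted in the abstract (percent accuracy on the HLE subset).
clio_accuracy = 22.37
net_gain = 13.82  # percentage points over base GPT-4.1

# Implied base accuracy (derived, not stated in the abstract).
base_accuracy = clio_accuracy - net_gain

# Relative increase = gain divided by the base accuracy.
relative_gain = net_gain / base_accuracy * 100

print(f"implied base accuracy: {base_accuracy:.2f}%")
print(f"relative increase: {relative_gain:.2f}%")
```

Under these assumptions the derived relative increase agrees with the 161.64% reported.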
OpenAI releases two 'open' AI models after DeepSeek's success
OpenAI is releasing a pair of open and freely available artificial intelligence models that can mimic the human process of reasoning, months after China's DeepSeek gained global attention with its own open AI software. The two models, called GPT-oss-120b and GPT-oss-20b, will be available on AI software hosting platform Hugging Face and can produce text -- but not images or videos -- in response to user prompts, OpenAI said on Tuesday. These models can also carry out complex tasks like writing code and looking up information online on a user's behalf, the company said. Crucially, the models are both open-weight systems, similar to Meta Platforms' Llama. The term "weight" refers to the parameters in an AI model.